How to Write Maintainable Code
Posted 15 Mar 2001 at 18:27 UTC by Bram 
Software engineers suffer from not knowing what their code should look
like. The classic essay worse is better
exemplifies this - How can worse
be better? Isn't worse worse? Even more confusingly, it's generally
referenced to claim the exact
opposite of what it's trying to argue.
The problem is that people use very different, and often antithetical,
criteria to judge the 'beauty' of code. Clearly there is a need for a
measure of code quality which is more objective than aesthetic.
I suggest that you judge code based on its maintainability.
Truly maintainable code is flexible and can be taken in many directions.
Code is not more maintainable just because it has more features -
invoking functionality which is currently dormant is not maintenance,
it's use. Maintenance is when you add new functionality or change
existing functionality. This is often done long after the code was
originally written and in a completely unforeseeable manner.
Planning for the unexpected is a paradoxical concept - if you don't know
what it is, how can you plan for it? Thankfully, there are many concrete
techniques which work.
Use less code
The less code you have, the less there is to maintain. You shouldn't
slavishly count characters or lines of code in your program and reduce
it at all costs, but generally speaking, less is better. Get rid of
unused functions and diagnostic statements - they're just more muck to
wade through.
Using existing libraries, of course, results in less code. In practice,
many libraries are of such low quality that you're better off rewriting them, but
many aren't, and you should use those when possible.
Encapsulate
If you create well-defined boundaries which are only connected by narrow
bridges, you can rearrange entire towns with the neighbors being none the
wiser. If there's too much travel going on, even a single eviction can
cause widespread panic.
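A minimal sketch of a narrow bridge in Python. The `Registry` class and its methods are hypothetical, invented for this illustration. Callers only ever touch `add()` and `count()`, so the storage behind them can be rearranged without anyone outside noticing.

```python
class Registry:
    """Hypothetical example: the only bridges in are add() and count()."""

    def __init__(self):
        # Internal detail. It could later become a list, a file or a
        # database without any caller having to change.
        self._counts = {}

    def add(self, name):
        self._counts[name] = self._counts.get(name, 0) + 1

    def count(self, name):
        return self._counts.get(name, 0)
```

Code that reached into `_counts` directly would be the "too much travel" case: every change to the storage would then ripple outward.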
Reduce preconditions
Code which is very persnickety about how it's invoked makes everything
else harder to maintain. Examples are requiring that methods be called
in a certain order, and requiring that a certain method be called only
once. Try to avoid those.
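As a rough illustration, here are two hypothetical Python classes (the names `FussyConnection` and `Connection` are made up for this sketch). The first insists that `open()` be called exactly once and before `send()`. The second does the same job but sets itself up on demand.

```python
class FussyConnection:
    """Hypothetical class with preconditions: open() exactly once, before send()."""

    def __init__(self):
        self.opened = False
        self.sent = []

    def open(self):
        if self.opened:
            raise RuntimeError("open() called twice")
        self.opened = True

    def send(self, message):
        if not self.opened:
            raise RuntimeError("send() called before open()")
        self.sent.append(message)


class Connection:
    """Same job, fewer preconditions: setup happens on demand."""

    def __init__(self):
        self.opened = False
        self.sent = []

    def open(self):
        # Idempotent: calling it again is harmless.
        self.opened = True

    def send(self, message):
        self.open()  # callers no longer need to remember the order
        self.sent.append(message)
```

With the second version, callers can be rearranged freely without tracking who opened what, and when.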
Write in an easy to maintain language
My favorite language for maintainability is Python. It has simple, clean
syntax, object encapsulation, good library support, and optional named
parameters. An example of a language which is terrible for
maintainability is Perl. Yes, I said it. No, I'm not going to back down.
Write test code
New code always runs the risk of breaking something. A test suite which
is easy to run and either returns 'everything passed' or 'these tests
failed' makes it very easy to detect and fix regressions.
Test code has to be changed much less often when it tests interfaces
rather than implementations. For example, code which turns objects into
strings and back again (known as 'pickling') can be tested by pickling
and unpickling an object and comparing the result with the original.
Those tests will continue to work even if the string format is
completely changed.
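A minimal sketch of such a round-trip test, assuming Python's standard `pickle` and `unittest` modules. The helper `round_trip` and the sample data are made up for illustration. In current Python the pickled form is a byte string rather than a text string, but the test never looks at it either way.

```python
import pickle
import unittest


def round_trip(obj):
    """Pickle obj to a byte string and load it straight back."""
    return pickle.loads(pickle.dumps(obj))


class RoundTripTest(unittest.TestCase):
    def test_round_trip_preserves_value(self):
        original = {"name": "barn", "nails": [1, 2, 3], "size": (3.5, 4)}
        # Only the interface is exercised. The bytes in between are never
        # inspected, so a change of pickle format cannot break this test.
        self.assertEqual(round_trip(original), original)
```

Saved in a file, this can be run with `python -m unittest`, which reports either that everything passed or which tests failed.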
Create tools
There are two ways of building a barn - one is to make a hammer and use
it to nail the barn together, the other is to nail it together with your
hands. They might take about the same amount of time, but the hammer
will help you again in the future.
Use safe techniques
There are several techniques which result in more maintainable code
under almost all circumstances. Garbage collection magically removes all
the headaches of memory allocation. Monothreading gets rid of all the
headaches of thread safety. And don't forget the first rule of writing
internet applications - 'Don't re-implement TCP/IP'.
Let yourself get frustrated
Many times I have gotten frustrated doing something the 'right' way,
figured 'fuck it', and done something simpler and more expedient.
Sometimes it turns out to be a hack, but often it turns out to actually
be a more flexible solution, for the very reasons I got frustrated by.
A word about performance
There's a rule of thumb that 1% of the code takes up x% of the actual
runtime, and in recent years x has been increasing dramatically.
Combined with how much cheaper fast machines are than developer time,
this means that performance is much less of an issue than it used to be.
Sometimes it makes sense to work on better performance, but it should be
viewed as a feature like any other, not an overriding principle to build
software around. Maintainability works much better.
-Bram Cohen
Someone was going to respond to that comment about Perl, so I'll get
in first. :-)
Perl is a contradiction. It's a less maintainable language than,
say, Python in that there's more of it to know and more of it to
understand. So although you can write Fortran in any language, it's a
little easier in Perl than Python.
However, Perl has CPAN. With CPAN, you have complexity out of a
can, enabling you to write even smaller programs.
So while I won't ask you to retract your statement, I will merely
note here that both Python and Perl have their own strengths in writing
maintainable code.
Having said that, I'd add "refactor mercilessly" to your list. That
is, aggressively look for opportunities for improving the design of your
code without rewriting from scratch.
I agree with most of this, but one particular sentence caught my eye:
"Get rid of ... diagnostic statements"
I have to say I thoroughly disagree with this point;
esr said it better than I ever could:
"[D]ebugging options should not be
minimal afterthoughts. Rather, they should be designed in from the
beginning, from the point of view that the program should be able to
demonstrate its own correctness and communicate the original developer's
mental model of the problem it solves to future
developers."
As for your word about performance, it is true that a small proportion
of the code makes for a large proportion of the runtime, but I'd like to
expand this with "...and the 1% that is actually taking all
the time may well not be the 1% that you expected."
I never bother optimising things for speed during the coding cycle, but
when I consider a block of code to be mature I tend to run it through a
profiler. Occasionally I find that some really trivial function hasn't
scaled well and gain a 1000% speedup for the cost of five minutes coding
time.
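A sketch of that workflow, assuming Python's standard `cProfile` and `pstats` modules. The functions `slow_unique`, `fast_unique` and `profile` are hypothetical examples written for this illustration, not anything from a real project.

```python
import cProfile
import io
import pstats


def slow_unique(items):
    """Hypothetical 'trivial function that doesn't scale': O(n^2) list membership."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def fast_unique(items):
    """Same result, order preserved, using a set for fast membership tests."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def profile(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    # Show the five entries with the highest cumulative time.
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()
```

Calling `profile(slow_unique, data)` on a large list and reading the report is usually enough to show where the time goes. The fix is then a few minutes of work, as with `fast_unique` here.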
Speaking of Perl..., posted 16 Mar 2001 at 15:38 UTC by dlc »
(Apprentice)
Even though Pseudonym already put in a defense of
Perl, I want to throw in my 2 cents, in the form of a few quotes:
It's not that perl programmers are idiots, it's that the language
rewards idiotic behavior in a way that no other language or tool has
ever done.
-- Erik Naggum
and
...if we judge something by how badly it is misused, well, hell would
be perl, right?
-- dancer
I would like to slightly modify Erik Naggum's quote to say "accepts",
or possibly "overcompensates for", rather than "rewards", but the
sentiment remains essentially unchanged.
Perl has the ability to look like anything; this can be seen
as an advantage or disadvantage, but assuming that this fact makes Perl
an unmaintainable language sounds like amateurism to me (no offense meant
to Bram). Good
programmers write readable, maintainable code, regardless of the
language. Languages are tools, and any tool can be used for good or ill,
depending on who is wielding it.
That having been said, I would agree that there is more bad code
written in Perl than in a lot of other languages, at least in my
experience.
(dlc)
The referenced article is pretty neat, thanks for the reference.
An important principle for writing code that others can maintain is sometimes called locality of reference.
What I mean by this is that you should be able to look at any single function, or small unit of code, and understand
it sufficiently to make small changes, such as fixing bugs, without looking elsewhere.
If your code is only ever lovingly maintained by a closed team of people, you can ask more of those people. You can ask
them to learn about your project's coding conventions, to learn how you have your very own way of memory management, or
how your classes overload operators. Whenever you add a feature that makes people learn something before they can read your
code correctly, you make your code harder to maintain: you raise the barrier to entry.
It's convenient, when writing textbooks, to assume people will read every chapter in order from start to finish. You can build
up examples a piece at a time. If you do this, your book is useless as a reference, and you force people to read the chapters on
the topics they already know. So in practice they won't read your book.
I favour readable code above correct code, because in fact code never stays correct, and unreadable incorrect
code is a nightmare.
Use less code
Have a look at the applications provided in the Pliant tarball, such as the
database engine; select your favorite language, and try to
implement them.
Good luck (the rest of this post explains why you would run into serious problems).
Write in an easy to maintain language
Here you say that you like Python's clean syntax, but do you know that
Pliant's is just as clean but much more flexible (it can look either
instruction-centric or Lisp-like), and can even be customized to better
match various
needs?
Create tools
Well, with Pliant you have a built-in code generator, just like in Lisp,
and that's
what would prevent you from translating Pliant's core applications to
Python.
A built-in Lisp-like code generator (the program is an object you can
compute on) in a dynamic compiler (you can perform any complex computation
at compile time) is what you need, and lack, most of the time to build
efficient tools.
A word about performance
Pliant is just as fast as C, so you won't have to switch to another
language, with the extra interfacing problems, to get full speed in
speed-critical
areas.
With Pliant, you can adjust the programming level in each function, with
an infinite set of potential levels. It's not just a rough binary choice between
writing
in the high-level language and switching to the dirty low-level language.
There are plenty of things in the middle that better match most cases,
and since with Pliant you get all of these within a single syntax
and a single programming language,
you can truly select the right level for each part of your code.
retrospective, posted 17 Mar 2001 at 11:21 UTC by mbp »
(Master)
These are the qualities of software.
In my experience, they are hard things to aim for. Rather, they are
the attributes that give you a sense of quality when you finish
construction.
How much abstraction is too much? Too little, and the code is a
tightly-coupled mass and hard to modify. Too much, and the code is overly
large, hard to understand, and possibly still will not allow for the
changes you want, but only those you thought you would want. You can
only know the right amount once you already have it, changes come
naturally, and there is no waste. Possibly you can
get a feeling for the right amount through experience.
As Ritchie is supposed to have said
Buy low and sell high and everything will work
out.
Slap!
Sure, and so people should dig a little into the code rather than blindly
selecting an application.
The assertion that an application being released under the GPL is enough to
guarantee high quality in the long run is completely false; most or all of the
criteria listed in this area should be checked for, but in fact they
are not.
When providing a Pliant implementation, I try to get rid of nearly every
useless part of the code, so for a given protocol or feature, I often end up
with code more than 10 times (often closer to 100 times) smaller than the
mainstream free implementation, and I restarted most of them two or
three times from scratch, just because it's fairly common to understand
how the application should have been started ... when it's finished.
One has to be courageous at some point and say:
ok, now I understand what I should have done, so let's do it, and
restart. This is a very important attitude if we expect to be truly able
to
share code at some point (which is not yet the case, since free software is
mainly
a matter of putting code side by side at the moment), but it's not yet a
common practice.
On the other hand, we notice that most users, even those who highly praise
free software, completely ignore the maintainability issues listed
here when selecting software, claiming that free is enough so they
don't have to think any further, just as the mainstream claims that
availability for the mainstream platform is enough, and that
evolution will also be smooth for them, without the need to think
and select more wisely.
It's a long way to go before we get wise users, because it's a matter of
culture, just as it took more than a hundred years for people to learn
democracy.
Please, keep talking and explaining.
Getting better maintainability according to your criteria is mostly a
matter
of users understanding them and asking for them. Let me explain:
There is room in the mainstream for free software today because
advanced organizations
understood that they need smoother evolution in the long run, and that
free software was a good criterion for getting it.
As a result, big names such as IBM are now starting to apply these new
development
rules.
The same applies to your maintainability criteria: when users
understand
that selecting software according to these criteria is a good way to
reduce
costs in the long run, developers will
start to care about them too. Not any sooner.
Once again, all the power is in the user's hands at selection time.
closed software = reduction to slavery,
huge software = high cost.
Some Followups, posted 17 Mar 2001 at 19:25 UTC by Bram »
(Master)
There is a very real difference of opinion (one might say cultural rift)
among competent programmers about Perl vs. Python. I figured, correctly,
that stating my opinion on Advogato would result in
polite discussion of the issue, much nicer than what generally happens
on slashdot.
On diagnostic statements - what I said was
"Get rid of unused ... diagnostic
statements"
It is, of course, vitally important to let users know what's happening
in the system, and diagnostic statements are a great way to do that, but
ones which you wrote to track down a specific bug and have never read
the output of since should be removed.
I worded that sentence awkwardly. (((English (should
be)) (distributed as))(syntax trees))
Ankh's point that code should be readable in
isolation is a good one. It belongs on the list of basic techniques.
I have written these guidelines down explicitly in the hope that they
can be of practical use writing software, in the same way that noticing
the passive voice helps writing English. While I agree with mbp that
this is difficult to do, I believe it's a skill which can be
learned, and is well
worth learning.
A habit I've found nice is to make diagnostic statements verbose
enough to be code comments in their own right so that they aren't
useless. Then, when finished using them,
simply comment them out and they remain useful comments in the code that
can easily be re-enabled as diagnostic statements should the need ever
arise.
For instance, a diagnostic statement that is meaningful only to you while
developing and debugging at the moment, such as:
debug.write("NARF!") # do you know what I'm thinking Pinky?
is not useful long term, while something like this still is, even after being
commented out:
debug.write("Finished scheduling %d foo requests, awaiting results." %
num_requests)
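A related sketch, swapping in a different technique: instead of commenting the statement out, leave it in at debug level using Python's standard `logging` module. The logger name `scheduler` and the function `schedule` are hypothetical, and the `debug` object in the snippets above is not assumed to be this module.

```python
import logging

log = logging.getLogger("scheduler")  # hypothetical logger name


def schedule(requests):
    """Stand-in for real scheduling work; returns the number of requests."""
    num_requests = len(requests)
    # Reads like a comment, and stays silent unless DEBUG output is enabled.
    log.debug("Finished scheduling %d foo requests, awaiting results.",
              num_requests)
    return num_requests
```

Calling `logging.basicConfig(level=logging.DEBUG)` at startup turns the message back on without touching the function.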
I've spent eons in APIs, libraries and code alike - scratching
my head and above all wandering, forever wandering. I feel
really stupid. I am really stupid.
When it comes to writing code I have no problem remembering that I
am stupid.
I space the code out, lots of line breaks to mark
functional sections in methods. To show I'm a throwback I indent
it with two spaces and keep it all to 78 columns. I comment it
thoroughly and just prettily enough to engage both minimalist
and artistic microtubules. I also try to comment out my obvious
debug messages (yes, splork).
All pieces of brilliance that I have stolen from someone else
I try to explain in extended (but dumbed down) comments.
However - above all of that - I just hope I remember what I was
trying to do when I come back to it.
Perl, posted 22 Mar 2001 at 15:16 UTC by pudge »
(Master)
Perl is more maintainable than Python. This is fact. How can I back it up? Because I write
very good, clean, maintainable code in Perl, and I do it quickly and easily, and I don't know
Python. Oh, I guess I should have said it is more maintainable for me.
Point? The statement "An example of a language which is terrible for maintainability is
Perl" is silly, at best. A language which is "maintainable" is one which can be maintained well.
Since many people can and do maintain Perl code well, your statement is therefore false. For
you -- and perhaps for most people -- Perl is less maintainable than Python. For others, this
isn't the case. To make blanket, seemingly (and incorrectly) objective statements about which
is more maintainable is counterproductive to your mission, which is apparently to get people to
write better code. Perhaps a better wording would be:
"Perl is a harder language to master than Python, and for most people, it will be harder to
write maintainable code in it."
Now you're closer to being objective and honest. By that, do I mean your previous
statement lacked objectivity and honesty? Yes. I don't say it to attack you; I have no interest
in attacking anyone. But if you want to convince people to write maintainable code, you can't
make backhanded, silly, and transparently false remarks to them.
If this post strikes you as impolite, like something you might find on Slashdot, I apologize.
But since I write Perl for Slashdot, what would you expect? ;-)
Missed point, posted 24 Mar 2001 at 17:42 UTC by mvw »
(Journeyer)
Sorry, but the cited essay on the rise of "Worse is Better" was the
somewhat sad
look back by a protagonist of the LISP community in the face of the rising
UNIX/C crowd.
The LISP folks wondered why their more powerful language and more
powerful
development environments lost to the users of that bloody "PDP-11 assembler that
thinks
it is a language".
It was the struggle between perfect and practical.
I fail to see the connection to the rest of your posting.
Re: Missed point, posted 26 Mar 2001 at 02:19 UTC by Bram »
(Master)
My favorite telling passage from worse
is better is the following, in the section on how Lisp could be
'improved':
Scheme-continuations remain an ugly stain on the otherwise clean
manuscript of Scheme.
You hear that kids? This whole exception handling thing is just a fad,
it's an 'ugly stain'.
This is only the most telling example. In almost every case, the cited
ways in which Lisp is 'better' are based on abstract beauty, and in
almost every case they're wrong. Unless your notion of beauty has some
rational basis to it, you're always going to be flying blind.
I'm suggesting you base your notion of code beauty on maintainability.
Continuations, posted 26 Mar 2001 at 16:10 UTC by ingvar »
(Master)
Er? Continuations are exceptions? Eeeeeh. Well. [actual example further
on]
AFAIK (not being a scheme programmer), continuations are a beauty that
leads to potentially very unmaintainable code and GC problems.
Normal exception handling is usually done either with an error-catcher
that you "throw" to (and, if a continuable error, gets called) or by
wrapping YourStuff in an UNWIND-PROTECT, to do resource clean-up.
Meanwhile, continuations can do NiftyStuff I would not try to use
exceptions for. Imagine, if you wish, a recursive descent parser that,
when it has a full parse, pushes a continuation on a global list, then
returns a value that indicates "no parse found" and continues until
all parses have been found.
Now you have a list of continuations you can return from, so you start
doing that, each of those returning "parse found" and a parse tree is
built up during the return-to-top, where a parse-return is detected and
that parse is stuffed in a list of "full parse trees".
Now, all this is probably quite doable in other ways, but not in such a
"simple" fashion. But understanding and maintaining the code is
horrible (not to mention it *will* keep an UNWIND-PROTECT hanging
until all continuations depending on the first clause have been
returned from and garbage-collected).
Re: Continuations, posted 26 Mar 2001 at 21:15 UTC by Bram »
(Master)
Exceptions validate the concept behind continuations - that more
advanced forms of control flow are needed.
Continuations are, of course, much more powerful than exceptions, but
have a laser sight pointed directly at your foot. I personally think in
terms of a form of continuations which don't allow them to be passed
around as objects but do subsume return, while, continue, break, and
raise (finally is a weird special thing).
Your recursive descent parser is a great example of both the power of
continuations and how dangerous they are to use. While I wouldn't use
continuations in that case, I do use exceptions in a very similar one -
When performing an exhaustive search, if a solution is found, I raise an
exception containing the solution.
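A minimal sketch of that pattern in Python. The subset-sum problem and the names `SolutionFound`, `_search` and `subset_sum` are hypothetical, chosen only to make the example small. They are not the actual search described above.

```python
class SolutionFound(Exception):
    """Carries the solution out of an arbitrarily deep search."""

    def __init__(self, solution):
        super().__init__(solution)
        self.solution = solution


def _search(numbers, target, chosen, start):
    if target == 0:
        # Unwind every level of the recursion in one step.
        raise SolutionFound(list(chosen))
    for i in range(start, len(numbers)):
        if numbers[i] <= target:
            chosen.append(numbers[i])
            _search(numbers, target - numbers[i], chosen, i + 1)
            chosen.pop()  # backtrack and try the next candidate


def subset_sum(numbers, target):
    """Return the first subset of positive numbers summing to target, or None."""
    try:
        _search(numbers, target, [], 0)
    except SolutionFound as found:
        return found.solution
    return None
```

For example, `subset_sum([3, 5, 8, 13], 16)` returns `[3, 5, 8]`. The recursive function needs no "found it" return value threaded through every level.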
Well, yes, but often when you do NL parsing (perhaps one should use a
more powerful parser than a recursive descent one, but...) you want
all syntactic parses, so you can choose the one making the most
semantic sense.
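Python has no first-class continuations, so the following sketch swaps in a different technique, generators, to enumerate every parse rather than stopping at the first. The toy "grammar" (splitting a string into known words) and the name `all_parses` are assumptions made for illustration. It is not the parser described in this thread.

```python
def all_parses(text, words):
    """Yield every way to split text into items from words.

    words must contain only non-empty strings, otherwise the
    recursion would never consume any input.
    """
    if not text:
        yield []  # one complete parse: nothing left to consume
        return
    for word in words:
        if text.startswith(word):
            # Each parse of the remainder extends to a parse of the whole.
            for rest in all_parses(text[len(word):], words):
                yield [word] + rest
```

A caller can collect `list(all_parses("ab", ["a", "b", "ab"]))` and then pick whichever parse makes the most semantic sense.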
I found I wanted continuations when I tried to modify a recursive
descent parser I made as a lab project, back at uni, and found it
*quite* hard to do, without converting it all to Scheme and using
continuations. I am sure it's doable and I have several models, but none
have the elegance of the continuation-based version.
And therein lies the problem with continuations: where they're usable,
they are *very* usable. For exception-like behaviour, a CATCH/THROW is
(as far as I can see) Good Enough, since you're guaranteed it can only
be passed down the call stack, not up.
We often think of code as a Computer-Human interface. That is correct.
The code translates human statements in a formal language to machine
code in the case of a simple compiled language. In general, we could say
that the human statements are run on the host machine (that is,
C statements are run on a virtual C machine).
However, thinking of the code as a communication device between
developers, the whole matter takes on a different look. Now, the
expressiveness of the language matters to another human while the
machine does not even care and you wouldn't mind as long as the
semantics remain the same. That is why we march along the road of
abstraction: one day the programs will have to be even more expressive
and we will find ways to surpass OO languages in that regard. But for
machines, little will differ. That is, unless they begin to communicate
with us in a complex language (which would mean some great obstacles in
AI have
been overcome).
In this view, the maintainability of the code is important because we
now concentrate on how readable and writable the code is rather than how
efficient or beautiful it is. This means that if you are dealing with linear
algebra, it is better to be reading something like:
y = A * transpose(B) * C * x ;
You want to communicate this idea through code to other developers, and
the way to do it is to write in more abstract ways, code that promotes
code re-use, code that is comprehensible. We should pay heed to
objectives like modularity and encapsulation as the author suggests. We
must also give great weight to programming practices like
refactoring.
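A sketch of how the linear algebra expression above can be made to read that way in Python. The `Matrix` class and `transpose` helper are hypothetical stand-ins written for this illustration. Real code would more likely use an existing numerical library.

```python
class Matrix:
    """Tiny dense matrix, just enough to make the expression readable."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __mul__(self, other):
        # Row-by-column product; assumes the inner dimensions agree.
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )


def transpose(m):
    """Return a new Matrix with rows and columns swapped."""
    return Matrix(zip(*m.rows))


A = Matrix([[1, 2], [3, 4]])
B = Matrix([[0, 1], [1, 0]])
C = Matrix([[2, 0], [0, 2]])
x = Matrix([[1], [1]])  # column vector

# The line a fellow developer actually needs to read:
y = A * transpose(B) * C * x
```

The loops still exist, but they live in one place. The line that states the idea stays close to how it would be written on paper.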
Taking old code and giving it a shiny new look by doing small
rearrangements is a very powerful idea.
Just my two cents,
I think Bram missed the point of rpg's criticism of
Scheme continuations in Worse is Better. At the time exception
handling had been a part of Lisp for many, many years, first in the form
of catch/throw and later in the condition system that's part of Common
Lisp. rpg certainly knew the value of full featured exception
handling. From the point of view of "Worse is Better" implementing
continuations is the wrong decision: it gives power and convenience to
the user at the expense of the implementor. Because implementation is
harder, it's less likely that Scheme will spread like a virus, that is,
like Unix and C.
This is discussed at length amongst other things in taoup.