GUADEC III: Draining the Swamp

Looking for the way out?

Given at GUADEC III, Sevilla, April 2002.

This talk was given by Jim Gettys of Compaq (or I suppose it might be HP by now). It was about the great messy swamp we have created for ourselves, and what to do about it.

The size of the swamp

It all started when Jim's computing environment went stale. He had the opportunity to demonstrate the Linux desktop to high-ups at Compaq. And he wanted a nice new desktop to show them, since things are coming together very well at the moment. By the end of this year (2002), we shall have a range of very cool things: improved font rendering, 3D and freetype support in X; KDE 3.x; GNOME 2; good internationalisation; accessibility; Mozilla 1.x; cross-platform plug-ins; WINE; Evolution; and Xine. All of these will be (he thinks) stable and usable.

So off he went to upgrade his laptop. He tried at least four distributions, installing and configuring them. And it is not yet easy enough. We have a swamp. It comes from unforeseen consequences of old decisions, from good and bad ideas in the past, and even from personalities. We need fonts working out of the box. We need more multimedia support. We need online docs (and good ones). We have customisation problems. In the process of upgrading, he found:

General robustness problems
GNOME is not totally robust. Fill up your disk space whilst editing your preferences and GNOME as a whole does not handle the situation gracefully. (How hard is it to check whether the write succeeded?) I confess it's nice to know that even luminaries like Jim can spend a day figuring out what exactly broke, since I take ages too :)
App-specific robustness problems
App-specific robustness is very variable. Bug-stomping is sorely needed.
Multimedia problems
Work on multimedia was underway around 1992/1993, but then the UNIX desktop basically died. They were already looking at issues like synchronisation then. We still need to finish multimedia support now.
Proxy configuration problems
Firewalls and proxies. Exactly how many times should you have to enter and update all your proxy details? Having one proxy at work and then taking your laptop home and needing to say "Now use this proxy" is common. So why do you have to update this for every app? Why can't you do it once and have all applications notice it?
Documentation problems
When UNIX was first shipped, it came with documentation in the form of man pages for every program and every configuration file, all cross-referenced. Then RMS wanted a format he could edit in EMACS, and along came texinfo and the info command. Now we have DocBook as well! Jim commented that things are less well cross-referenced than when they started, and that nothing addressed that. Any solutions which try to address this will have to address all these formats, and retrofit things onto the old formats too.
Font problems
Jim's slide for this began, "First, there was X..." :) Jim said that it was only recently, after twenty years of a new format every five years, that they have seen how to deal with this. Font servers have trouble when you have lots of desktops. You start it up, it sits there and has a think, and then sends an awful lot of data over the wire: and it does this for every size of font. There were mistakes with the naming: something to do with string-matching breaking (or being broken by?) wild-cards. Keith Packard now has ideas about this, but the problem is that it must all be retro-fitted. If it's not, then the swamp is not being drained: it's being made bigger with the addition of yet another way to do it. It's underway. It may even be done within the year.
Mimetype hell
You can (and may well need to) specify what to do with specific file types in /etc/mailcap, KDE, GNOME, Netscape, Mozilla and plugins. That's six, and there's probably more. And they're all optimised for their specific application or environment. They are designed for that app or environment. They lack information or features that the others need. This is not a big problem in the sense of how to fix it. The solution is to abstract it more, and keep information or options the other apps need, even if the app in question doesn't care. But for users it is a huge huge problem.
Poor defaults
When we said that X should define mechanism rather than policy we didn't mean that policy should be absent. We need good defaults everywhere so that everything just works when started for the first time. Providing a million configuration options is no substitute for making the default right. You shouldn't have to meddle. His favourite example is apparently the XF86Config file.
Upstream, downstream?
Upstream maintainers should not assume that the downstream vendors and shippers will fix stupid defaults. It's your software, so you are going to know more about it than the downstream people. Downstream shippers in their turn should tell the upstream maintainers about problems and fixes. The XFree86 defaults are still broken on some displays, because the people who found the problems never mentioned it upstream so upstream didn't know.
Customisation too specific
We have Xdefaults. We have GNOME preferences. We have KDE preferences. From the user's point of view, this is silly. We need to try to unify things. This is a lesson we should learn from Microsoft: centralisation is a good thing. (Until it breaks, at which stage you're sunk.) Any such customisation tools should understand networks, which many current ones blatantly don't. Mobile computing is on its way and we need to be ready for it.
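
On the robustness point above: checking whether the write succeeded really is not much code. What follows is only a sketch in Python, not anything shown in the talk and not how GNOME itself does it; the function name save_preferences and the ".prefs-" temporary-file prefix are made up for illustration. It writes the new preferences to a temporary file in the same directory, and only replaces the real file once the data is safely on disk, so a full disk should leave the old preferences intact.

```python
import os
import tempfile


def save_preferences(path, text):
    """Write text to path; return True only if every step succeeded.

    Hypothetical helper for illustration. If anything fails (a full
    disk, a missing directory), the existing file is left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = None
    try:
        # Write to a temporary file next to the real one first.
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".prefs-")
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            # Force the data to disk; errors such as "no space left"
            # are raised as OSError here or in the write above.
            os.fsync(tmp.fileno())
        # Only now swap the new file into place.
        os.replace(tmp_path, path)
        return True
    except OSError:
        # Remove the partial temporary file, if one was created.
        if tmp_path is not None and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return False
```

The caller can then tell the user that saving failed, rather than quietly carrying on with a truncated preferences file.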

Draining the swamp

It wasn't so clear-cut as this in the talk, but the next part was more about things to be aware of in cleaning this mess up. One of the key themes was that pumping the swamp water further just to drain your local part is not an overall solution. For example, we need to consider likely consequences a lot more when adding things. What seems an improvement to the hacker adding it may not improve life for the typical user.

Also we need to consider libraries and dependencies. If GNOME comes up with a good solution that will work everywhere, but the solution requires all the GNOME libraries even if it's a server without X, it will not get used. It may be necessary to alter layers, split libraries (oh groan, more :)), or generally redo things so that other projects can make use of it. If the design does not serve other projects, then you have probably not provided all the requisite functionality anyway. A good example is Keith Packard's work on fonts, where he had to stop and split Xft(1) up into Xft2 and fontconfig.

Microsoft wrote in the Halloween papers that one of the worst things about free software (well, they called it OSS, but..) was that we can cherry-pick all the best ideas. We should be doing that. But that means that we have to run other systems to see what's there to take. Apparently some people in the hacking room whinged at Keith Packard when they saw KDE as well as GNOME on his laptop (sigh). They shouldn't be whinging. They should be trying to find out what it has that GNOME doesn't. We should be looking at Aqua, too; and we should be looking at Windows XP. In fact, given the huge size of Microsoft's market share, when it comes to setting defaults on applications, we should be using XP defaults unless there's a pressing reason not to. It will make new users feel at home.

Draining the swamp requires co-operation and compromise. If all the projects spend their time pumping their dirty water outside their little part, the swamp isn't actually getting drained. We need to work together. We may have to drop some of our things and use some of other people's. (And so may they.)

Then it was time for the questions and there were lots.

Questions

So how did the laptop Linux-on-the-desktop demo that started all this go?
There were several, apparently. Some are still yet to occur. So don't know yet.
How do you work with others? It was suggested that KDE people were reluctant to attend this year's GUADEC because although lots happened in the interoperability BOFs, not a lot concrete came out of it.
Keith Packard answered from the audience that providing patches helped a lot. He spent a month writing a patch so that Mozilla could use the fontconfig stuff -- which they then didn't use, so this is not a good example! But at the least, he knows the Mozilla community better now. Murray Cumming thought that talking first is better than just sending patches. There was more discussion about this. (I personally (Telsa) don't agree that not a lot came out of the interoperability BOFs, but oh well.)
There are policy constraints which complicate overarching solutions. For example, XML catalogs would simply not be expected to live in /etc/xml/CATALOGS on Solaris. What do you do then?
As the rate of change and fixing (and thus swamp-draining) increases, there will be more pressure on people and vendors to conform to standards, or they'll be the only ones not doing so.
Library hacking is not sexy. An application writer doesn't want to have to spend a month messing with libraries.
From the audience, Bradley Kuhn (of the Free Software Foundation) noted that the FSF is trying to act as a sort of clearing point to get the right people in contact, so if you have a problem and need to get the library sorted, people should be able to ask the FSF to point them at the right people.

There were a lot more questions and answers I simply didn't hear well enough to write up. Some more notes from the question session, with comments that weren't really questions or answers:

References and links