Projects I don't have time for

I think up all sorts of great projects that I don't have time to work on. It's a shame to just let the ideas rot, so I'm transcribing them here. If one of these sounds interesting and you have some time, feel free to drop me a note. Also, if you know of something that does something like what I describe here, I'd love to hear about it.

Useful archived mailing lists
As Usenet increasingly becomes a mass medium I'm more interested in using email lists for tightly focussed discussion groups. One nice thing about a mailing list is that the technology is really simple, but with a bit of work you can set up quite a nice service. For the average subscriber to a mailing list, some combination of web interfaces for subscriptions and mailing list archives and procmail to handle incoming mail can be a nice way to participate in a small community. Or if the list is available via NNTP (a very handy thing), read it as a newsgroup with Gnus or your favourite news:-capable Web browser.
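
To make the procmail half concrete, here's a rough sketch of the same job in Python: file each incoming list message into its own mbox folder based on the delivery address. The list addresses and folder names are made-up placeholders; procmail does this in a few lines of recipes.

    import mailbox
    import sys
    from email import message_from_binary_file

    # map list delivery addresses to local mbox files (made-up names)
    FOLDERS = {
        "swarm-talk@santafe.edu": "swarm-talk.mbox",
        "some-list@example.org": "some-list.mbox",
    }

    msg = message_from_binary_file(sys.stdin.buffer)
    recipients = " ".join(filter(None, [msg.get("To", ""), msg.get("Cc", "")])).lower()

    for address, folder in FOLDERS.items():
        if address in recipients:
            mailbox.mbox(folder).add(msg)      # append to that list's folder
            break
    else:
        mailbox.mbox("inbox.mbox").add(msg)    # not list mail: leave it in the inbox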

Mailing list providers have to do a bit more. The list itself should be administered via majordomo. It has its problems, but I don't know of anything better. Mailserv can be used to give a WWW interface to subscriptions, and either hypermail or MHonArc will handle archiving the list on the Web. If you want to provide an NNTP service for reading the list as well (no need to export it to the world, just locally), then mail2news and INN are a good combination. Finally, you'll want an efficient mail delivery backend to handle all the thousands of people who subscribe to your list: qmail is looking to be a good sendmail replacement, and bulk mailer might serve as a lightweight mail distributor.

All this software exists, but knowing how to use it is a bit arcane and the actual setup is a lot of work. A good project would be to document it all and somehow bundle the tools together to make it easy to install. Some of these programs could probably use improvement, too.

Real-time TCP/IP library
Ever since netrek switched to UDP, realtime Internet communication of nontrivial information has been possible. Netrek is a tricky thing: you've got 16 people talking at least 1k/second to a central server, and packets have to get in and out reliably and quickly or else no one can artfully dodge torps. Bandwidth has to be fairly high, and roundtrip latency over 250ms is really not acceptable. It's amazing how well it works. There are a few other realtime Internet games now, although none seem to work as well as netrek. And there are some interesting non-game realtime net applications, too: transmission of sound and video is becoming a major factor.

It seems like there's a lot of poorly understood art to getting a certain amount of data through the net subject to bandwidth and latency constraints. I don't really understand the technical issues well enough to write my own realtime net app. It might be interesting to learn, though, with an eye towards creating a library. It's a good question whether a general purpose library is even possible: the tradeoffs for video might be quite different from those for, say, a game. What I want my library to do for me is to guarantee a certain amount of reliability, bandwidth, and latency, and to let me tell it how to prioritize traffic if I need to exceed those guarantees.
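
The heart of the trick netrek-style games use is easy to sketch, even if the hard part (actually meeting the guarantees) isn't. A toy in Python: updates go out as UDP datagrams tagged with a sequence number and a priority byte, and the receiver simply drops anything older than what it already has. The packet layout and names are invented for illustration.

    import socket
    import struct

    HEADER = struct.Struct("!IB")   # sequence number, priority byte

    def make_packet(seq, priority, payload):
        return HEADER.pack(seq, priority) + payload

    def parse_packet(data):
        seq, priority = HEADER.unpack_from(data)
        return seq, priority, data[HEADER.size:]

    # loopback demo: one socket sends, the other receives
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    # send state updates out of order, to mimic network reordering
    for seq in (1, 3, 2, 4):
        tx.sendto(make_packet(seq, 0, b"state %d" % seq), rx.getsockname())

    latest = 0
    for _ in range(4):
        seq, priority, payload = parse_packet(rx.recv(1024))
        if seq <= latest:
            continue                # stale update: newer state already applied
        latest = seq
        print("applying", payload.decode())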

automailcrypt
mailcrypt is pretty cool: a really nice interface to PGP for emacs. Using it, I can sign and encrypt outgoing mail and news and decrypt and verify incoming mail and news. The problem is I still have to push buttons - I have to remember to encrypt, and I have to press extra buttons to decrypt mail. At a minimum, automailcrypt should automatically encrypt all outgoing mail if I have a key for the recipients, and should check all signatures. The other two functions (signing outgoing, decrypting incoming) require passphrases, so they have to be handled more carefully. Some override mechanism for all the automatic functions is needed. There was another mailcrypt add-on (whose reference I've lost) that did something like this, but it did too much and was too complicated.
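
The core decision is simple enough to sketch (here in Python rather than the emacs lisp it would really be): encrypt only when every recipient has a key on my keyring, otherwise fall back to just signing. The keyring lookup is stubbed out as a plain set of addresses; a real version would ask PGP.

    # made-up stand-in for a keyring query; really you'd ask pgp about each address
    KNOWN_KEYS = {"alice@example.edu", "bob@example.org"}

    def choose_action(recipients):
        """Decide what automailcrypt should do with an outgoing message."""
        addresses = [r.strip().lower() for r in recipients]
        if all(a in KNOWN_KEYS for a in addresses):
            return "encrypt"        # safe: every recipient can read it
        return "sign-only"          # at least one recipient has no key

    print(choose_action(["alice@example.edu"]))                      # encrypt
    print(choose_action(["alice@example.edu", "nokey@example.com"])) # sign-only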

Linux memory usage monitor
One of the neat things about Linux is that it uses all available memory for disk cache ("free memory is wasted memory"). This means that more RAM is always useful, so I've learned to be careful: switching from xterm to rxvt and not having a fancy background image. It makes a difference.

One thing that would be interesting and useful is to have a little tool that displays the usage of all RAM dynamically by process. See how much RAM each process is consuming, how much swap, both on its own and shared with other processes. There are various sources of this information: /proc/meminfo, ps -axm, and /proc/self/maps. These data sources are somewhat confusing and undocumented, so it will take some doing to get the needed data. It's also unclear exactly what we want to display.
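
Here's a rough Python sketch of the static half of it: walk /proc and print the biggest resident processes. It relies on /proc/<pid>/status and /proc/<pid>/comm, which are newer-kernel conveniences; the shared-versus-private accounting, the hard part, is left out entirely.

    import os

    def rss_kb(pid):
        """Resident set size in kB, from /proc/<pid>/status."""
        try:
            with open("/proc/%s/status" % pid) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        return int(line.split()[1])
        except (IOError, OSError):
            pass
        return 0

    def name(pid):
        try:
            with open("/proc/%s/comm" % pid) as f:
                return f.read().strip()
        except (IOError, OSError):
            return "?"

    pids = [p for p in os.listdir("/proc") if p.isdigit()]
    usage = sorted(((rss_kb(p), p, name(p)) for p in pids), reverse=True)
    for kb, pid, cmd in usage[:15]:
        print("%8d kB  %6s  %s" % (kb, pid, cmd))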

Quickcam support for Unix
Connectix sells a really nifty little CCD based camera called a "Quickcam" for $100. Unfortunately, they only have software that runs under Windows or on Macs. But a group of intrepid hackers has done a nice job of figuring out how to drive the camera (mostly without specs!), and you can use the thing under Linux quite effectively. I've even written a document on using CU-SeeMe with the Quickcam that describes all the hacks needed. The next big step is to make a standard kernel interface and shake out the last remaining problems with driving the camera. It's been fun working on this thing.

html-helper-mode 3.0
My html-helper-mode beta has been out and relatively stable for a long time. I need to clean up some details, write new docs, and get the damned thing out. I'm actually going to do this.

Small Web Browser
Netscape drives me nuts. Here I have this 4 meg monolith of a program that does way more than I want, I don't have the source code, it's not extensible, and it's buggy. All I want is a simple program that speaks the http, nntp, gopher, and ftp protocols and interprets HTML, images, and maybe Java. I don't want a mail reader or newsreader, and I don't want Motif bloat.

I do want to be able to extend the thing, though. There are some simple things I want - quick lists of all the hrefs and <h#>s in a document, for instance. In general, I want the ability to make my browser help me read the Web better. But of course, Netscape is a huge piece of code and writing a new web browser, even a small one, would be a nightmare. Maybe GROW will save us all. And NCSA Mosaic is actually pretty slick these days, but its network loading code is still single threaded. If that were fixed, I'd probably use it.
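
That href-and-headings list is the sort of thing that ought to be a page of code, not a browser rewrite. A sketch in Python, using its standard HTML parser (the output format is just a placeholder):

    from html.parser import HTMLParser

    class Outline(HTMLParser):
        """Collect every href and every <h1>..<h6> heading in a page."""

        def __init__(self):
            super().__init__()
            self.hrefs = []
            self.headings = []
            self._heading = None    # tag name while inside a heading

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for attr, value in attrs:
                    if attr == "href" and value:
                        self.hrefs.append(value)
            elif len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
                self._heading = tag

        def handle_endtag(self, tag):
            if tag == self._heading:
                self._heading = None

        def handle_data(self, data):
            if self._heading and data.strip():
                self.headings.append((self._heading, data.strip()))

    page = Outline()
    page.feed("<h1>Projects</h1> <a href='http://www.santafe.edu/'>SFI</a>")
    print(page.headings)    # [('h1', 'Projects')]
    print(page.hrefs)       # ['http://www.santafe.edu/']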

IRC
IRC, a realtime chat system on the Internet, has been around for about five years. It's surprisingly popular: when I started four years ago, there were maybe 1200 people online. Now on the main EFNet network there are regularly 15,000, Undernet has another 5000, and there are probably another 5000 scattered around other networks. Like many Internet services, IRC is collapsing under its own weight. EFNet is a mess - server splits are common in the evening, lag is bad, and it's getting hard to have a conversation.

IRC is designed as a centralized, top-down network of servers. The servers pass traffic around trying to stay synchronized; individual clients connect to servers and listen in. The problem is that the server network is too brittle: servers get out of sync, get lagged or drop out entirely, and the whole net falters. On top of that, EFNet's server operators generally don't seem to understand how to make the network work right. Back in the old days, at least, half the servers were run by vanity ops who wanted to be cool but didn't understand what was going on.

Undernet tries to address the IRC problems by improving the quality of the servers. The protocols have been improved a bit, and the control of the network is even more centralized - a small committee coordinates the network. I haven't spent a lot of time on Undernet (the people I want to talk to aren't there), but it generally looks pretty good. No telling how well they'd handle the same load EFNet has, though.

I believe that at some point, any centralized network like IRC will collapse under its own weight. It would be interesting to try to apply some of the principles of bottom-up systems to redo IRC. One thought people have kicked around for a couple of years is to go serverless: have channels served by a floating system of servers that spontaneously form to handle traffic as need be. This could go anywhere from no server software at all, with clients passing around messages, to something much like the current design but more loosely coupled.

In order to do this redesign effectively, a lot of thought needs to go into it. Simulating how proposed designs respond to heavy traffic would be a big help. But as with all projects like this, the hardest part is finding the time to implement it and convince the world to use it.
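
Even a toy simulation would tell you something about topologies. Here's the flavor of it in Python: model the server net as a graph and measure the worst-case number of hops a channel message takes to reach everyone. The topologies below are invented; a real study would feed in actual link maps and traffic traces.

    from collections import deque

    def hops_from(origin, links):
        """Breadth-first distance from one server to every reachable server."""
        dist = {origin: 0}
        queue = deque([origin])
        while queue:
            node = queue.popleft()
            for neighbor in links.get(node, ()):
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return dist

    def worst_case(links):
        return max(max(hops_from(n, links).values()) for n in links)

    # sixteen servers in a long chain versus the same sixteen in a star
    chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 16] for i in range(16)}
    star = {0: list(range(1, 16))}
    star.update({i: [0] for i in range(1, 16)})

    print("chain worst-case hops:", worst_case(chain))   # 15
    print("star worst-case hops: ", worst_case(star))    # 2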

Grand Usenet redesign
Usenet has a few serious problems: nothing fatal - Usenet always survives its problems - but they should be fixed, especially if Usenet is going to grow. The hard part isn't so much designing improvements, it's implementing them and getting them accepted. The last big Usenet revolution was INN, but that only happened because INN was a major improvement and was almost a drop-in replacement for Cnews.

The main thing I want to see is for Usenet to stop being a store-and-forward network and move towards servers giving out articles, but with sophisticated caching schemes to reduce load. For heavily trafficked groups I'd expect tens of thousands of copies of the same article to be scattered all over. This isn't much different in effect from the current scheme. But for less-read groups fewer copies would be needed, and so there's less overhead in keeping them. If done right, you could have a fairly sophisticated adaptive distribution framework trading off disk space, server CPU usage, and network bandwidth against readership demand. Doing this would marry Usenet to the Internet forever, and that has unfortunate political consequences, but the fact is Usenet as it exists now couldn't exist without the Internet.
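
A back-of-the-envelope sketch of the caching policy, in Python: a server keeps a local copy of an article only for as long as readership justifies the disk space, and refetches from upstream if somebody asks later. Every number here, and the policy itself, is made up for illustration.

    import time

    class ArticleCache:
        def __init__(self, popular_keep=2 * 24 * 3600, unpopular_keep=3600, min_reads=3):
            self.popular_keep = popular_keep
            self.unpopular_keep = unpopular_keep
            self.min_reads = min_reads
            self.articles = {}      # message-id -> [body, read count, time fetched]

        def fetch(self, message_id, fetch_upstream):
            if message_id in self.articles:
                entry = self.articles[message_id]
                entry[1] += 1
                return entry[0]
            body = fetch_upstream(message_id)   # ask the origin or a parent server
            self.articles[message_id] = [body, 1, time.time()]
            return body

        def expire(self):
            """Unpopular articles go quickly; popular ones are worth keeping."""
            now = time.time()
            for mid, (body, reads, fetched) in list(self.articles.items()):
                keep = self.popular_keep if reads >= self.min_reads else self.unpopular_keep
                if now - fetched > keep:
                    del self.articles[mid]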

Some other things I'd like to see on Usenet... It'd be terrific if all Usenet articles were permanently archived in some URL-addressable format. Or better yet, use URNs - just specify the message ID and rely on a URN resolution framework to figure out where that article can be found. There are some nice ad hoc ways to do this now; if nothing else, just count on DejaNews to do the resolution.
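
The resolution step itself is almost nothing. A Python sketch, with completely invented archive URL patterns standing in for whatever real services end up doing the work:

    import urllib.parse

    # placeholder archives: each pattern takes a quoted message ID
    ARCHIVES = [
        "http://archive.example.org/article/%s",
        "http://mirror.example.net/usenet/%s",
    ]

    def candidate_urls(message_id):
        """Turn a message ID into a list of places the article might live."""
        mid = message_id.strip("<>")
        quoted = urllib.parse.quote(mid, safe="")
        return [pattern % quoted for pattern in ARCHIVES]

    for url in candidate_urls("<4b2c1$abc@some.host.edu>"):
        print(url)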

alt.* needs to be integrated into the mainstream Usenet in some fashion: alt is currently something like 70% of the net traffic; the anarchical upstart has dwarfed the respectable parent. There are good groups inside of alt (alt.security.pgp, for instance), and they need to be taken out of the noise. Along with mainstreaming the useful stuff in alt.*, people need to stop distributing binaries, especially pictures, on Usenet. It's ludicrous that the same 200k picture of someone naked is stored on 50,000 computers, has been transmitted to all of them, only to be deleted two days later.

gcc Objective C compiler/runtime interface
gcc, in addition to supporting C and C++, also supports Objective C, a simple and very nice object oriented extension to C (used in my project Swarm, among other things). The problem is that the gcc implementation of Objective C has a couple of wrinkles: the message call code is not portable to AIX, and the code is almost there to support metaobject protocol magic, but not quite.

One choice would be to hack gcc so that the runtime did what I want. That'd be ok, but it would be even better to design the compiler so that the interface it requires to the runtime is well designed and documented, so people could write their own runtime libraries. I know several people interested in doing this, but no one has the time to sit down and really do it.
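
The piece of that interface that matters most is message dispatch: the compiler emits a call to a runtime lookup function, and everything else (the class structures, the metaobject tricks) hangs off how that lookup works. Sketched conceptually in Python, since a real runtime would be C; the names and structures here are my assumptions, not gcc's actual ones.

    class ObjCClass:
        """A toy class structure: a method table plus a superclass pointer."""
        def __init__(self, name, superclass=None, methods=None):
            self.name = name
            self.superclass = superclass
            self.methods = methods or {}

    def msg_lookup(cls, selector):
        """What the compiler needs from the runtime: selector -> implementation."""
        while cls is not None:
            if selector in cls.methods:
                return cls.methods[selector]
            cls = cls.superclass          # walk up the superclass chain
        raise AttributeError("does not respond to " + selector)

    Object = ObjCClass("Object", methods={"describe": lambda receiver: receiver.name})
    Agent = ObjCClass("Agent", superclass=Object)

    # a message send like [anAgent describe] boils down to lookup-then-call
    imp = msg_lookup(Agent, "describe")
    print(imp(Agent))      # inherited from Object: prints "Agent"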

pgp in kernel space
I've been working some with PGP, the public key crypto package. Nice stuff, but it's severely compromised by running it on a multiuser system. Running the most sensitive aspects of PGP (in particular, secret key management) in kernel space could make things more secure. Linux makes it possible to augment the kernel with this sort of functionality, but it's hard to design and notoriously difficult to implement correctly. Linux also now has the option to mark some of user memory as special, not to be swapped to disk. Making PGP at least use that to protect your key from being swapped would be a good thing.
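
The swap half of that is the easy part; it's the mlock(2) system call. The real fix belongs in PGP's C code, but here's the shape of it, called from Python via ctypes just to show the idea:

    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

    # a buffer that will hold key material; lock it so it never hits swap
    secret = ctypes.create_string_buffer(4096)
    if libc.mlock(ctypes.addressof(secret), ctypes.sizeof(secret)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, "mlock failed (needs privilege or memlock limit headroom)")

    # ... read the secret key into `secret` and use it ...

    # wipe the buffer before unlocking and letting it go
    ctypes.memset(secret, 0, ctypes.sizeof(secret))
    libc.munlock(ctypes.addressof(secret), ctypes.sizeof(secret))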

Newsreader
Switching to Gnus v5 got me thinking about newsreaders. I've been talking with a couple of other folks about what a great newsreader would look like. Some features we'd like:

MOD file formats
Way back in the dark ages the Amiga brought computer music to the masses with the MOD file format, a sort of primitive sequencing format that produced music good enough for demos and games. MOD still exists today, with software on Unix, DOS, Macs, etc. MOD files have been augmented to allow lots of nifty effects (stereo pan, fades, etc), more samples, and so on. But the format is still ridiculous: binary, so you can't read it or easily write code to manipulate it. It has stupid restrictions: some variants only allow 64k samples, or only 64 samples total, or require all sequences to be exactly 64 notes long. All of the software I've ever seen is heavily hacked in assembler.

What's needed is a sequence format that's human readable, with no built-in arbitrary limitations. I suspect it would be useful to have minimal control structures built in, to allow loops easily and (maybe) some nondeterministic playback. Don't want it to get too complicated, though. This format might exist already, I don't know: maybe MIDI formats are good enough. I definitely know I don't want to take 3 months to learn how to write a player that will interpret my cool format, especially with the knowledge that it's unlikely anyone else would ever use it.
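
Just to show how little it would take, here's a toy in Python of what a human-readable format and its parser might look like. The syntax is completely made up, and the actual player is left as the hard part.

    SONG = """
    tempo 125
    sample bass  bass.wav
    sample snare snare.wav

    pattern intro
      bass  C-2
      snare ---
      bass  C-2
      snare C-2
    end
    """

    def parse(text):
        song = {"tempo": 120, "samples": {}, "patterns": {}}
        current = None
        for line in text.splitlines():
            words = line.split()
            if not words:
                continue
            if words[0] == "tempo":
                song["tempo"] = int(words[1])
            elif words[0] == "sample":
                song["samples"][words[1]] = words[2]
            elif words[0] == "pattern":
                current = song["patterns"].setdefault(words[1], [])
            elif words[0] == "end":
                current = None
            elif current is not None:
                current.append((words[0], words[1]))   # (sample name, note or rest)
        return song

    print(parse(SONG)["patterns"]["intro"])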

xbattle 6.0
xbattle is a really nice abstract game, a good combination of strategy, tactics, and speed. Work is currently underway on a client/server version of the game that, if done correctly, could make xbattle playable over the Internet and maybe expand the options available for types of games. Alas, I just don't have time to help out much.

Projects I wanted to do, and were already done

It's such a joy when you have a great idea, no time to do it, and then you find someone else did it better than you would have.

web of trust statistics
pgp key security is based on the notion of "the web of trust": people sign each other's keys, and with enough cross-signing you can verify any key via a chain of signatures. I'm curious how well this has worked so far. The PGP keyserver has over 16000 keys on it: that's a pretty big dataset for the web of trust.

Considered as a giant directed graph (nodes are keys, edges are signatures), how well connected is the web of trust? My guess is not very - I'd expect to find some 5000 or so disconnected components, with the largest connected component being about 100 people. But I'm not sure, and the data is there; it's just a matter of collecting it.
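
The measurement itself is a few lines once you have a dump of the keyserver: treat keys as nodes, signatures as edges, and count connected components with union-find. A Python sketch, with a toy keyring standing in for the real data:

    def component_sizes(keys, signatures):
        parent = {k: k for k in keys}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]    # path halving
                x = parent[x]
            return x

        for signer, signee in signatures:
            parent[find(signer)] = find(signee)  # merge the two components

        sizes = {}
        for key in keys:
            root = find(key)
            sizes[root] = sizes.get(root, 0) + 1
        return sorted(sizes.values(), reverse=True)

    keys = ["alice", "bob", "carol", "dave", "eve"]
    signatures = [("alice", "bob"), ("bob", "carol")]   # dave and eve signed nothing
    print(component_sizes(keys, signatures))            # [3, 1, 1]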

And as of Feb 4 1996, someone had done it! Neal McBurnett's Analysis of PGP Keyserver Statistics is pretty much what I had in mind. My guess was pretty far off. There's one big connected component of 2775 keys, and then a bunch of little tiny components that don't amount to much. Lots of other interesting data, too.

Cryptographic login authentication
Typing passwords over the Internet is stupid. It's majorly insecure, not to mention inconvenient. The algorithms and protocols exist to make secure authentication over the Internet possible: it's just a matter of coding them up. Again, hard to implement right.

Fortunately, ssh has done this for me, and much better than I could ever have done it. The one (minor) drawback is I can't use my PGP key for authentication - I have to use some other RSA key that ssh has generated. I can live with that.

X11 reconnecting proxy server
X was somewhat revolutionary in allowing network graphics: your client can run on one machine, the X server on another, and via TCP/IP the client displays on your server. But the notion of "server" is too rigid for my tastes - a server is a monolithic process that has to be attached to a physical screen. If someone would just write a proxy server that was intended to not always be connected to a real server, it would be fantastic. The idea here is that you run your proxy server all the time and tell your clients to display there. Then when you sit down at a computer, you connect that machine's server to your persistent proxy and it forwards the display to you. When you're done with the machine you're sitting in front of, shut down the local server and the proxy server sticks around to keep your clients happily running. xmove seems to do this. It should be handy for working around programs that demand to be connected to an X server to run.


Nelson Minar <nelson@santafe.edu>
Last modified: Sun Dec 15 15:32:13 EST 1996