Journal 2004/03/31

Ant and XML

My friends Mike Clark and Glenn Vanderburg asked me to write up my current thoughts on using XML as the build file format for Ant, based on watching it being used by Java developers over the years. It's very educational, as well as very sobering, to have software ideas that you put into form used by tens of thousands of programmers every day. Let me just say that you learn a lot from the process.

A version of this document (probably edited for size) will appear in Mike's next book, Pragmatic Project Automation, which is part of the Pragmatic Starter Kit.

The first version of Ant didn't have all the angle brackets that you see sprinkled all over its build files. Instead, it used a properties file and the java.util.Properties class to define what tasks should be executed for a project. This worked really well for small projects, but started breaking down rapidly as projects grew.

The reason it broke down was the way that Ant views the world: A project is a collection of targets. A target is a collection of tasks. Each task has a set of properties. This is obviously a hierarchical tree. However, properties files only give you a flat key=value mapping, which doesn't fit this tree structure at all.

In the first version of Ant (which didn’t see a release outside of my group at Sun), I faked a tree structure by using property names with a dot notation—similar to domain names and Java package names. From memory, a rough approximation of these early build files is:


When there were 10 or 20 keys in the build file, this wasn't so bad. But as projects grew in complexity, editing these files became an exercise in managing the visual noise created by all the repeated initial parts of property keys. It was clear to me that using properties for the build file syntax just wasn't sustainable in the long term.
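To make the dotted-key idea concrete, here is a minimal sketch in Java. The key names in SAMPLE are hypothetical and chosen only for illustration; they are not the format of those early build files. The sketch loads flat key=value pairs with java.util.Properties and rebuilds the tree that the dots imply. Notice how often the prefix "target.compile.javac" has to be repeated even in this tiny sample.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class DottedKeys {
    // Hypothetical key names, for illustration only (not the original Ant format).
    static final String SAMPLE =
        "project.name=demo\n" +
        "target.compile.javac.srcdir=src\n" +
        "target.compile.javac.destdir=build\n" +
        "target.dist.jar.destfile=demo.jar\n" +
        "target.dist.jar.basedir=build\n";

    // Parse properties-file text into a flat key=value table.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Rebuild a nested tree from the flat dotted keys.
    // Assumes no key is both a leaf and a prefix of another key.
    @SuppressWarnings("unchecked")
    static Map<String, Object> toTree(Properties props) {
        Map<String, Object> root = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            String[] parts = key.split("\\.");
            Map<String, Object> node = root;
            // Walk (and create) one nested map per dotted segment.
            for (int i = 0; i < parts.length - 1; i++) {
                node = (Map<String, Object>) node.computeIfAbsent(
                    parts[i], k -> new TreeMap<String, Object>());
            }
            // The last segment names the leaf that holds the value.
            node.put(parts[parts.length - 1], props.getProperty(key));
        }
        return root;
    }

    public static void main(String[] args) {
        System.out.println(toTree(load(SAMPLE)));
    }
}
```

The tree exists only in the naming convention: the file itself stays flat, so every line has to carry its full path.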

I wanted a hierarchical file format that would capture the way that Ant viewed the world. But I didn't want to create my own format. I wanted to use a standard one—and more importantly I didn't want to create a full parser for my own custom format. I wanted to reuse somebody else’s work. I wanted to take the easiest way possible.

At the time, XML was just breaking out onto the radar. The spec was out and final, but hadn't been for long. SAX had become a de facto standard, but we didn't yet have JAXP. I was convinced that XML was going to be the next big thing after Java. Portable code and portable data. Two buzzphrases that go well together.

Even better, since XML viewed data as a tree structure, it seemed like a perfect fit for the kinds of things that needed to be expressed in a build file. Add in that XML was still a hand-editable text-based format and it seemed like a marriage made in heaven. And, I didn't have to write a parser. The deal was done.
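As a rough illustration of that fit, the sketch below feeds a tiny Ant-style build file to a SAX parser and prints the project/target/task tree it contains. The sample project and the class name BuildFileOutline are made up for this example, and the code uses the JAXP SAXParserFactory that ships with today's JDK for convenience, which is not what was available back then. It is not how Ant itself reads build files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class BuildFileOutline {
    // A made-up sample project in ordinary Ant build file syntax.
    static final String SAMPLE =
        "<project name=\"demo\" default=\"dist\">\n" +
        "  <target name=\"compile\">\n" +
        "    <javac srcdir=\"src\" destdir=\"build\"/>\n" +
        "  </target>\n" +
        "  <target name=\"dist\" depends=\"compile\">\n" +
        "    <jar destfile=\"demo.jar\" basedir=\"build\"/>\n" +
        "  </target>\n" +
        "</project>\n";

    // Return one indented line per element: project, then targets, then tasks.
    static List<String> outline(String xml) {
        List<String> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < depth; i++) {
                    sb.append("  ");
                }
                sb.append(qName);
                for (int i = 0; i < attrs.getLength(); i++) {
                    sb.append(' ').append(attrs.getQName(i))
                      .append('=').append(attrs.getValue(i));
                }
                lines.add(sb.toString());
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse build file", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        outline(SAMPLE).forEach(System.out::println);
    }
}
```

The nesting comes straight from the file format, and the parser is somebody else's work, which is exactly the appeal described above.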

In retrospect, and many years later, XML probably wasn't as good a choice as it seemed at the time. I have now seen build files that are hundreds, and even thousands, of lines long and, at those sizes, it turns out that XML isn't quite as friendly a format to edit as I had hoped for. Also, when you mix XML and the interesting reflection-based internals of Ant that provide easy extensibility with your own tasks, you end up with an environment which gives you quite a bit of the power and flexibility of a scripting language, but with a whole lot of headache in trying to express that flexibility with angle brackets.

Now, I never intended for the file format to become a scripting language—after all, my original view of Ant was that there was a declaration of some properties that described the project and that the tasks written in Java performed all the logic. The current maintainers of Ant generally share the same feelings. But when I fused XML and task reflection in Ant, I put together something that is 70-80% of a scripting environment. I just didn't recognize it at the time. To deny that people will use it as a scripting language is equivalent to asking them to pretend that sugar isn't sweet.
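For readers who haven't looked inside a task, the fusion of XML and reflection can be sketched roughly like this. It is a simplified illustration under stated assumptions, not Ant's actual implementation: the Greet class and the configure helper are hypothetical. The idea is that each attribute name found on an XML element is mapped to a setter method on a plain Java object, so a new element with new attributes becomes available as soon as somebody writes a class for it.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

class ReflectiveTaskDemo {
    // Hypothetical task: a plain class with setters and an execute() method.
    public static class Greet {
        private String message = "";
        private String name = "";

        public void setMessage(String message) { this.message = message; }
        public void setName(String name) { this.name = name; }

        public String execute() { return message + ", " + name; }
    }

    // Map each attribute name to a setXxx(String) method and invoke it.
    static void configure(Object task, Map<String, String> attributes) {
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            String attr = entry.getKey();
            String setter = "set" + Character.toUpperCase(attr.charAt(0))
                + attr.substring(1);
            try {
                Method method = task.getClass().getMethod(setter, String.class);
                method.invoke(task, entry.getValue());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException(
                    "unsupported attribute: " + attr, e);
            }
        }
    }

    public static void main(String[] args) {
        // Stands in for the attributes of <greet message="hello" name="world"/>.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("message", "hello");
        attrs.put("name", "world");

        Greet task = new Greet();
        configure(task, attrs);
        System.out.println(task.execute());
    }
}
```

Once every element is effectively a function call with named arguments, it is a short step to people wanting conditionals and loops around those calls.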

If I knew then what I know now, I would have tried using a real scripting language, such as JavaScript via the Rhino component or Python via JPython, with bindings to Java objects which implemented the functionality expressed in today's tasks. Then, there would be a first-class way to express logic and we wouldn't be stuck with XML as a format that is too bulky for the way that people really want to use the tool.

Or maybe I should have just written a simple tree-based text format that captured just what was needed to express a project and no more, and which would avoid the temptation for people to want to build a Turing-complete scripting environment out of Ant build files.

Both of these approaches would have meant more work for me at the time, but the result might have been better for the tens of thousands of people who use and edit Ant build files every day.

Hindsight is always 20/20.