Way back in the Thirties, the linguist Benjamin Lee Whorf spent some
time investigating the language of the Hopi Indians in Arizona. Hopi is
a very interesting language that does not have a mechanism for
distinguishing past, present, and future. This fascinated Whorf, who was
interested in the degree to which the languages we speak affect the way
we view the world.
To cut a long and utterly fascinating story very short, the Sapir-Whorf
Hypothesis proposes that the language we use -- at least to some extent
-- determines the way we view and think about the world around us.
Cut to the year 2002.
The place: A work cubicle near you.
The scene: Two developers fight over how best to model a piece of
Developer A: "It is obviously an element with a sub-element."
Developer B: "No, it is clearly a single element type with an attribute."
The Sapir-Whorf Hypothesis adds an interesting dimension to such debates
that I first appreciated having read William Kent's excellent book "Data
and Reality". I need to paraphrase Kent here as his book predates XML
by a quarter of a century:
In XML, you are more likely to model a concept as an element rather
than an attribute if the natural language you are working in has a
noun for the concept.
Cut to some time in the near future.
The place: A work cubicle somewhere in Ireland.
The scene: A developer ponders an XML model of a Bishop.
The model that results from this modeling exercise depends on whether the
developer thinks in Irish or English. Why? Because in the Irish
language, a single word -- a noun -- means "back of the knee".
The word is "iscoid". Now before you ask, I do not have a 100,000 word
Irish vocabulary! The reason I know this obscure word is that I remember
a tongue twister I learned as a kid that goes like this:
Ta niscoid ar iscoid an Easpaig, agus ta imni ar an Easpaig faoin
niscoid ata ar a iscoid.
Translated into English this says:
There is a boil on the back of the Bishop's knee. The Bishop is
worried about the boil that is on the back of his knee.
So, a developer thinking in Irish is likely to model the back of a
Bishop's knee as a sub-element of a knee element:
A developer thinking in English is likely to model it in terms of its
position relative to the rest of the knee (or indeed the Bishop):
<knee position = "back">
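To make the contrast concrete, here is a minimal sketch in Python that parses one possible rendering of each model and asks where the boil lives. The markup is hypothetical: the element names "bishop", "iscoid", and "boil" and the helper boil_location are my own assumptions for illustration, not a definitive schema for either developer.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a language with a noun for "back of the knee"
# invites a dedicated sub-element.
IRISH_MODEL = "<bishop><knee><iscoid><boil/></iscoid></knee></bishop>"

# Hypothetical markup: without such a noun, the position ends up
# as an attribute on the knee itself.
ENGLISH_MODEL = '<bishop><knee position="back"><boil/></knee></bishop>'


def boil_location(xml_text):
    """Return the tag and attributes of the element that directly holds the boil."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        # find() looks at direct children only; compare against None because
        # an element with no children of its own is falsy.
        if parent.find("boil") is not None:
            return parent.tag, dict(parent.attrib)
    return None


irish_location = boil_location(IRISH_MODEL)
english_location = boil_location(ENGLISH_MODEL)
print(irish_location)
print(english_location)
```

Both documents record the same fact, but the first answers "where is the boil?" with an element name and the second with an attribute value, so any code that reads them has to know which language the modeler was thinking in.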
A lecturer of mine in college was fond of saying that Computer Science is
basically mathematics with a bit of English thrown in. In a similar
vein, XML modeling could be viewed as language with a bit of Computer
Science thrown in.
Modeling aircraft must be a fun experience in Hopi as all things that
fly share the same noun: "masa'ytaka".
Try handling that modeling problem without resorting to attributes!
William Kent, "Data and Reality", 1stBooks Library, ISBN: 1585009709