Master Foo's Taxation Theory of Microformats

ITworld.com 5/1/2006

Sean McGrath, ITworld.com

One bright spring morning in 2006, a delegation of knowledge management experts arrived at the zenith of Pentimenti mountain to meet with Master Foo. Their trip to the mountain top had been a noisy one, featuring a non-stop debate about ontologies, RDF, XML and the Semantic Web.

As they approached him, Master Foo stuffed lumps of his flowing gray beard into his ears to drown out the noise. The knowledge management experts, on seeing this, fell silent and sat down.

"Are you familiar", Master Foo began, "with the concept of Value Added Tax?"

The experts were taken aback by this question. Coming from anybody other than Master Foo, it would have elicited a dismissive response. However, Master Foo's reputation for bizarre analogies was well known to the experts.

"It is a form of taxation in which the value chain itself collects tax on behalf of government", one of the experts explained. "Enterprises that buy goods to sell on can reclaim the value added tax (VAT). Only the final consumer actually pays the VAT."

"Indeed", said Master Foo. "Only the consumer pays the VAT. Now, pretend that complexity in information systems is a form of taxation. Would the VAT model be an appropriate model?"

"Hmmm. I don't think so", another expert began. "In information systems, complexity is normally spread out along the value chain. If I publish some information in a custom XML format, for example, all consumers and intermediaries who process that information have to pay the XML complexity tax. They need to know about my custom schema, about validation, about stylesheets, about transformation and so on."

"Indeed", said Master Foo. "Now who benefits from the fact that the information is in XML?"

"Well, everyone benefits", responded the expert, puffing out his chest. "Structured information is fundamentally more useful than unstructured information all the way up and down the value chain!"

Master Foo stroked his beard. "But what if I just want to see a piece of information or print it or search it or e-mail it or copy/paste it into my word processor? Should I have to pay the XML complexity tax for these simple requirements?"

"Well... no, I suppose not", replied the expert, wondering what sort of trouble he was letting himself in for with this answer.

"Complexity is like taxation", Master Foo began. "Producers and consumers alike will always see a tax as unfair unless they can see benefits to themselves in paying it. XML - as classically articulated - suffers from the fact that the costs and benefits associated with its usage occur in different places in the value chain. As a publisher, I will definitely benefit from XML, but as a consumer, I will only benefit from it if I have (a) technical expertise and (b) non-trivial integration requirements. If I just want to print or search or e-mail or copy/paste, the XML appears to be an unfair complexity tax to me."

"But Master Foo", a third expert began, "we all agree that the simple things should be simple and that the hard things should be possible. It is unfortunate that XML is perceived as complex. Perhaps this is just a passing phase while browsers, e-mail readers, word processors and so on become more XML aware? Perhaps the simple case of printing, searching and copy/pasting will be simple in a few years' time?"

"I remember a visit from the SGML experts to this very mountain, about 20 years ago. They said the same thing. Then the XML experts came about 10 years ago. They said the same thing too. I detect a pattern..."

More beard stroking ensued as Master Foo struggled to deliver himself of this thought:

"History has shown that the only way to reduce information processing complexity to zero is for applications on the value chain to have built-in knowledge of the information formats they work with. XML, RDF, UML and so on are not formats so much as meta-formats. Consequently, built-in knowledge of any of these does not amount to zero complexity. XML is not a file format. It is a file format for file formats. This distinction is all-important when calculating complexity.

"Today, XML is suffering from the fact that users realize how little the phrase 'XML aware' actually means. The complexity persists all the way along the value chain. They compare the ease of handling word processor files or HTML files with the handling of XML files and scratch their profit/loss spreadsheets in wonder. Why is it so complicated?"

Master Foo paused, his eyes twinkling in a way that indicated that he knew the answer but wanted one of the experts to say it first...

"Perhaps...", one of the experts began. "Perhaps we need some way to reduce the complexity tax by hiding the XML better than we have done heretofore..."

"Go on", said Master Foo with the faint hint of a smile crossing his lips.

"Perhaps we can use the fact that mainstream file formats are migrating rapidly to fixed XML-based notations, i.e. XHTML, ODF, RSS/Atom and so on..."

"Yes, go on...", said Master Foo.

"Perhaps we can hide custom XML languages *inside* these standard XML languages, using them as general-purpose semantic containers. We could arrange that XHTML files or ODT files look and feel like ordinary files to XHTML-aware or ODT-aware tools and yet, inside these files we can hide the semantic information we would normally use a custom XML schema for."

"Indeed", said Master Foo. "Doing so would reduce the complexity tax to almost zero. End-users or re-sellers who simply need to print/copy/search/e-mail information can do so without even knowing that the files they are processing contain hidden XML-based languages."

"And yet", he continued, "the harmless-looking files would contain all the information that a skilled XML-aware engineer would need in order to extract the semantic structures from the files as required."

Approximately half of the experts stayed seated, deep in thought. The remainder started on their way down the mountain, muttering under their breaths that Master Foo was nuts.

"Does this concept have a name, Master Foo?", a remaining expert asked.

"There is one guy just as crazy as I who calls it Semantic Steganography[1]. Most normal people will know of it as MicroFormats[2]."

"We will take a closer look at this technique, Master Foo. Thank you for your time."

The experts made very little noise going down the mountain.
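
For readers who want to see what the experts took down the mountain, here is a minimal sketch of the idea, not a definitive implementation. It assumes an ordinary XHTML page whose class attributes follow the hCard microformat (vcard, fn, org). The person, the company and the function names are invented for illustration. A reader who only wants to print or search sees plain text; an engineer who knows where to look can pull out the structure.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# An ordinary-looking XHTML page. The class names vcard, fn and org come
# from the hCard microformat; the person and company are made up.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Contacts</title></head>
  <body>
    <p>Our consultant this week is
      <span class="vcard"><span class="fn">Ada Example</span> of <span class="org">Example Widgets Ltd</span></span>.
    </p>
  </body>
</html>"""


def has_class(element, name):
    """True if the element's space-separated class attribute contains name."""
    return name in element.get("class", "").split()


def plain_text(xhtml_text):
    """What the print/search/copy-paste reader sees: just the visible text."""
    body = ET.fromstring(xhtml_text).find(XHTML_NS + "body")
    return " ".join("".join(body.itertext()).split())


def extract_cards(xhtml_text):
    """What the XML-aware engineer sees: one dict per class="vcard" element."""
    root = ET.fromstring(xhtml_text)
    cards = []
    for element in root.iter():
        if not has_class(element, "vcard"):
            continue
        card = {}
        for child in element.iter():
            for prop in ("fn", "org"):
                if has_class(child, prop):
                    card[prop] = "".join(child.itertext()).strip()
        cards.append(card)
    return cards


if __name__ == "__main__":
    print(plain_text(PAGE))
    print(extract_cards(PAGE))
```

The same file serves both audiences: a browser or word processor needs no knowledge of the hidden vocabulary, so only the party who wants the semantics pays the complexity tax.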

[1] http://seanmcgrath.blogspot.com/2005_04_17_seanmcgrath_archive.html#111373018875397589

[2] http://microformats.org/

Sean McGrath is CTO of Propylon. He is an internationally acknowledged authority on XML and related standards. He served as an invited expert to the W3C's Expert Group that defined XML in 1998. He is the author of three books on markup languages published by Prentice Hall. Visit his site at: http://seanmcgrath.blogspot.com.

Copyright © Accela Communications, Inc. All rights reserved