Unix Review, July 2002

Regular Expressions: Economy of Means

by Cameron Laird and Kathryn Soraiz

Richard Suchenwirth is Tcl's Telemann. That is, he's both prolifically inventive and approachable.

During the week, Suchenwirth is a software development engineer for Siemens Dematic Postal Automation. His work includes character recognition in mail-sorting machines used by post offices throughout the world. For a couple of years, though, he's entertained himself by composing little jewels of the scripting craft on his own weekend time. His remarkable output holds many lessons -- even for those who'll never work with the Tcl language.

Is "Simple" Good or Bad?

Tcl is known as a sparse language. Critics mocked Tcl during several famous episodes for its poverty, summarized in the slogan, "Everything is a string." That was intended to be harsh -- what good is a language that doesn't respect the difference between numbers and other characters and tries to do without arrays, references, and more abstract data structures?

But Tcl's fans have turned the slur on its head. For them, "everything is a string" epitomizes the language's simplicity and comprehensibility. Tcl programmers think about dataflow, with no need to involve themselves in "casting" or memory allocation. Suchenwirth's personal mantra in favor of simplicity is, "I'm not afraid of anything, if everything is a string!"
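As a minimal sketch of what "everything is a string" means in practice (the variable names are ours, chosen only for illustration), the string values below serve in turn as a number, a list, a set of key/value pairs, and executable code, with no casting anywhere:

```tcl
# Illustrative sketch only; all variable names are hypothetical.
set n "41"
set sum [expr {$n + 1}]                 ;# the string "41" is used as a number

set words "alpha beta gamma"
set count [llength $words]              ;# a plain string is used as a list
set second [lindex $words 1]

array set size "width 640 height 480"   ;# a string of key/value pairs fills an array

set script {expr {6 * 7}}
set answer [eval $script]               ;# a string is evaluated as code
```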

Suchenwirth showcases his creations in the Tcl-ers' Wiki, a collaborative repository for source code, explanations, and related material. A few dozen Tcl programmers typically contribute to this Wiki in any given week.

Suchenwirth indexes his own favorites on his page. His output generally falls into three broad and occasionally overlapping categories, which we call:

  1. "Whizzlet" revelations that demonstrate what impressive results Tcl can achieve in a few lines.
  2. "Hidden Wonders" that reveal how easily Tcl can support constructs that are generally regarded as restricted to other languages.
  3. "Decorations" or "Toys" that have no immediate practical significance beyond their aesthetic delight.

The "one-line Web browser" is an example of a Whizzlet. It's easiest to understand when rewritten slightly as separate commands:

  package require vfs
  vfs::urltype::Mount http
  set argv [list http://www.unixreview.com]
  # "hv.tcl" is part of the Tkhtml package.
  source hv.tcl

In isolation, this isn't particularly persuasive. Several modern languages have powerful libraries that bring advanced results within just a few lines of end-user programming. However, this example does illustrate the convenience of flattening out data into strings. Tcl's Tkhtml (HTML display engine) and VFS (virtual file system) authors worked independently, without ever needing to coordinate "header files" or "data structure formats". With all data appearing as strings, it was natural for their combination, as seen above, to work properly the first time it was coded.

While several people "touched up" the micro-browser, its most immediate author appears to have been Vince Darley, a scientist for Eurobios in London. Suchenwirth's own most compact whizzlet is the digital abbreviation of a clock:

  proc every {ms body} {eval $body; after $ms [info level 0]}
  pack [label .clock -textvar time]
  every 1000 {set ::time [clock format [clock sec] -format %H:%M:%S]}

This is the time-keeping core of a slightly more verbose analogue display he keeps at: http://wiki.tcl.tk/2563.

Geographic Showcase

Suchenwirth combines several of these little nuggets of functionality into his "special favorite". Tclworld is a readable program that displays a map of the globe. The map is active; mouse clicks zoom it in and out, call up supplementary information on locations, provide geographic names, annotate the map, and so on. All data are human-readable; it takes only a moment to add your favorite locale or effect a name change.

Figure 1

The Whizzlets show how far Tcl can go, even without the accessories other languages boast. Suppose you really want a read-only variable (a constant), though, or a lambda or generic operation or curry? Tcl doesn't build in any of these.

However, Suchenwirth and his colleagues have demonstrated that it takes only a few lines of Tcl to implement each one. Moreover, because Tcl is designed to be extensible, once you have the definition for any of these, you can use it as though it were built in. The new code you write has all the power and scope of the standard Tcl core.
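To give a feel for how little code this takes, here is a minimal sketch of three such constructs. It is our own illustration, not the Wiki's code: the names const, const_guard, lambda, curry, and multiply are hypothetical, and the sketch assumes Tcl 8.4 or later for "trace add".

```tcl
# Sketch only: const, lambda, and curry are illustrative names, not built-in commands.

# const: create a read-only variable in the caller's scope.
# A write trace restores the original value and raises an error.
proc const {name value} {
    upvar 1 $name var
    set var $value
    trace add variable var write [list const_guard $value]
}
proc const_guard {value name1 name2 op} {
    upvar 1 $name1 var
    set var $value
    return -code error "constant is read-only"
}

# lambda: define an anonymous procedure by generating a unique name for it.
set ::lambda_count 0
proc lambda {params body} {
    set name ::lambda_[incr ::lambda_count]
    proc $name $params $body
    return $name
}

# curry: bind leading arguments of a command under a new command name.
proc curry {name cmd args} {
    eval [list interp alias {} $name {} $cmd] $args
}

# Usage: each new command is called exactly like a built-in one.
const pi 3.14159
set square [lambda {x} {expr {$x * $x}}]
proc multiply {a b} {expr {$a * $b}}
curry double multiply 2
```

After this, "$square 7" and "double 21" behave like any other Tcl command invocation, while an attempt such as "set pi 3" fails and leaves pi unchanged.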

Figure 2

One recent example is Suchenwirth's "functional imaging" project. Although Tcl is rarely regarded as a functional language in the way such languages as Haskell and Clean are, Suchenwirth has constructed commands that allow for Tcl programming in a functional style. He combines this with aspects of his lovely strimj project to model the Pan imaging research published by Microsoft's laboratories.

strimj is a collection of commands for manipulation of plain-text string representations of images. This might remind readers of the Scalable Vector Graphics (SVG) standard, which also renders graphic images in a plain-text format. (We will examine SVG later this year.) strimj differs from SVG, though, in its extreme compactness and simplicity. As Suchenwirth writes, it makes graphics programming so easy that his 11-year-old daughter can correct his mistakes. He demonstrates strimj's power with construction of a Mongolian font (!) and an animation system.
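The idea of an image held as a plain-text string can be sketched in a few lines. The format and command names below are our own assumptions for illustration, not strimj's actual commands: one text line per pixel row, "@" for a set pixel and "." for a clear one. The sketch assumes Tcl 8.5 or later for "lreverse" and "string reverse".

```tcl
# Sketch only: this image format and these proc names are hypothetical,
# not strimj's actual commands.
set arrow "..@..
.@@@.
@@@@@
..@..
..@.."

# Turn the image upside down by reversing the order of its rows.
proc flip_vertical {img} {
    join [lreverse [split $img \n]] \n
}

# Mirror the image left to right by reversing each row.
proc mirror {img} {
    set rows {}
    foreach row [split $img \n] {
        lappend rows [string reverse $row]
    }
    join $rows \n
}

# Swap set and clear pixels.
proc invert {img} {
    string map {@ . . @} $img
}

puts [flip_vertical $arrow]
```

Because the image is ordinary text, correcting a picture is a matter of editing characters in the string.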

Visual Delights

The final broad category in this collection comprises "toys" -- curiosities intended as much to entertain as anything. Keith Vetter of SoftBook Press, another Tcl contributor, recently wrote "Octabug", a captivating animation of geometric morphing. Early last year, Suchenwirth and others worked on several clever animations of locomotive trains. Although none of these images compares with game- or film-quality computer-generated graphics, each takes only a few lines of understandable Tcl to create. That's what makes them fascinating -- they achieve so much with so little.

Figure 3

Suchenwirth's public output teaches programmers several lessons, whether they're working in Tcl or other languages. Most immediate are these:

  • Simplify, simplify. Even the most elementary programming constructs can do great things, if used cleverly.

  • Study the greats. Learn how to use the idioms of your favorite language by reading source code from the experts.

  • From the right vantage point, almost anything can be beautiful. You might think you need linked lists of abstract objects for a particular problem, but squinting your eyes the right way sometimes turns up a perfectly good solution in terms of strings and subroutines.

  • Extensibility is a powerful idea. Look for ways to use your favorite language to manage "little languages" that most aptly express your problem domains.

One puzzle that Suchenwirth's demonstrations leave unanswered is why he's so generous. Why does he put so much expertise and effort into programs he gives away for free? Suchenwirth explained in a private email that he was "not only a giver to the Wiki, but every now and then also a taker," referring to occasions when others point out his mistakes or improve on his work. For the most part, though, it is his own amazement at the results from his "fun projects" that compels him "to share that amazement (and the explanations and code) with the Tcl world out there." The "free and uncensored publication medium" of a Wiki is the ideal outlet for someone who has already co-produced a couple of books and whose work-day programming is narrower than the span of his imagination.

Browsing the Tcl-ers' Wiki is a good idea, even if you don't intend to become an expert in Tcl. If nothing else, follow the recipe that appears on the Wiki page titled "When all you want is to run a Tk application" and enjoy the animations and other graphics on your own screen. All the topics mentioned in this column, along with many related ones (How can you enter Russian words on an English keyboard? How can you embed a proof engine in just a few lines?) appear on pages whose URLs appear above. Also see http://www.regularexpressions.com/ for pointers to older columns that touch on these subjects.

Any hint of a "cult of personality" concerns Suchenwirth. He's far more comfortable dealing with programming ideas than with the spotlight of attention turned on him as an individual. ActiveState Tools Corporation, though, has turned just that spotlight on him and over a dozen other open-source scripting contributors. Its second annual Active contest for top programmers is closing on July 17, 2002, just about the time this column appears. Visit the award Web site to learn more about these luminaries and, if you're in time, cast your own vote.

Finally, note that there are thousands of collaborative Wikis. While the Tcl-ers' is one of the best, we'll be reporting in the future on highlights from several other projects, including the recently revitalized "Python Wiki" at: http://www.python.org/cgi-bin/moinmoin/.
