Dirty SNOBOL Page: Don't try this at home!

SNOBOL and SPITBOL are programming languages that favour dirty programming. In the past decade or so, I came up with quite a few really horrible pieces of code, that, yet, are quite fascinating to explore. Check out these Programs for Pattern Lovers, Spaghetti Statements, and Gorgeous Goto Constructions!

Problem 1: Bottles of Beer

In the following link, solutions in all kinds of programming languages for the "99 bottles of beer" song are given. See if your favorite language is among them:

99 Bottles of Beer

Not all of the codings give complete solutions. The counter should go from 99 to 0, at 1 it should give "1 bottle" (or, better: "just one bottle"), and at 0, it should give "no more bottles".

With this in mind, straightforward SPITBOL coding would give you:

*-----------------------------------------------------*
          Bottles = (N = 99) " bottles"
Sing      TERMINAL = Bottles " of beer on the wall,"
          TERMINAL = Bottles " of beer"
          TERMINAL = "Take one down and pass it around"
          N =  N - 1                            :F(END)
          Bottles = (EQ(N,0) "No more",N) " bottle"
+                   (EQ(N,1) , "s")
          TERMINAL = Bottles " of beer on the wall,"
          EQ(N,0)                              :S(Sing)
END
*-----------------------------------------------------*

OK, this is pretty straightforward, right? Let's try it another way (no arithmetic, no labels, no gotos, no control structures -that's what you think):

*-----------------------------------------------------------*
          Beer    = " of beer"
          Wall    = " on the wall"
          Take    = "Take one down and pass it around"
          NL      = CHAR(10) CHAR(13)
          Bottle  = *?((B1 = (EQ(N1,1) "Just one bottle",
+                                   N1 " bottles"))
+                      (B2 = (EQ(N2,1) "Just one bottle",
+                             (EQ(N2,0) "No more",N2)
+                              " bottles")))
          Sing    = *?(TERMINAL = B1 Beer Wall NL B1 Beer NL
+                                 Take NL B2 Beer Wall NL)
          Supply = DUPL("B",99)
          Count  = ("B" *Count | "")
          Remove = @N2 "B" @N1
          Supply ? FENCE Count Remove Bottle Sing FAIL
END
*-----------------------------------------------------------*

Eat this. It works, but it is definitely a piece of sick programming. Last line looks a bit like FORTH doesn't it? But where do you think is the control structure? Where is the counter? How in the world can this work?

Problem 2: Formatting Text

This problem originates back in the mid-eighties, when several students and associates of the department of Computational Linguistics took on a bet regarding a Spitbol program that would have to format a raw input text. That is to say, the program had to read raw text, split it up into lines less than a given size, and then fill up the lines with blanks up till the text width. This filling had to be done left to right and right to left on alternating lines.

I started when I came up with a rather short version. This was essentially the following (I have adapted it for PC-Spitbol; back in those days we worked with Spitbol 370 on a VM/CMS machine):

*--------------------------------------------------------------------*
        INPUT(.In,1,"File1.txt")
        OUTPUT(.Out,2,"File2.txt")
        NoBlanks = NOTANY(' ') | NOTANY(' ') *NoBlanks
        Tw = 60
Init    Word @P Word $ Line
        I = 1                                                  :(Word)
Buffer  Buffer = TRIM(In)                                     :F(Last)
Word    Buffer ((NOTANY(" ") BREAK(" ")) $ Word SPAN(" ")) |
+               (NOTANY(" ") RTAB(0)) $ Word =              :F(Buffer)
        Line = (IDENT(Line) , Line " ")
+               LT((SIZE(Line) + SIZE(Word)),Tw) Word         :S(Word)
        Out = IDENT(Line) GE(SIZE(Word),Tw) Word              :S(Word)
        Line POS(0) NoBlanks . Out RPOS(0)                    :S(Init)
        FlipFlop = 1 - FlipFlop
        Line = EQ(FlipFlop,1) REVERSE(Line)
Test    Out = EQ(SIZE(Line),Tw)
+             (EQ(FlipFlop,0) Line,REVERSE(Line))             :S(Init)
        Line POS(P) BREAKX(" ") $ v DUPL(" ",I) NOTANY(" ") $ a @P =
+              v DUPL(' ',I + 1) a                            :S(Test)
        Line POS(0) @P LEN(I + 1) @I                           :(Test)
Last    Out = Line
END
*--------------------------------------------------------------------*

This program, with only 13 statements and fitting on the back of a matchbox challenged several people to write a version with less statements. Unfortunately, I cannot recall the consecutive record breakers, but up till now, noone has done it in less than three statements. The following, reall horrible and incredibly dirty version of three Spitbol statements was my last contribution to the bet. We agreed at the time, that noone could do it better. Well, there is a solution: put the whole program in one string, and execute that in one CODE() statement, but we agreed that this "solution" was invalid):

*-------------------------------------------------------------------*
Loop         "60 " ? BREAK(" ") $ Tw *GE(Tw,1)
+                    INPUT("0",1,"File1.txt")
+                    OUTPUT(.Out,2,"File2.txt")
+                    DEFINE("Read()")
+                    DEFINE("Format(Line,P)Add")
+                    SUCCEED *~(Format(Format(Read()),"Print")
+                    Format(REVERSE(Format(REVERSE(Read()))),
+                    "Print") ) *Format(Buffer,"Print")   :(END)
Read         Buffer ? ((POS(0) BREAKX(" ") " ") . x SPAN(" ")) |
+                     (POS(0) BREAKX(" ") . Read @P
+                    ((*LE(P,Tw)
+                        SPAN(" ") BREAK(" ") . x @Q *GT(Q,Tw) ) |
+                     (*GT(P,Tw)
+                        SPAN(" ") (REM | NULL) . x ) ) ) |
+                    (RPOS(0) $ x )
+             = TRIM(REVERSE(TRIM(REVERSE(x " " $(SIZE(x)))))) " "
+                                                        :F(FRETURN)
+                                 S($((DIFFER(Read) .RETURN,.Read)))
Format       Line ? (*IDENT(P,"Print") REM $ Out ABORT) |
+                   (*GE(SIZE(Line),Tw) REM $ Format ABORT) |
+                   (POS(0) (LEN(Add + 1) . pp BREAK(" ") . pre
+                    SPAN(" ") . Target @Add) |
+                   (NULL . pp BREAK(" ") . pre
+                    SPAN(" ") . Target @Add) |
+                  (LEN(SIZE(Line)) $ Format ABORT) ) =
+                   pp pre Target " "             :S(Format)F(RETURN)
*-------------------------------------------------------------------*
END

If anyone can read this, or figure out how it works, he or she is at least as mad as I am. Some people should never be allowed to write computer code again. However: it works. Download it to your PC, run it with Catspaw's Spitbol 386 compiler and bingo.

If anyone thinks he can do better, these are the rules:

Problem 3: Xmas greetings

This is a little program I wrote in december 1990. I sent it to all people I knew that worked with Snobol or Spitbol. It's not easy to figure out how it works. It certainly qualifies as dirty. If you can predict the outcome without running it, drop me a mail. I want to meet you.

*---------------------------------------------------------------------*
*                                                                     *
*      For SNOBOL4 fans only                                          *
*                                                                     *        
*      A New Year's wish, written in 1990                             *
*                                                                     *        
*      Peter-Arno Coppen                                              *
*                                                                     *
*---------------------------------------------------------------------*
       DEFINE("ADMINISTRATION(C,P)")
       DEFINE("PICK(X)")
       OPSYN("!","BREAK",1)
       OPSYN("#","DUPL",2)
       OPSYN("/","SIZE",1)
       OPSYN("|","LEN",1)
       NEW        = 91 (THIS WAS NEXT YEAR)
       OOPS       = TABLE()
       STAR       = "*"
       CODE       = "NROBAEYIFAP .OT YCS Y.WNMSPERMEPN R "
+                    " DLX R4- EORHAALPEDA EYNNA AT  MNOLS PROE"
       UNRAVEL    = SUCCEED TAB(*PICK(SIZE(CODE))) LEN(1) $ C @P
       GROW       = (!STAR $ W *(STAR # (1 + 2 * (7 - /W)))) $ TERMINAL
+                    *IDENT(W) (|9 $ TERMINAL | |9 $ TERMINAL ABORT)
+                    FAIL
       TREE       = (" " # 7) (STAR # 15) (" " # 6) ("|" # 3)
*---------------------------------------------------------------------*
       CODE       ? UNRAVEL *ADMINISTRATION(C,P)
*---------------------------------------------------------------------*
       TERMINAL   = BEST WISHES TO ALL OF YOU
       TERMINAL   = AND WATCH THE
                    TREE ? GROW                                  :(END)
*---------------------------------------------------------------------*
ADMINISTRATION
+      OOPS<P>    = IDENT(OOPS<P>) 1                        :F(FRETURN)
       WISHES     = WISHES C
       WISHES     ? TAB(SIZE(CODE))                :F(FRETURN)S(RETURN)
*---------------------------------------------------------------------*
PICK           NEW        = REMDR(NEW * 41,211)
       PICK       = REMDR(NEW,X)                              :(RETURN)
*---------------------------------------------------------------------*
END

Problem 4: String splitting

This is a problem that many SNOBOL programmers will encounter sooner or later. You have this file that contains records longer than your ASCII printer can handle and you want to split file in records less than, say, 72 characters. But you want to break at a space of course, ignoring trailing (and probably leading) spaces.

I presented this as a puzzle to the SNOBOL4 mailing list (snobol4@mercury.dsu.edu), and three people joined a discussion: Ulf Bro, who came with a solution very different from the one I had in mind, and of which -I confess- I couldn't see how it worked, Gordon Peterson II, who solved the puzzle in one pattern match, not unlike the way I did it in two, namely reversing the string, and finally Robert Dewar, who merely suggested a way to do it.

I will present my last posting on the subject so far, as it summarizes the discussion a bit:

I'm truly sorry, but those of you who think this is going to far better delete this mail right now. The string splitting algorithm keeps haunting me, and I did some testing that 2 or 3 of you might find interesting. Maybe there's a lesson to be learned somewhere, but I'm too crazy to see it.

Remember the problem? Split a string into pieces of 72 characters or less, breaking on spaces. Ulf Bro and Gordon Peterson II came up with elegant pieces of code that were designed to split one large string into smaller strings and directly assigning them to output. What I originally had in mind was just the "reverse-the-string" idea which worked fine in many of my programs in the last 20 years or so. On top of it all, Dewar suggested a single match solution that would even be faster.

Naturally, the way to evaluate the various claims was to implement them and see what they did. So I took a moderately large file (about 55k) with some raw text I typed in using a normal ascii editor. I then adapted Ulf's and Gordon's pieces of code to make it resemble mine, and I programmed Dewar's suggestions in two different patterns, one recursive and the other one a SUCCEED-FAIL trick.

Here are the 5 programs I used:

*--------------------------------------------------------------------*
*
*               My own solution       
*
*--------------------------------------------------------------------*
               Start = TIME()
               &anchor = 1
               &trim = 1
               TW = 72
Read            lang = lang REPLACE(input,CHAR(9)," ") " "    :S(Read)
Loop           REVERSE(lang) ?
+                    (RTAB(TW + 1) BREAK(" ") @Q SPAN(" ") | "") @P
               lang (TAB(1) RTAB(*P)) $ output
+                   RTAB(*Q) =                                :S(Loop)
                TERMINAL = (TIME() - Start) " msec"
END
*--------------------------------------------------------------------*

*
*              Algorithm by Ulf Bro
*
*
               Start = time()
               &trim = 1
               TW = 72
Read           lang = lang REPLACE(input,CHAR(9)," ") " "     :S(Read)
beiss          lang LEN(TW + 1) . kurz =                      :S(teil)
               output = lang                                   :(last)
teil           kurz = kurz " "
               kurz BREAK(" ") . letzt POS(TW + 1) =      
               output = kurz
               lang = letzt lang                              :(beiss)
last           TERMINAL = (time() - Start) " msec"
END
*--------------------------------------------------------------------*
*
*       Algorithm by Gordon Peterson II
*
*--------------------------------------------------------------------*
        Start = time()
        &trim = 1
        TW = 72
read    lang = lang replace(input,char(9)," ") " "            :s(read)
        lang = reverse(lang)                   :f(last)
reploop lang fence (((rtab(TW + 1) break(" ")) . lang2 len(1)) |
+       (rtab(TW + 1) . lang2) |
+       (null . lang2)) (len(1) rem) . outstr =
+       lang2 ?(output = reverse(outstr))                  :s(reploop)
last    terminal = (time() - Start) " msec"
end
*--------------------------------------------------------------------*
*
*       Dewar's suggestion in a recursive pattern       
*
*--------------------------------------------------------------------*
           Start = TIME()
           &trim = 1
           &anchor = 1
           TW = 72
           Print     = *?(output = X) SPAN(" ") *Break ABORT
           Check     = *LE(Q - P,TW)
           LookAhead = SPAN(" ") BREAK(" ") @R *LE(R - P,TW)
           Break = (@P BREAKX(" ") @Q
+                  FENCE(Check FENCE(LookAhead | Print) | Print)) $ X
+                  FAIL | RTAB(0) $ output
Read       lang = lang REPLACE(input,CHAR(9)," ") " "         :S(Read)
           lang Break
           TERMINAL = (TIME() - Start) " msec"
END
*--------------------------------------------------------------------*
*
*               Dewar's suggestion in a SUCCEED FAIL pattern
*
*--------------------------------------------------------------------*
               Start = TIME()
               &anchor = 1
               &trim = 1
               TW = 72
               Initialize = *?(L = TW + 1)
               Begin      = TAB(*P)
               Spaces     = FENCE(SPAN(" ") | "")
               EOS        = (RPOS(0) ABORT | "")
               Line       = LEN(*(L = L - 1)) $ X " " @P
               Print      = *?(output = X)
               Break = Initialize SUCCEED
+                      Begin Spaces EOS Line Print
+                      Initialize FAIL
+                      (RPOS(0) ABORT | "")
Read           lang = lang REPLACE(input,CHAR(9)," ") " "     :S(Read)
               lang Break
               TERMINAL = (TIME() - Start) " msec"
END
*--------------------------------------------------------------------*

Timing results are given at the bottom of this file. The results were very disappointing for my own solution. It was by far the slowest. Well, I couldn't believe I had been using the slowest algorithm all these years, so I designed a second test, in which I used all algorithms in the way I usually did with my own: namely, reading record by record and chopping of whatever is possible. Any user of Macro Spitbol from the days its heap was only 64k will know why I always did that. Anyway, I had to rewrite Ulf's and Gordon's algorithms a bit (see for yourself, boys, whether I did it correctly). This resulted in the following 5 programs:


               Start = TIME()
               &ANCHOR = 1
               &trim = 1
               TW = 72
Read           lang = lang REPLACE(input,CHAR(9)," ") " "      :F(Last)
Loop           REVERSE(lang) ?
+                    RTAB(TW + 1) BREAK(" ") @Q SPAN(" ") @P   :F(Read)
               lang (TAB(1) RTAB(*P)) $ output
+                   RTAB(*Q) =                                 :S(Loop)
Last           output   = lang
               TERMINAL = (TIME() - Start) " msec"
END
*
*              Algorithm by Ulf Bro
*
*
               Start = time()
               &trim = 1
               TW = 72
Read           lang = lang REPLACE(input,CHAR(9)," ") " "      :F(last)
beiss          lang LEN(TW + 1) . kurz =                :S(teil)F(Read)
teil           kurz = kurz " "
               kurz BREAK(" ") . letzt POS(TW + 1) =   
               output = kurz
               lang = letzt lang                               :(beiss)
last           output = lang
               TERMINAL = (time() - Start) " msec"
END

*
*      Algorithm by Gordon Peterson II
*
*
        Start = time()
        &trim = 1
        TW = 72
read    lang = reverse(replace(input,char(9)," ") " ") lang    :f(last)
reploop lang (fence(tab(TW + 1) | abort) fail | "")
+            fence (((rtab(TW + 1) break(" ")) . lang2 len(1)) |
+       (rtab(TW + 1) . lang2) |
+       (null . lang2)) (len(1) rem) . outstr =
+       lang2 ?(output = reverse(outstr))            :s(reploop)f(read)
last    output = reverse(lang)
        terminal = (time() - Start) " msec"
end

           Start = TIME()
           &trim = 1
           &anchor = 1
           TW = 72
           Print     = *?(output = X) SPAN(" ") *Break ABORT
           Check     = *LE(Q - P,TW)
           LookAhead = SPAN(" ") BREAK(" ") @R *LE(R - P,TW)
           Break = @P (FENCE(LEN(*(TW + 1)) | ABORT) FAIL | "")
+                  FENCE(SPAN(" ") | "") (@P BREAKX(" ") @Q
+                  FENCE(Check FENCE(LookAhead | Print) | Print)) $ X
+                  FAIL | @P
Read       lang = lang REPLACE(input,CHAR(9)," ") " "         :f(last)
           lang Break
           lang TAB(P) =                                       :(Read)
last       output = lang
           TERMINAL = (TIME() - Start) " msec."
END

           Start = TIME()
           &anchor = 1
           &trim = 1
           TW = 72
           Initialize = *?(L = TW + 1)
           Begin      = TAB(*P)
           Spaces     = FENCE(SPAN(" ") | "")
           EOS        = (RPOS(0) ABORT | "")
           Enough     = @P (FENCE(LEN(*L) | ABORT) FAIL | "")
           Line       = LEN(*(L = L - 1)) $ X " " @P
           Print      = *?(output = X)
           Break = @P Initialize SUCCEED Begin Spaces
+                  EOS Enough Line Print
+                  Initialize FAIL
+                   (RPOS(0) ABORT | "")
Read       lang = lang REPLACE(input,CHAR(9)," ") " " :f(last)
           lang Break
           lang TAB(P) =                              :(Read)
last       output = lang
           TERMINAL = (TIME() - Start) " msec"
END
*--------------------------------------------------------------------*

Surprise, surprise. Here are the timings. I timed running a few times and taking the average runtime for the whole program. Reading and writing went through standard IO:


*--------------------------------------------------------------------*
                   P.A.    ULF  GORDON  RECURSIVE  SUCCEED
LONG STRING       30100  25260   27130    13780     13350
SHORT STRINGS      1700   2115    2620     2250      2040
*--------------------------------------------------------------------*

Apparently, the short string version is best. I suppose this comes from the fact that messing around with lots of data is expensive. Also, it might be the case that the abundance of spaces in a normal text favours the short string solutions. So it might not be the best way in all cases.

Anyway, I think I'll keep doing it my way after all.