Atari BASIC

The Good, the Bad, and the Ugly

Introduction

It's been a long time since I last saw my Atari 400. I don't remember the date exactly, but I do remember the circumstances: hooking it up in the living room one last time to prove it worked before selling it for $90. Ever since buying my 520ST (what a stinker) I hadn't been using my 8-bit much and as a poor student I thought I needed the money.

So it was a melancholy end to what was likely one of the most educational experiences of my life: learning about computers on the wonderful 6502. This story is about one part of that experience: learning, loving and hating Atari BASIC.

The other BASICs

In the late 1970s the personal computer world was dominated by three players: CP/M, Apple II, and the TRS-80. Any number of other machines were out there, but none of them were really a force in the market. One possible exception was the Commodore PET, which did well in schools due to its tough all-metal construction.

As was typical of the day all of the machines were completely different. The CP/M world was all about professionalism and standardization. They offered huge cases for expansion, floppy disks even though they cost thousands, and even a whopping 48k of RAM. The TRS-80 was based on the same Z-80 chip as the CP/M machines but used its own OS and expansion system. Its main claim to fame was that it was sold in Radio Shacks all over the US. The Apple was based on the radically different 6502 and was about color graphics, a feature few machines offered while still running on TVs as the display.

Yet all of these machines did share one thing in common: they all came with BASIC. In fact the vast majority of the BASICs out there were based on some variant of the original Microsoft BASIC for the Altair 8800, which was (or would be) the first CP/M machine. Only the TRS-80 version was significantly different. As time went on and the market consolidated, the MS version became even more of a de facto standard.

It's not that MS BASIC was anything to write home about. It was never very fast, and certain parts of the syntax give me the shivers even today. But it did cover all of the bases in terms of the commands it supported, and was ported to just about every platform. Although you lost much of the power of your particular platform when writing to "standard" MS BASIC, you could write a program for it and be relatively sure it would run on just about any machine out there.

Given the amount of software that was written in BASIC back then (the computer was slow anyway...) it quickly started to reinforce itself in strange ways. Even if your machine's BASIC didn't use MS as its basis, MS programs were so common that you quickly learned how to convert them into and out of your dialect. So MS BASIC became the Java of its day.

Atari BASIC

The machines that would go on to become the Atari home computers had started their life as a video game machine, and then changed in the midst of the design cycle into computers. And what does every computer of the era need? BASIC!

The only problem was that Atari had never sold a general purpose computer before, and there was essentially no experience with programming languages in the company. So like anyone faced with this problem, they went out and bought a copy of Microsoft BASIC, in this case the new "8k version". That didn't fix the problem though.

MS BASIC had been written for the 8080 CPU used in the CP/M machines, and the port to the 6502 had expanded it to something around 12k. It was also incomplete and almost entirely undocumented. As a result the Atari engineers were struggling to get the BASIC onto an 8k cartridge, even without any support for some of the graphics and sound features that would really make the machine a success. They eventually threw in the towel and went shopping for someone who could get BASIC onto the machine in time for their release at the 1978 winter CES (January 1979). They found SMI.

SMI was a small programming company with impressive credentials: they had written a port of BASIC to the Apple II, written Apple II DOS, and were in the midst of polishing off Cromemco Extended BASIC. I think the Atari team was expecting them to simply get the MS code up and running, but the SMI team came back with a proposal to create an entirely new version of the language. The result was Atari BASIC, and Atari BASIC was different.

Atari BASIC, the Good

As I now understand it there were essentially two BASICs in the world – on the one side were the DEC-a-likes, and on the other the Data General-a-likes. When Bill and Paul started to develop BASIC for the Altair they copied the DEC version and the result was MS BASIC. The SMI folks started with the DG version and Atari BASIC ended up "the other way". Because both of the languages had the same distant ancestors they had many features in common, but vive la différence!

Atari BASIC has one of the clearest syntaxes of any BASIC I've seen – and that includes modern languages that have been given the name "BASIC". A couple of examples are in order.

In all versions of BASIC you use the command LIST to view the contents of the currently loaded program. The command typically supports three basic modes:

LIST : list the entire program
LIST 10 : list only line 10 of the program
LIST 10-20 : list lines 10 to 20

Waitasecond, 10 minus 20? Shouldn't that list line negative 10? And why the special syntax for LIST, when the comma is already widely used for separating parameters? Well, here's the Atari BASIC way:

LIST 10,20

Now you have to admit, that makes a lot more sense.

Another oddity of most BASICs was that some commands in the language could be used inside a program or outside, but not both. For instance, here's one that can be used either way:

PRINT "hello world"
10 PRINT "hello world"

But the LIST command could only be used outside the program in "immediate" mode. This was legal:

LIST 20

whereas this was not:

10 LIST 20

Atari BASIC removed this limitation; all commands could be used in immediate mode (no line numbers) or inside a program (add line numbers). This allowed me to win those "write the shortest program that generates itself as its output" contests with this simple program:

1 LIST

Actually you could make it even shorter. Many BASICs allowed short forms for some of the commands, but the Atari version provided a cleaned up version for any command. In English we shorten words into abbreviations with a period (Doctor becomes Dr., abbreviation becomes abbr.) and Atari BASIC followed this principle. An even shorter version of the program above is:

1 L.

More than a little bit of thinking was put into this. The actual logic when decoding commands was to read down the list of keywords alphabetically until you found a match, or spit out a syntax error if there wasn't one. They then shuffled the list of commands and placed the most used commands closer to the top of the stack.

For instance, you tend to use PRINT a lot more than PLOT. So they put PRINT at the top of the list of commands starting with P, so P. would mean PRINT. What's the short form for PLOT then? PL..

They even shuffled REM to the very top of the stack, so the short form of REM was a single period. Now a single period is no shorter than the single quote that MS used for this purpose, but this system added the short form of REM without adding any special syntax or code.
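The lookup described above is easy to model. What follows is only a sketch in Python, not the cartridge's code: the table is a hypothetical, heavily shortened stand-in for the real keyword list, and the only thing taken from the story is the relative order (REM first, PRINT ahead of PLOT).

```python
# Hypothetical, shortened keyword table. Only the relative order of the
# entries (REM first, PRINT before PLOT) comes from the article.
KEYWORDS = ["REM", "PRINT", "PLOT", "LIST", "GOTO", "GOSUB"]


def expand(word):
    """Return the full keyword for `word`, or None for a syntax error."""
    if word.endswith("."):
        prefix = word[:-1]
        # The first table entry that starts with the typed prefix wins,
        # so table order decides what each abbreviation means.
        for keyword in KEYWORDS:
            if keyword.startswith(prefix):
                return keyword
        return None
    return word if word in KEYWORDS else None


for typed in [".", "P.", "PL.", "L.", "PRINT", "X."]:
    print(typed, "->", expand(typed))
```

Because an empty prefix matches everything, a bare period falls out as whatever sits first in the table, which is how REM gets its one-character form without any special case.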

Another one of those examples of removing limitations that adds power is Atari BASIC's handling of jumps and branches. In most BASICs line numbers are hard-coded constants in your program, so if you want to jump to another line it would be something like this:

1105 GOTO 2500

There are often times where you want to go to one of a number of lines depending on some user input. For instance, you'll want to jump to different parts of your program depending on what item they select from a menu. In order to support this BASIC added a FORTRAN-inspired command called ON, which was used in this way:

1105 ON X GOTO 2500,2600,2700

What this means is that if the value of X was 1, go to line 2500, if it's 2, go to 2600, etc. Atari BASIC simply removed the original limitation and allowed you to refer to line numbers as constants or variables. For instance, the above ON line could be handled with:

1105 GOTO 2400 + (100 * X)

This is a far more flexible system than it looks at first glance, because it could also be used to give different parts of the programs names. So instead of GOSUB 10050, you could use the much more obvious GOSUB CLEANSCREEN.

This even had another advantage. A numeric constant would always take up 6 bytes of memory, but a variable holding that constant only took up one byte for each character in the variable name. So if you had a routine you were jumping to from many places in the program, you could replace that with a single character variable and save quite a bit of memory. For instance:

110 GOTO 2500
[more lines]
1105 GOTO 2500
[more lines]
1510 GOTO 2500

That took up 18 bytes to store those line numbers: three instances at six bytes each. However, re-coding it using a variable like this:

1 L=2500
110 GOTO L
[more lines]
1105 GOTO L
[more lines]
1510 GOTO L

This took up only ten bytes: four bytes for the four instances of the variable L, and six more for the one instance of the number 2500. In some programs this could add up to a significant savings, and with a total of 32k or so free RAM to work with this was certainly a useful technique.
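The arithmetic generalizes to any number of jumps. Here is a small Python sketch of it, using only the two costs quoted above (six bytes per numeric constant, one byte per use of a single-letter variable). The function names are mine, and the sketch ignores whatever the extra assignment line itself costs beyond its variable and its constant.

```python
CONSTANT_BYTES = 6  # one numeric constant, per the figures above
VARIABLE_BYTES = 1  # one use of a single-letter variable, per the figures above


def bytes_with_constants(jumps):
    """Every GOTO carries its own copy of the line number."""
    return jumps * CONSTANT_BYTES


def bytes_with_variable(jumps):
    """One assignment (variable plus constant), then one variable per GOTO."""
    return (VARIABLE_BYTES + CONSTANT_BYTES) + jumps * VARIABLE_BYTES


for jumps in (1, 2, 3, 10):
    print(jumps, bytes_with_constants(jumps), bytes_with_variable(jumps))
```

Under these assumptions the variable costs more for a single jump (8 bytes against 6) and pays off from the second jump onward.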

Finally Atari BASIC had another useful difference from other BASICs: it gave you error reports on the fly. In MS BASIC you wouldn't find errors until you tried to run the program, but Atari BASIC did it as soon as you hit return on a particular line. The advantage is that any syntax errors were instantly identified, which made it a lot easier to know what was going on. You didn't have to LIST the proper part of the program and figure out what it was doing; you were already in the middle of thinking about it when the error occurred.

These aren't the only examples, but you can get the idea. In general Atari BASIC programs were much cleaner and easier to read, which they gained by having the language do less work.

Atari BASIC, the Bad

So far so good: the new BASIC has simplified syntax and removed barriers. But now it falls flat on its face. You see, Atari BASIC had no strings. Well OK, that's being a little pedantic, it did have strings, but they were more like C's char arrays.

Under MS you could make strings on the fly, change their lengths, add them together, etc. But in the Atari version you had to predefine all of your strings using the array syntax, and the result was fixed-length. It wasn't a complete loss – there wasn't any need to do garbage collection, and you could make them any length you wanted – but it was still a complete pain in the ass.

What made it really annoying was the syntax for using them. The MS system used a set of three functions to slice up strings:

LEFT$(A$,10)
RIGHT$(A$,10)
MID$(A$, 10, 20)

The Atari version was different, and in some respects a lot simpler:

A$(10,20)

That's it, that single "slicing" command could be used to do all of the things you could do with the three MS commands (although that's true of the MID$ function in MS as well). This might be seen as another one of those generalizations that actually helped out the language. However there's a couple of subtle points that need to be considered.

One is that the slicing was based on two absolute positions, a start and an end, instead of a position and a length like the MS commands. The other is that the end position was inclusive: the leftmost ten characters of a string were 1 to 10, so whenever the MS code gave a length you had to work out the end position yourself, and it was easy to be off by one.

I always wondered why the Atari version didn't include "cover" methods like MID$ that simply rewrote themselves into the slicing commands. Then at least the code would port more easily. When I was talking to one of the authors he mentioned:
What we *COULD* have done (and maybe even shoved into the original 8KB cartridge, given a few more weeks) would have been to at least given you the Left/Mid/Right functions.
So there you go.

Those might not sound like big differences, but they were. Consider the last MS example above. To properly convert that into the Atari version you'd have to use A$(10,29). So every time you came across a string in one of the programs you were typing in you had to stop and think about it. It wasn't just that the Atari version was harder to work with, it just didn't feel like the MS version.
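The conversion I kept doing in my head can be written down once. This is a Python sketch, with helper names of my own invention, and it assumes that both dialects count characters from 1 and that the Atari end position is inclusive. It ignores edge cases such as asking for more characters than the string holds.

```python
def mid_to_slice(start, length):
    """MID$(A$, start, length) -> (first, last) for A$(first, last)."""
    return start, start + length - 1


def left_to_slice(length):
    """LEFT$(A$, length) -> (first, last)."""
    return 1, length


def right_to_slice(string_length, length):
    """RIGHT$(A$, length) -> (first, last); string_length is LEN(A$)."""
    return string_length - length + 1, string_length


def atari_slice(text, first, last):
    """Emulate A$(first, last) on a Python string (1-based, inclusive)."""
    return text[first - 1:last]


text = "HELLO WORLD"
print(atari_slice(text, *left_to_slice(5)))               # HELLO
print(atari_slice(text, *right_to_slice(len(text), 5)))   # WORLD
print(atari_slice(text, *mid_to_slice(7, 3)))             # WOR
print(mid_to_slice(10, 20))                               # (10, 29)
```

Note that RIGHT$ is the awkward one: it needs the current length of the string, which is exactly the sort of thing you had to go looking for when converting a listing by hand.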

Worse, Atari BASIC had no arrays of strings: a string was itself dimensioned like an array, so there was no way to make an array of them. So whenever you saw a program that used an array of strings in its MS version, you were basically sunk. Of course you could work around these problems, but it always meant spending a lot of time trying to grok what the original version was trying to do and then coming up with an entirely different way to do it.

For instance, in one adventure game I was converting the game constructed an array of strings, each one containing the description for a room. Then the game could print out a room description by simply referring to the string in the right "room number":

1020 PRINT A$(R)

For the Atari version I had to re-code the entire program using this solution:

1020 GOSUB 10000 + (R * 10)
[...many more lines...]
10010 PRINT "room one": RETURN
10020 PRINT "room two": RETURN

[etc.]

Note the use of the "line number" math I mentioned earlier. While I was quite proud of this solution I'd have rather not done it at all.

Atari BASIC, the Ugly

Atari BASIC, for all its good and bad points in the language itself, has the added distinction of being what is possibly the world's slowest BASIC. Some parts of the language were quite fast, and in other cases the performance scaled better than other BASICs in certain ways. But the long and short of it was this: if you typed in a program out of a book, it would almost certainly run slower on the Atari. Sometimes, a lot slower.

10 ' Ahl's Simple Benchmark
20 FOR N=1 TO 100: A=N
30 FOR I=1 TO 10
40 A=SQR(A): R=R+RND(1)
50 NEXT I
60 FOR I=1 TO 10
70 A=A^2: R=R+RND(1)
80 NEXT I
90 S=S+A: NEXT N
100 PRINT ABS(1010-S/5)
110 PRINT ABS(1000-R)

Back in those days Creative Computing would often run a sidebar showing times for various machines running a tiny little program David Ahl wrote. I would hang my head in shame every time I opened it to that page, because the Atari would be very near the end of the list. It would take about five and a half minutes to complete a run that took other BASICs on the same machine less than half that time, and DR BASIC on CP/M only 16 seconds!

How did it get so slow? Basically because of two problems – a poor implementation of line number lookups in loops and jumps, and a poor implementation of multiply and divide.

The line number issue is a simple one. When any of these interpreters wants to branch to another line they have to look up the line in the program. In theory all of these could be sitting fixed in memory and rewritten in terms of "go back 10 locations", but remember that BASIC is an interpreter and the machine language code only exists for the specific line you're working on. There is no "10 locations back". Instead when the branch occurred the interpreter examined the line number, then started looking through the program code for that line number.

Commodore BASIC optimised GOTO slightly. If the destination line number was greater than the current line number, it would start the search from the current location in the program instead of starting at the top.

All BASICs did this, and for most branches speeding up this search would only improve performance a little bit because the operation doesn't take that long and there's usually not too many of them in overall terms anyway.
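To put rough numbers on the search, here is a Python sketch that treats a program as nothing more than a sorted list of line numbers and counts how many lines get examined. The function names and the 1000-line program are invented for illustration, and a real interpreter walks tokenized lines in memory rather than a list.

```python
def find_line(line_numbers, target, start=0):
    """Scan forward from index `start`; return (index, lines_examined)."""
    for examined, index in enumerate(range(start, len(line_numbers)), start=1):
        if line_numbers[index] == target:
            return index, examined
    raise LookupError("line not found: %d" % target)


def goto_from_top(line_numbers, current_index, target):
    """Always begin the search at the first line of the program."""
    return find_line(line_numbers, target)


def goto_forward_aware(line_numbers, current_index, target):
    """Begin at the current line when the target lies further down."""
    start = current_index if target > line_numbers[current_index] else 0
    return find_line(line_numbers, target, start)


program = list(range(10, 10010, 10))   # 1000 lines: 10, 20, ..., 10000
here = program.index(9000)
print(goto_from_top(program, here, 9500)[1])       # 950 lines examined
print(goto_forward_aware(program, here, 9500)[1])  # 51 lines examined
```

The forward-aware version only helps for jumps further down the program. A jump backwards still starts at the top either way.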

However there is one special case where the performance of the branch becomes important, and that's during a for-next loop. When the loop reaches the NEXT statement it has to branch back to the corresponding FOR. Since loops do this over and over again, a little performance gain here can mean a vast improvement in overall speed.

MS BASIC handled these branches by pushing the address of the "tokenized" line on the stack and leaving it in memory. Atari BASIC didn't do this; it used the same branch code as the GOTO, and as a result it did this linear search for the return location over and over again. This performance hit quickly added up, and for-next loops were strewn throughout the average BASIC program.
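The difference between the two approaches shows up clearly if you count the work. This is a Python sketch under the same simplification as before (a program is just a list of line numbers); the names are hypothetical and a list index stands in for the saved address.

```python
def scan_from_top(line_numbers, target):
    """Linear search from the first line; return how many lines were examined."""
    examined = 0
    for number in line_numbers:
        examined += 1
        if number == target:
            return examined
    raise LookupError("line not found: %d" % target)


def loop_by_line_number(line_numbers, for_line, iterations):
    """Every NEXT looks the FOR line up again, as described for Atari BASIC."""
    total = 0
    for _ in range(iterations):
        total += scan_from_top(line_numbers, for_line)
    return total


def loop_by_saved_address(line_numbers, for_line, iterations):
    """FOR remembers where it is once; every NEXT jumps straight there."""
    saved_index = line_numbers.index(for_line)  # stands in for the pushed address
    examined = 0
    for _ in range(iterations):
        _ = line_numbers[saved_index]           # direct jump, nothing scanned
    return examined


program = list(range(10, 5010, 10))                 # 500 lines
print(loop_by_line_number(program, 4000, 1000))     # 400000 lines examined
print(loop_by_saved_address(program, 4000, 1000))   # 0 lines examined
```

The cost of the searching version grows with how far down the program the loop sits, which is why the same loop got slower as a program got longer.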

The other big problem was in the BCD (Binary Coded Decimal) floating-point math routines. This is the code that handles all of the math in a BASIC program, from adding two numbers, to the more complex trig functions and such. Almost all BASICs had to roll their own version of the floating point code, but again the timing was right and SMI was able to use off-the-shelf routines from Fred Ruckdeschel's BASIC Scientific Subroutines.

This code was meant to be ported to practically any platform and so it made few assumptions about the machines. That is, it wrote almost all of the code for the various functions using other functions inside the library. The exceptions to the rule were the most basic functions of all: it assumed that either the hardware or the implementation team would provide their own versions of add, subtract, multiply and divide. These functions tend to be very hardware specific and couldn't be generalized to any great degree in the book.

Other BASICs approached this same problem in different ways. The original Apple BASIC did everything using integers. That's fast, but a huge price to pay. MS BASIC included syntax for integer math with the A% notation, but this actually called into their existing floating-point code. There's a story there too, I'm sure.

Well of course the 6502 didn't provide multiply or divide, so SMI had to roll their own. And the person they got to do that particular task was new to programming on the 6502. Now when you're writing code that's going to be used throughout the system and has a direct impact on overall performance, you really want to put the A-Team on it. Oh well!
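To give a feel for what "rolling your own" multiply involves, here is a generic schoolbook routine in Python that builds multiplication out of nothing but addition, working one decimal digit at a time. It is emphatically not SMI's routine, which I have never seen reproduced here; it only illustrates why the number of additions, and therefore the running time, depends on how cleverly the loop is organized.

```python
def times_ten(x):
    """Shift one decimal place left using additions only (2x + 8x)."""
    two = x + x
    eight = two + two + two + two
    return eight + two


def multiply(a, b):
    """Multiply non-negative integers digit by digit.

    Returns (product, additions), where `additions` counts only the
    adds into the running product.
    """
    product = 0
    additions = 0
    shifted = a                      # a, then 10a, then 100a, ...
    for ch in reversed(str(b)):      # decimal digits, least significant first
        for _ in range(int(ch)):
            product += shifted
            additions += 1
        shifted = times_ten(shifted)
    return product, additions


print(multiply(123, 45))   # (5535, 9)
```

With decimal digits stored separately, peeling off a digit is free, but each digit can still cost up to nine additions, so a careless inner loop hurts every calculation the language performs.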

So take another look at the benchmark code above and you'll see why the Atari did so poorly. If it's not a for-next loop, it's a math function. Had the benchmark also done a little string processing the Atari would have shot up the list. But as it was it made the entire machine seem dog-slow, hiding the fact that it was roughly twice as fast as anything out there.

Why didn't it get fixed?

If the performance could be improved by two basic tweaks to the system, why didn't it ever happen? Well that's a good question.

The overriding issue was that the BASIC had to fit into an Atari cartridge, which was an 8k ROM. Failing to make this happen with MS BASIC made Atari turn to SMI in the first place. Even SMI apparently had problems with this (6502 code is "big"), and in the end some leftover space in the machine's own ROM was used to store the math routines. More room to work in means more room for tricks that speed up the code, and there's little doubt in my mind that a 16k cart would have allowed SMI to make a world beating BASIC for the machine.

Now this is interesting all on its own. When the machines were first released they came in two models: the 400 and its bigger brother the 800. One of the few distinguishing features of the 800 was that it contained two cartridge slots, so a total of 16k of ROM could be plugged in.

You can probably guess how this turned out. Since the "right cartridge" was available only on the 800, no one used it because they were afraid of being locked out of all of those 400's. There were maybe five carts in total for it. Eventually Atari just left the slot out of newer machines because no one was using it. This makes the whole story all the more sad and frustrating.

The next point of consideration is those floating-point routines stuck in the machine's ROM. The ROM was upgraded several times as new models came out, so it would have been possible to provide a better multiply and divide during any one of these upgrades. This would have the added advantage of making all BASIC programs run faster on the newer models, which would really differentiate them from their older cousins and be one more reason to upgrade. But they didn't.

And then finally there's the code on the cartridge itself. There were any number of runs of the cart made, and later it was built onto the motherboard of newer machines. Not only was the cart never upgraded, it wasn't supposed to be out in the first place. The version of BASIC that went out with the machines (later known as "Version A") was in fact a beta from SMI, but Atari went ahead and burned that onto the ROMs. What's painful about this in particular is that SMI identified and fixed a number of bugs (some serious) that cropped up in that release, but Atari couldn't even be bothered to change to using the bug-free code!

I assume this is because Atari just didn't care about the BASIC. They were a consumer electronics company – no, a games company – and I think they just didn't get it. Sad really.

Other BASICs on the Atari

Now of course if the problems are that easy to fix you'd expect someone to fix them. You'd expect right. There were any number of such BASICs on the machine, and they were universally faster than the original. One such early attempt at speeding up Atari BASIC simply replaced the floating-point routines with integer routines. Apparently it smoked.

The primary problem facing any new BASIC was the same one that SMI and Atari faced: getting it onto an 8k cartridge. Most would ignore this limitation and instead ship on a floppy disk. But as time went on and people started learning more about the platform, eventually you started seeing cartridges with more memory.

Soon after the Atari shipped, the owner of SMI decided to go back to a single-person shop and the company dissolved. One of the employees bought the rights to all of the Atari code and set up Optimized Systems Software, or OSS. OSS's first product was a bug-cleaned version of Atari BASIC called BASIC A+ that shipped on disk. OSS would also ship a vastly improved BASIC XL on a 16k "supercartridge" (which bank-switched in two 8k blocks), and finally BASIC XE with an additional 11k of extra code on disk. BASIC XE is likely the fastest BASIC of any 8-bit machine; in general it was about four times as fast as the original.

By the "late days" of the 8-bit history there were any number of BASICs for the machine. Atari eventually shipped a pure MS version on disk, and later even managed to get it onto a cartridge. Others were shareware like TurboBASIC. Not only did it correct all of the problems and add everything under the sun, it was incredibly fast and even included a compiler for more speed. In addition there were many ROM upgrades on the market and their most popular feature was to fix the math routines.

Of course none of these replacements would ever really catch on. They simply weren't the BASIC on the machine.

Odds and ends

There was one rather famous bug in Atari BASIC called the "two line lockup". If you managed to get this to happen, you needed to reboot the machine. Now this was pretty infamous because in pretty much every other case of a lockup the "System Reset" button would do a warm-boot and you wouldn't lose a thing – not even the program you were typing in.

The cause for the bug, and the solution, was clearly identified in a book by the authors of the original BASIC. It was a fairly simple bug in some library code used for moving memory around on the 6502. Apparently this bug cropped up as a side effect of a mistaken "cut and paste" between two sections of the library, the one that added room for new lines, and the one that compacted memory when removing lines. Under a certain condition the code would skip over 256 bytes during compacting and ruin the program.

Three years later Atari was releasing a new machine called the 1200XL and decided to fix this problem while they were at it. So they found the bug in the code and removed it. But at that point it appears the programmer in question looked at it and said "hey, this code looks just like the add a line to the program code, so I'd better fix that too" and thus reintroduced it in another location!

Now ask yourself this: which do you do more often, add lines to a document, or remove them? This was just one more problem in the disaster that was the 1200XL, yet another thing to be fixed on the follow-on machines.

Notes and Attributions

Many of the "inside details" were provided by Bill Wilkinson, who was one of the original authors of Atari BASIC (he also wrote the original specification). Additional details on the history were taken from his book, The Atari BASIC Source Book.

Corrections to various technical errors, and details of the other BASICs, were provided by John McKenna.