Martin Richards's BCPL Reference Manual, 1967

I dug up a copy of an old BCPL manual. Designated Memorandum M-352 of MIT Project MAC, it is dated July 21, 1967. Thanks to the kind permission of its author, it is available for study.

Contents

Presented here is a (186KB) Postscript or PDF (smaller at 41K) version of the edited result of OCR of the document. A gzip-compressed version of the Postscript at 31K is available as well. The OCR was done by the Textbridge package; editing and Postscript output was done by me with Word 7.0. For the scholar, there is also a big (2.7MB) TIFF representation of the scanned image, so you can check the transcription.

The paper is a xerographic copy of a typewritten memo. In (copied) handwriting at the top of the front page are the initials PGN, for Peter G. Neumann, and at the lower right is the notation "Copies to [Ken] Thompson, [Doug] McIlroy, [Bob] Morris," evidently Neumann's distribution list.

It was a bit resistant to OCR because some of the characters were evidently inserted by hand, particularly a section symbol (those funny stacked S's that might print as § in your browser) used in its "publication" syntax to indicate the start of a block. The corresponding symbol that closes a block is a section symbol with a vertical overstriking line. In the Postscript redaction, I represent it by an understruck section symbol. Likewise, certain operators like not-equivalent are not present in the available font; they're italicized in square brackets. And I used those same brackets to describe a couple of diagrams whose rendition is omitted.

The document was also marked up somewhat in colored pencil. Some of the notations are apparently notes, probably by Rudd Canaday. Some seem to be "check off" marks on bits of the syntax, as if he (or someone) was comparing code to the language definition.

The Postscript version adheres fairly closely to the layout of the original except that I did not try too hard to fiddle the indentation to correct the mysterious beliefs and conventions of Textbridge and Word. Word division in running text is not preserved, and line-spacing is only approximated. Page divisions are preserved.

Besides errors that I introduced in the editing, there are also a few errors in the original, some doubtless introduced by the original typist. For example, on p. 4, a comment introduced by // spills across two lines. I generally resisted amending the original, except that in a few places an obviously-meant "0" has been substituted for a lower-case "o".

Context

BCPL has had a productive life of its own, but my interest in it is more in the basis it provided for the development of the B language and then in the history of C. This influence is traced, with gratitude, in the latter paper. As it tells, the BCPL language definition when B was developed was just the manual recorded here, together with the compiler that Richards contemporaneously wrote for the CTSS system.

BCPL changed somewhat from that time; its definition became clearer and the language more useful. But in observing the divergences, it's useful to record the precise point on the stem from which the B and then the C branch sprouted. Martin Richards's home page has links for obtaining a current version of a compiler and examples of the language.

The modern version of BCPL is described in the book by Richards and Colin Whitby-Strevens, "BCPL--the language and its compiler," ISBN 0521286816. The big on-line booksellers differ in their views of its availability. (Amazon says out of print, Borders says special order, B&N says available in 24 hrs, Blackwell's says available to order.)

This early manual describes (or perhaps one might say, "alludes to") the lexical representation of real programs in three ways. There is a canonical representation using uppercase words like NUMBER, NAME, COND, SECTBRA for terminal symbols of the grammar. The syntax as given, however, uses a sort of "publication style," rich with symbols and underlined keywords; it recalls Algol 60, whose publication language was intended for journals, while real programs were punched on cards or paper tape, or typed on the available terminals, with a less-rich symbol set. This third style, the one actually used in the compiler on CTSS, is only hinted at and only briefly illustrated in the examples in the document.

These lexical details changed at various times both in Richards's own usage and in our rendition of his compiler at Bell Labs. Richards's first BCPL compiler was written using the 6-bit BCD character set on CTSS, and adapted shortly thereafter to use the characters available on the IBM "golf-ball" 1050 and 2741 terminals. At about the same time, CTSS was beginning to adapt to the ASCII character set for Multics development.

Some of the lexical conventions actually used in early BCPL were directly adopted into B; some more recent ones may owe to back-influence from C. For example, the SECTBRA and SECTKET symbols, written as the section-sign and overstruck section-sign in the manual's syntax, were already represented as $( and $) even in the compiler that Richards wrote on CTSS. In some, but not all, of the example programs that Richards provides today, the canonical symbols are written {} as in B and C.

Similarly, the operators with formal names LOGAND and LOGOR were represented in this manual as the propositional calculus characters resembling /\ and \/, but were actually typed (in the compiler of 11 Sept. 1967) usually as the keywords logand and logor. However, in some parts & was already used for LOGAND. In B they were definitively & and |. They are now visible too in modern BCPL, and almost certainly were adopted very soon into barely-past-1967 BCPL along with other obvious representations like > < for the gt lt used and accepted by the compiler in 1967. But the particular notebook I have, and the manual and compiler listing I have, date between July and November 1967 and do not record them yet.

Important semantic issues remained unresolved in the 1967 manual, for example the actual meaning of the LOGAND and LOGOR operators. The compiler in 1967 distinguished between & | used in `truth-value' contexts and ordinary value contexts; in the first context they were handled as sequential tests, in the second as bit operations. The manual doesn't talk about this. The difficulty of explaining the situation led, several years later, to the separation of the C && || operators from & |.

Some of the other lexical or similarly low level changes that happened over the years include