PLACE
languages
RPN in BCPL

BCPL

BCPL was designed and implemented for the first time in 1966-7. It was conceived as a simplified version of the much more ambitious CPL (BCPL stands for ‘Basic CPL’) and as a language in which the latter was to be implemented.

A more general goal set before BCPL was to serve as a practical, efficient, compact, and easily portable tool for system programming – and that was achieved. For example, as reported by its author M. Richards, ‘[…] reimplemented in BCPL, the first compiler was a mere 1000 lines long’. Compilers were known to run within 20kB of RAM or even less.

Language Features

The most notable characteristic of BCPL is that it is typeless, or, perhaps more properly defined, it has a contextual typing. That means that the only kind of primitive data value in it is the contents of an abstract memory cell – a machine word, which has no type by itself, but, depending on the operators used to process it, can be interpreted as a bit pattern, an integer, a character, a Boolean, or a pointer. To this end, the language provides the programmer with operators for doing integer arithmetic and comparisons, bit sequence manipulation, Boolean arithmetic, and pointer manipulation, such as obtaining a reference and dereferencing. No checking is performed to determine whether an actual value can be subjected to a particular operation.

In any particular implementation, all cells are the same size in bits, which is available to the programmer as a system-provided constant.

No discrimination is made with respect to signedness or size of integers – they are all of one kind. Some implementations used to extend the language by supporting floating-point arithmetic, denoting the corresponding operators by #+, #-, etc.

The only data structure explicitly supported in BCPL is the vector. A vector is a sequence of cells, and the order of the cells in it induces an order of the pointers to them: when interpreted as integers, pointers at adjacent cells differ by 1.

There is no notion of higher-order vectors or other data structures in the language, but a variety of forms of data organization can be modelled by storing pointers into vectors. Viewed this way, a vector is an arbitrarily large and complex structure that can be linked to other structures as well as internally. The current version of BCPL makes use of this by implementing ‘objects’ (as in object-oriented programming) through a simple convention for organizing the vector space, standardized use of names, and several procedures in the standard library.

Handling text strings is not one of the strengths of BCPL, but is moderately easy to attain. Besides storing string data a character a cell, characters can be placed each in a single byte. Because indexing of characters must be distinguished from that of cells within the same buffer, there are two different operators that implement the two ways of applying a subscript to a pointer. Thus, any cell or vector may be treated as a character buffer as well as a buffer for cell(s).

Text data can also be specified as string constants. The value of a string constant is a pointer to a statically allocated buffer storing the respective text. Despite the term ‘constant’, the pointer can actually be used not only to access the text but also to mutate it.

BCPL's block and control structure closely follows the one of CPL. In particular, BCPL is well equipped – better than most languages – with conditional and repetitive control constructs. For example, loop forms are provided governed by pre- and post-conditions, where those conditions can be stated both positively and negatively. A rare but valuable feature is that conditional execution and choice are represented by two separate constructs rather than, like in most other languages, having a single conditional statement with an optional second clause.

A block in BCPL is a generalization of either a command or an expression. In the latter case the block must be prefixed by VALOF and pass a value to its caller by executing a RESULTIS statement at some point. As the body of a procedure is normally a block, the same convention discriminates functions from procedures that do not return values.

Any block can contain local definitions of static or dynamic simple values or vectors, as well as procedures. Thus procedures can be nested, and simultaneously defined so that they can call each other regardless of the order in which they appear in the program.

Blocks can be labeled so that a closing mark with an appropriate label attached to it could possibly close several nested blocks at once.

Procedure names, when treated as values by themselves, are pointers to the respective code fragments. As any other value, they can be stored in individual cells or vectors, passed to other procedures or returned as function values. This way, in a procedure call, the procedure may be designated not only by name but by any expression, provided that that expression produces a meaningful pointer. Procedures cannot be created dynamically, though. One may notice that C's treatment of procedures is essentially identical to that of BCPL.

Interprocedural jumps are possible through calls of library pseudo-functions level() and longjmp(); this has been later carried to C (where setjmp() is used in place of level()).

An interesting concept in BCPL is the so called global vector. It is the only global data object in the language, and its contents are directly accessible from any part of the program. Its size, although not explicitly specified, is considered ‘sufficiently large’ so that the programmer can place in the global vector as many pieces of data as he needs to.

In order to access global data, one needs to know where in the global vector it is stored. Associations of names with locations in the global vector can be stored in ‘header files’. Inclusion of such files at compilation time ensures that the global data is shared across different compilation units. Storing pointers to external procedures as global data, as is e.g. the case with the standard library procedures, is a programmer controllable, dynamic linking mechanism. It makes it possible to load, replace, and extend procedures at run time.

BCPL's library of standard procedures is very small: it contains very little beyond basic input and output functionality and dynamic storage allocation and reclamation.

Around 1977, a simple but effective coroutine mechanism was added to BCPL, making the language even better suited for system programming.

Implementation, Portability, Usage and Influence

The portability of implementation was a serious concern from the very birth of the language. The solution developed by M. Richards was to define a virtual machine, named OCODE (‘O’ for object, as in ‘object file’), to which the program source could be translated instead of directly compiling it to machine codes. The OCODE representation of the program could then be compiled to a specific machine language or interpreted. This helped clearly separate the machine-independent from the machine-specific part of the BCPL compiler and thus immensely reduced the work needed to port the compiler to new machines. As the OCODE form of the compiler itself could be interpreted, OCODE could also be used for bootstrapping the compiler.

The greatly increased availability of BCPL compilers facilitated, in turn, porting of other system and application software.

This was the invention of the virtual machine concept and approach to implementing software. Virtual machines are now widely used in programming language implementation, and many of them have been very successful. The continuing work on the BCPL compiler lead to creation of other virtual machines as well.

An interesting aspect of BCPL is its intimate relation to TRIPOS, a portable operating system for small computers, developed in the second half of the 1970s also by BCPL's creator. (The word ‘TRIPOS’ stands for ‘TRIvial Portable Operating System’ but also relates to the Cambridge University where both BCPL and TRIPOS were born.) TRIPOS was both written in BCPL and served as an environment for developing software in BCPL. To an extent, the evolution of the language was motivated by the urge for a compact and efficient implementation of the operating system; the adoption of coroutines is an example of this.

The current, and perhaps the only available, as of 2010, implementation of BCPL, runs within an interpretive version of TRIPOS called Cintpos: BCPL programs are compiled to a virtual machine code and interpreted in it. The system is available for the most popular current paltforms (see the link below).

Based on BCPL, M. Richards created another language, called MCPL and borrowing features from C, Standard ML (pattern matching), and Prolog. Like BCPL, MCPL is available in an interpretive form from its author's home page.

K. Thompson and D. Ritchie, in search for a language suitable for rewriting the Unix operating system (initially written in an assembly language) arrived at designing B, which was basically BCPL with a new syntax. Further development by D. Ritchie eventually lead to C. Thus, although BCPL itself is not widely used now, C and the many languages that sprang from it have much in common with it. Specifically, C's attitude of giving the programmer an unrestricted power and not constraining him by the language clearly is one shared with, and inspired by, BCPL.

Links of Relevance:

A brief(ish) description of BCPL

Home page of M. Richards, the author of BCPL: interpreters and documentation for BCPL and MCPL can be obtained from here.

The BCPL User Guide

The original BCPL Reference Manual

The Xerox PARC BCPL Reference Manual

Comments from D. Ritchie, the author of C, and a recreation of the original BCPL Reference Manual

boykobbatgmaildotcom