PLACE |

spec |

solutions |

BCPL |

Here is a rather lengthy implementation of the calculator. BCPL offers no convenience for string processing, and supports no floating-point or other non-integer arithmetic. Parsing the input, performing numeric calculations, and output formatting of numbers are programmed from ground up, using integers and vectors of ones. Numbers are represented as common fractions, and the respective arithmetic is implemented rather simple-mindedly; certain inputs will lead to erroneous results due to loss of significance or integer overflow, which a less naive approach would be able to avoid.

The language dictates that the main procedure is named `start`

. It was my choice to nest all other procedures within it.

Two of them do most of the work. `readln()`

reads a line of text from the standard input and breaks it down into tokens which it stores in the vector `tks`

, while `evalrpn()`

evaluates the RPN expression by traversing the token sequence backwards, starting from its end. All kinds of syntax errors are catched within `readln()`

, so that `evalrpn()`

only has to do the computations proper.

To read and parse the input, `readln()`

makes use of a number of locally declared procedures. At the lowest level, `getch`

is called to read characters one at a time. The actual reading is done by the library function `sardch()`

, and tab characters are transparentized by changing them to spaces. (`*`

is the ‘escape’ using which special characters are typed, so that e.g. `*t`

and `*n`

designate tab and new line characters.)

It ought to be noted that`sardch()`

is a recent adition to the language. In the classical BCPL programs,`sardch()`

's job used to be done by`rdch()`

, but the latter function currently returns immediately rather than waiting until a character is actually entered.

The function `gettk()`

makes sure that a token that it assembles from individual characters is either one of the four arithmetic operators or a number. It returns a Boolean value (Boolean constants being only special syntax for two distinguished integer values), indicating whether a token is a correct one. If it is, that token is stored, by calling `addtk()`

, in the token sequence `tks`

.

`gettk()`

delegates the recognition, representation and storing of numeric values to `getnum()`

, which also discriminates between success and failure by returning a Boolean. `getnum()`

computes a number by its digits and takes into account the position of the decimal dot if one is present. A successfully parsed number is represented as a pair of numerator and denominator. A fraction is always stored in an irreducible form which is being obtained by calling the `nrz()`

procedure. The said procedure exemplifies the use of parameters whose values are pointers, so that the values referred to can be set in the procedure. The technique is similar to that used in the BCPL's descendant C, but in BCPL, `@`

gets a pointer, and monadic `!`

dereferences one.

The counter `tcnt`

is initialized in `readln()`

and maintained within `gettk()`

in order to track the syntactic correctness of the token sequence as an RPN expression. Upon completion of reading a line, `readln()`

does a final check (`tcnt=1`

) to ensure that the entire expression is correctly formed.

The `tks`

vector gets an initial size (whose value is unimportant) as soon as the program starts, and is being extended whenever the number of tokens exceeds the current size. In fact, a new vector is created, where the contents of the exhausted one are copied, the old vector is removed, and the value of `tks`

is pointed at the start of the new vector. The 0-th element of the token vector is used throughout the program for storing the actual volume in `tks`

occupied by tokens. Yet another element – the one at offset -1 – automatically (by language definition) stores the size of the vector as allocated. This helps keeping the entire information regarding the use of `tks`

in a single data object, `tks`

itself.

Each token occupies two consequtive elements of `tks`

. For a numeric token these are the numerator and the denominator, while for an operator one of the elements stores a code of 1, 2, 3, or 4, and the second element is 0. That makes possible to tell the kind of a token at any position without need for additional information.

`readln()`

returns to the main program one of the values defined as manifest constants of the program, signifying end-of-input (`END`

), normal return with a correct expression that is to be evaluated (`RES`

), empty input line (`EMP`

), and a syntax error (`ERR`

). Each case is treated accordingly, in particular, `evalrpn()`

is called as needed.

Each call of `evalrpn()`

decreases by 2 (one token) the value stored at the 0-th subscript: this way the function keeps track of the location of the current sub-expression to be evaluated – it always ends where the entire expression ends according to `tks`

's 0-th item. Also, each call of `evalrpn()`

returns a pointer to within `tks`

where the result of the (sub)computation that that call performs is stored. When the sub-expression that is being evaluated is a single number, the pointer to that number is returned. When it is (i.e., it ends at) an operator, that operator is eventually replaced by the result of applying it to its operands, and a pointer to it is returned. All such results are, again by calling `nrz`

, stored as irreducible fractions.

After the main program obtains the result from `evalrpn()`

, it does some computations in order to print it as a decimal fraction with at most four digits after the decimal dot: first it prints the integer part, then scales and rounds up the fractional part. If the fractional part appears to end in some number of 0s, those 0s are removed. This may lead to removing the whole fractional part along with the decimal dot.

`CASE`

statements in BCPL are similar to C's (or vice versa) in that they normally have to be exited explicitly, because otherwise the execution passes to the next clause, which is usually not wanted. One way to exit a `CASE`

is through `ENDCASE`

, as in `evalrpn()`

. In one place of the main program, namely at `CASE RES:`

, we do allow control to reach the body of the next clause, because we indeed have to execute `LOOP`

, i.e., jump to a next iteration of `REPEAT`

.

Most local variables in the program are dynamic, stack-based. Such variables, similarly to procedures, are declared with the `LET`

keyword and must be given an initial value. Whenever the initial value is immaterial, the indeterminate (`?`

) pseudo-value comes in handy. Several variables local to `start()`

and `readln()`

are declared `STATIC`

. This is so that the procedures nested at the same or deeper level as those variables are able to access them (BCPL prohibits referring to dynamic local variables of outer procedures).

Finally, several remarks on BCPL's syntax are in order.

As this program demonstrates, curly braces `{`

and `}`

are used as block delimiters. However, the braces are a recent adoption. The more traditional pair, which is still accepted, is `$(`

and `$)`

. These character combinations were chosen back in the 1960s when many computer keyboards had neither braces nor square brackets. At that time, there was also a separate pair of characters for printed representation of BCPL programs: `§`

(section character) and the same character vertically overstroke. The `§`

is a part of the syntax heritage of BCPL from CPL.

That the vector and string subscript operators are single characters (dyadic `!`

and `%`

) rather than a pair puts BCPL in the very small group of languages having this property. Most other languages introduce unnecessary syntactic idiosyncrasy by forcefully parenthesizing subscript expressions.

The construct `…->…,…`

is the conditional expression known to C, C++ etc. programmers as `…?…:…`

.

One may observe that the bodies of `IF`

, `TEST`

, `FOR`

, and other statements are preceded by a `DO`

keyword, but not always. The rule is that `DO`

may be omitted if the immediately following token is a `{`

or a keyword. I tend to find this convenient and like to make use of it.

GET "libhdr" MANIFEST {RES = 1; EMP = 0; END = -1; ERR = -2} LET start() BE { STATIC {digs; ops; tks} LET nrz(x,y,a,b) BE TEST x=0 DO !a, !b := 0, 1 OR { LET s, t = ((x>0)=(y>0) -> 1, -1), ? x, y := ABS x, ABS y !a, !b := x, y IF x<y DO t, x, y := x, y, t UNTIL y=0 DO t, x, y := x MOD y, y, t !a, !b := s*!a/x, !b/x } // nrz LET readln() = VALOF { STATIC {tcnt; ch} LET getch() = VALOF { LET c = sardch() RESULTIS c='*t' -> ' ', c } // getch LET index(c,cs) = VALOF { FOR i=1 TO cs%0 IF c=cs%i RESULTIS i RESULTIS 0 } // index LET gettk() = VALOF { LET sgn, o = 1, ? LET addtk(v,va) BE { LET sz, m = tks!-1, tks!0 m := m+2 IF sz<=m { LET p = getvec(2*sz) FOR i=1 TO sz DO p!i := tks!i freevec(tks) tks := p } tks!0, tks!(m-1), tks!m := m, v, va } // addtk LET getnum() = VALOF { LET r, d, p, i = 0, FALSE, 0, ? { i := index(ch,digs) IF i=0 BREAK TEST i<=10 { r := 10*r+i-1 d := TRUE IF p>0 DO p := 10*p } OR { IF p>0 RESULTIS FALSE p := 1 } ch := getch() } REPEAT UNLESS d & (ch=' ' | ch='*n') RESULTIS FALSE IF r=0 | p=0 DO p := 1 {LET a, b = ?, ?; nrz(r,p,@a,@b); addtk(a,b)} RESULTIS TRUE } // getnum // begin gettk o := index(ch,ops) IF o>0 { ch := getch() IF o>2 | index(ch,digs)=0 { tcnt := tcnt-1 TEST tcnt>0 & (ch=' ' | ch='*n') {addtk(o,0); RESULTIS TRUE} OR RESULTIS FALSE } IF o=2 DO sgn := -1 } UNLESS getnum() RESULTIS FALSE IF sgn=-1 DO tks!(!tks-1) := -tks!(!tks-1) tcnt := tcnt+1 RESULTIS TRUE } // gettk // begin readln tcnt, ch := 0, ' ' { WHILE ch=' ' DO ch := getch() SWITCHON ch INTO { CASE endstreamch: RESULTIS !tks=0 -> END, ERR CASE '*n': RESULTIS !tks=0 -> EMP, tcnt=1 -> RES, ERR DEFAULT: UNLESS gettk() { UNTIL ch='*n' DO ch := getch() RESULTIS ERR } } } REPEAT } // readln LET evalrpn() = VALOF { LET n = tks+!tks !tks := n-tks-2 IF !n=0 { LET q = evalrpn(); LET c, d = !(q-1), !q LET p = evalrpn(); LET a, b = !(p-1), !p SWITCHON !(n-1) INTO { CASE 1: nrz(a*d+b*c,b*d,@p,@q) ENDCASE CASE 2: nrz(a*d-b*c,b*d,@p,@q) ENDCASE CASE 3: nrz(a*c,b*d,@p,@q) ENDCASE CASE 4: nrz(a*d,b*c,@p,@q) } !(n-1), !n := p, q } RESULTIS n } // evalrpn // start of the main program digs, ops := "0123456789.", "+-**/" tks := getvec(101) { !tks := 0 SWITCHON readln() INTO { CASE END: BREAK CASE RES: {LET r = evalrpn(); LET n, d = !(r-1), !r writen(n/d) n := muldiv(n MOD d,100000,d) d, n := n MOD 10, n/10 IF d>4 DO n := n+1 MOD 10000 FOR i=1 TO 3 IF n MOD 10 = 0 DO n := n/10 IF n>0 {wrch('.'); writen(n)} newline()} CASE EMP: LOOP CASE ERR: writes("error*n") } } REPEAT }

boykobbatgmaildotcom