RPN Calculator in PostScript

At first sight, implementing an RPN calculator in PostScript should be very straightforward. Expressions in PostScript are themselves RPN, and strings can be evaluated as programs. What can be simpler than reading lines and directly evaluating each one? In reality, however, we have to parse the input and ensure that it is syntactically correct. Besides, +, -, *, and / are expressed as add, sub, mul, and div in PostScript, so the corresponding substitutions must be made in the text before evaluating it. Unfortunately, text manipulation is only barely supported in PostScript; in particular, there is no direct means to split a string into non-blank tokens. Moreover, it is impossible to read entire lines of unbound length. All this means that, despite the apparent structural similarity between the RPN expressions and the implementation language, our original expectation for a very short solution of the problem cannot be satisfied.

The rpn procedure reads characters one at a time and places them on the data stack until an end-of-line character (10) is found. Spaces (32) and tabs (9) are dropped, while series of non-blank characters are converted into tokens by calling mktoken. (As there is no character datatype in the language, all characters are represented as integers – their ASCII codes.)

The portion of the data stack that rpn controls consists of three parts. At the bottom is the series of tokens read so far from a line of input. Next follows a (currently unfinished) series of characters that will eventually comprise a new token. The token series and the character series are each preceded by a mark value ([), which makes it convenient to collect the latter kind of series into a string and the former into an array. At the very top are copies of the two most recently read characters. They are used to tell where a non-blank series starts or ends. On starting such a sequence, a mark value is put in front of it. When the series ends, it gets converted into a token and thus extends the sequence ot tokens.

When rpn sees an end-of-line character, it calls evalrpn to evaluate the RPN expression accumulated on the data stack as a sequence of tokens. The first thing evalrpn does is create an array of the tokens it finds. If that array is zero-length, the input line was apparently all-blank, and evalrpn returns an empty string: (). Otherwise, through the cvx operator, the array is converted into an executable object – a procedure – and thus executed. The execution takes place in the special context of a stopped operator, so that if there turns to be an error in it (which will happen when an arithmetic operation runs out of arguments), stopped will produce true on the top of the data stack. Also, we explicitly, by using stop, force it to do so when the computation leaves more than one value on the stack, meaning that there were insufficiently many operations in the RPN expression. Finally, evalrpn checks the outcome of the stopped operator, and either ‘returns’ (i.e., leaves on the data stack) the result of the computation or re-signals an error using another stop. In both cases, some clean-up is done on the stack.

evalrpn itself is executed within a stopped operator, so that rpn can check its outcome within if and, when an RPN expression computation ends in error, generate the string (error) to be output in place of a numeric result.

rpn then checks what is left at the top of the data stack. If it is non-integer and not an empty string, it must be either a real-type number that resulted from a computation, or an error message (a string), so it gets printed, using =. (flushing the buffer of the standard output is necessary, as otherwise the actual printing will be put off instead of taking place right at the moment.) Seeing an integer, which is really the value 32, means that, at that point, the expression is still incomplete (the currently read character was a space or a tab, both coded as 32 here), so nothing is to be output and reading the current line continues.

The mktoken procedure, used to form the tokens of which the expression is made, first calls mkstring to create a string of the characters just read and kept on the data stack. It then tries, using cvr, to parse that string as a number. If that fails, an attempt is made to find a key equal to the given string in the dictionary op. A success would mean that the string was an arithmetic operator, and the PostScript operator corresponding to it is obtained from op and produced as a result of mktoken.

Checking for the string reading as a number and accessing the op dictionary are each done within a stopped. If both fail, mktoken explicitly, using a stop, signals an error, but before that it reads the current input line to the end and clears the data stack of possible tokens, so that the calling procedure, rpn, could start reading a new line in a clean state. Note that the call to mktoken is within the same procedure, called by stopped, where the call to evalrpn is. That way, all sources of syntax errors, be they incorrect tokens or incorrect sequences of tokens, are handled uniformly.

/f (%stdin) (r) file def

/rpn {
  [ 32
  { f read not {cleartomark exit} if
    dup 9 eq {pop 32} if
    dup 32 eq 1 index 10 eq or {
      { exch 32 ne {mktoken} if
        10 eq {evalrpn} {32} ifelse
      } stopped {(error)} if
      dup type /integertype ne {
        dup () ne {= flush} {pop} ifelse
        [ 32
      } if
    } {
      exch 32 eq {[ exch} if
    } ifelse
  } loop
} def

/evalrpn {
  ] dup length 0 eq
  {pop ()} {
    [ exch {cvx exec counttomark 1 ne {stop} if} stopped
    {cleartomark stop} {exch pop} ifelse
  } ifelse
} def

/mktoken {
  counttomark 1 add 1 roll mkstring
  {cvr} stopped {
    {op exch get} stopped {
      pop pop {10 eq {exit} if f read pop} loop
    } if
  } if
} def

/mkstring {
  dup string
  exch 1 sub -1 0 {1 index exch 4 -1 roll put} for
  exch pop
} def

/op << (+) /add cvx (-) /sub cvx (*) /mul cvx (/) /div cvx >> def