RPN Calculator in SETL

For the implementation described below, I have chosen to restrict myself to using the original language as defined in the 1970s (see e.g. here). I thus had to write a somewhat larger program than I would have done using a more recent dialect of the language. Instead, I could have, e.g., made use of the split built-in procedure and of regexes as provided in this implementation of the language, for tokenization and for number parsing respectively, arriving at a program no more than half the size of the present one. On the other hand, having to write more myself gives me the possibility to show a bit more of the programming constructs specific to SETL.

The program consists of a loop and two refinements: blocks that, for convenience, are placed separate from the main body and referred to by names. The loop reads an entire line into s, calls the tokenize refinement to break it down into tokens and recognize the numbers among them, and if there are indeed tokens – the line was non-empty, non-blank – it is processed within an inner loop.

Note that the value of s is initially the string read by get but tokenize replaces it by a tuple of tokens. The inner loop searches that tuple for consecutive triples of two numbers and a string, and for each triple, if the string is an arithmetic operator, that triple is replaced by the result of applying the respective operator on the two numbers. The case expression chooses and applies the needed operation but produces an empty string when the string o is not such an operation.

This while loop is controlled by a quantified test of existential kind over the set of s's indices (not counting 1 and 2). That the set {3..#s} is chosen to iterate on, and not the tuple [3..#s] of the same elements is because we want to leave unspecified the (insignificant) order of picking up the operations to perform. Nevertheless, all possible computations do take place, and the initial tuple reduces to a single number if and only if a correct RPN expression was input. Correspondingly, the print built-in procedure outputs either the resulting number or an error message.

The tokenize refinement first replaces all tab characters in the string s by spaces, then forms a tuple of all its non-blank tokens which it stores back into s. Finally, each token that parses as a number is replaced by that number. Notable is the typical of SETL declarational style in which tokens are defined. The tuple former ([...]) in tokenize says: ‘extract all substrings s(i..j), i taking all meaningful values within s and j being no lesser than i, such that s(i..j) is surrounded by spaces but itself consists of non-spaces.’ (Putting in advance a pair of spaces arround s serves to ensure that indeed any token is preceded and followed by a space.)

The replace_if_number refinement is responsible for finding out whether an individual token represents a number, obtaining that number, and putting it in place of the token. It makes use of the val operator for extracting a numeric value from a string. If val's argument does not parse as a number, val produces om, SETL's special constant for unknown value. However, SETL's definition of a number denotation excludes leading arithmetic signs, as well as trailing decimal dots. Therefore, some preprocessing is needed to take care of these characters. An ending dot is simply dropped, provided there are no other dots in the string: the removal does not alter the value. A possible arithmetic sign is removed, but is also taken account of.

Strings and tuples are treated mostly the same way in SETL. A string, in this case tk, is acted upon by the same operators that SETL uses for tuples: fromb (for ‘from-beginning’) and frome (for ‘from-end’). Both operators simultaneously remove an item from a tuple and store it into their left argument. (The latter is not needed in our usage of the operators, although we do have to provide a variable for it.)

The program makes use of some syntactical variations allowed in SETL. The (doing get(s); while not eof) is short for loop doing get(s); while not eof do, and similarly for (while exists ...) and the two loops in tokenize. In end-delimited constructs the keyword end can be followed by a specific keyword, as in end if and end loop, but here we only make use of this for ending the entire program's body.

program rpnc;
(doing get(s); while not eof)
  if #s=0 then continue; end;
  (while exists i in {3..#s} |
         is_real s(i-2) and is_real s(i-1) and is_string s(i))
    [x,y,o] := s(i-2..i);
    s(i-2..i) := [case o of ('+'): x+y, ('-'): x-y,
                            ('*'): x*y, ('/'): x/y else '' end];
  print(if #s=1 and is_real s(1) then s(1) else 'error' end);

  (for i in [1..#s])
    if s(i) = char 9 then s(i) := ' '; end;
  s := ' '+s+' ';
  s := [s(i..j): i in [2..#s-1], j in [i..#s-1]
      | s(i-1)=' ' and s(j+1)=' ' and forall k in [i..j] | s(k)/=' '];
  (for i in [1..#s]) replace_if_number; end;

  tk := s(i);  sgn := 1.0;
  if tk(1) in {'+','-'} then
    if tk(1)='-' then sgn := -1.0; end;
    c fromb tk;
  if #tk>0 and tk(#tk)='.' and '.' notin tk(1..#tk-1) then
    c frome tk;
  x := val tk;
  if x/=om then s(i) := sgn*x; end;
end program;