RPN Calculator in Prolog

Here is an implementation of the RPN calculator in a mostly declarative style. The program is a set of predicate definitions. The special clause initialization provides an ‘entry point’ for the program, i.e., it specifies an instance of a predicate, rpn, as the goal of the program. The meaning of the program is satisfying that goal. The latter is accomplished by satisfying a number of other predicate instances, most of which require satisfaction of even simpler ones etc.

One can notice that only a very small number of built-in predicates is used. Of all making up the program, the four at the end – length, reverse, append, and member – can be regarded as ‘nearly standard’: although not in the ISO definition of the language, they are so commonly needed that are actually available in virtually any existing Prolog system.

rpn makes use of gets for obtaining a text line S from the standard input. gets produces a list of characters (a character in Prolog being an atom of length one), or the built-in constant end_of_file when the input is ended. An empty list is produced when the input line is empty or only contains blanks.

When S is non-empty, it is sent to tks for producing a list Ts of translated tokens – numbers and operators. Ts gets passed to expr whose job is computing the RPN expression. When both tks and expr succeed, the result Res is printed out. A lexical error, i.e., a part of S that does not read as a number or operator, makes tks fail. Similarly, expr fails if Ts turns out not to be a syntactically correct token sequence. If any of these happens, the atom error gets printed out instead of a numeric result. Distinguishing between success and failure is done through applying the conditional predicate …->…;… (of three arguments). nl is a built-in no-argument predicate printing a new-line character.

Although we speak of rpn, gets, and expr as just three predicates, actually there is more than one definition under each of those names. Recalling that two or more predicates sharing a name differ by their arity (number of arguments), there are rpn/0 and rpn/1; gets/1, gets/2, and gets/3; and expr/2 and expr/3. Similarly for getc and reverse.

It is a customary, though not universally approved practice, to write more general versions of a predicate under the same name by adding arguments. A more general predicate normally is subsidiary to one of lesser arity. For example, the meaning of gets(R) is ‘reading an input line succeeds by producing a result R’; gets(L,R) means ‘having already read L, reading the current line succeeds with a result R’; and gets(L,R,C)'s meaning is ‘if L is successfully read and C is the currently read character, then reading succeeds producing a result R’. The chief and subsidiary predicates are often mutually recursive.

For reading individual characters, Prolog has the get_char built-in predicate. gets, instead, makes use of getc which changes tab characters to spaces and leaves all others intact.

The list produced by gets is an amended version of the input line. It is a series of sequences of non-blanks, each sequence followed by exactly one space. The characters in the list are in the reversed order with respect to the actual input. Both properties help tks form the eventual token sequence more conveniently. The reversed order of the tokens in that sequence is also the natural one for expr to process – the latter does so by means of downward recursion on the RPN expression's structure.

Within tks, each character sublist to become a token is extracted using the append predicate. The latter is satisfied if and only if the concatenation of its first two arguments equals the third. The process of unification is useful for the possibility to bind variables to the lists or their parts. In this particular use of append this results in splitting the list S into a ‘head’ T and remainder R. The sub-list T gets passed to gettk to determine whether it forms a valid token and if so, Tk is that token. The remainder R is processed by recursively invoking tks on it.

For an arithmetic operator, gettk produces a single-character atom – the same character that it receives as its first argument. Recognizing an operator takes place through checking whether the list consists of a single character [T] and then whether T is any of +, -, *, or /.

Recognizing numbers requires more work. First, the original order of the characters in the respective list is restored by obtaining a reversed copy Tr of the list T. Then a possible arithmetic sign is stripped from its front; Sgn is set to 1 or -1, respectively. The remaining list Ta gets processed by use of numcan, and if it reads as a valid number, a possibly modified version Tc of Ta is produced. Finally, from Tc number_chars obtains the respective number.

numcan does its job by finding out whether there are digits in the list that is its first argument, whether there is (only one) decimal dot, and whether the list contains no other characters. If all this holds, the list is canonicalized by removing the decimal dot if it is in the very last position, or adding a 0 in front in all other cases. Thus a list starting or ending with a dot is guaranteed to be replaced by one that represents the same number but without a dot in an extreme position. Said kinds of number representation are invalid in Prolog and therefore would not be properly handled by number_chars.

expr/2 only succeeds when expr/3 succeeds instantiated with an empty list as a third argument. In general, the recursive predicate expr(As,R,Bs) succeeds when computing a reversed RPN expression in As ends by producing an overall result R and leaving out an unprocessed tail Bs of As. Consequently expr/2 is satisfied precisely when the initial token sequence contains a valid (reversed) RPN expression.

Within expr, token type recognition built-in predicates are used to tell numbers from operators in the list of tokens. In the operator handling clause, the operator =.. (which is in fact also a predicate, but here in infix form) is being made use of so that the list [T,X,Y] containing the arithmetic operator with its arguments is transformed into an arithmetic term, such as e.g. +(3,5). Evaluation of the term, as activated by the arithetic predicate is, produces the result of the arithmetic operation. (Res is E here actually stands for call(Res is E), or canonically call(is(Res,E)); application of the predicate call in contexts such as this one is so useful that it is implied).

It should be noted that some of the predicates that constitute the program, when used in isolation, are able to produce more than one result. Such is the case with gets/3, which will also be satisfied by strings consisting of more than one lines. gets and others can be modified in order to avoid those superfluous and incorrect results. This, however, is not needed when the predicates are used as here, because their behaviour is controlled by the ways the respective callers are being satisfied, so that producing the mentioned secondary results is not possible.

:- initialization(rpn).

rpn :- gets(S), rpn(S), halt.
rpn([]) :- rpn.
rpn(S) :- ((tks(S,Ts), expr(Ts,Res)) -> write(Res); write(error)), nl, rpn.

gets(R) :- gets([],R).
gets(L,R) :- getc(C), gets(L,R,C).
gets([' '|L],L,'\n').
gets([],R,' ') :- gets(R).
gets([' '|L],R,' ') :- gets([' '|L],R).
gets([],R,C) :- gets([C,' '],R).
gets(L,R,C) :- gets([C|L],R).

getc(C) :- get_char(X), getc(X,C).
getc(C,' ') :- C == '\t'.

tks(S,[Tk|Ts]) :- append(T,[' '|R],S), gettk(T,Tk), tks(R,Ts).

gettk([T],T) :- member(T,[+,-,*,/]).
gettk(T,X) :- reverse(T,Tr), sign(Tr,Ta,Sgn), numcan(Ta,Tc)
            , number_chars(Y,Tc), X is Sgn*Y.


numcan(As,Ac) :-
, length(Ld,Nd), Nd =< 1
, findall(C,(member(C,As),char_code(C,A),member(A,"0123456789")),Lg)
, length(Lg,Ng), Ng > 0, N is Ng+Nd, length(As,N)
, ((N1 is N-1, Ld = [N1]) -> append(Ac,[.],As); Ac = ['0'|As]).

expr(Ts,Res) :- expr(Ts,Res,[]).
expr([T|Ts],T,Ts) :- number(T).
expr([T|Ts],Res,Ts2) :- atom(T), expr(Ts,Y,Ts1), expr(Ts1,X,Ts2)
                      , E =.. [T,X,Y], Res is E.

length([_|L],N) :- length(L,N1), N is 1+N1.

reverse(L,R) :- reverse(L,[],R).
reverse([X|L],M,R) :- reverse(L,[X|M],R).

append([X|A],B,[X|C]) :- append(A,B,C).

member(X,L) :- append(_,[X|_],L).