PLACE |
spec |
solutions |
Bash |
bash has been chosen, among other shells, for being the default shell on all versions of Linux, as well as other Unices.
The program, or script, starts with two necessary settings. The contents of the IFS
variable define the set of characters that are treated as delimiters in splitting strings into ‘words’. set -f
disables the so called filename expansion – a process in which certain characters in command arguments, such as *
and ?
, are considered patterns and implicitly replaced by matching filenames.
The rest of the program repeatedly executes the read
command to obtain text lines from the standard input. The -r
option is for ‘raw mode’, meaning that an input line cannot be continued (by default, \
is treated as a continuation character). The -a
option instructs to split the input into words and store them as an array. Consequently, tks
becomes an array of strings rather than a single string.
If the array turns to be 0-sized, resulting from an empty or whitespace-only input line, the rest of the loop is skipped. If not, tks
's entries are processed one by one in an inner loop, where each string tk
is pattern-matched to find out whether it represents a number, an operator, or is something else. The variable n
increases or decreases according to encountering a number or an operator, and is set to 0 on an incorrect token. An expression is syntactically correct if n
>0 at each step, and n
=1 finally holds.
As bash itself allows computing only with integers, and modelling fractional arithmetic in it is relatively tedious, an expression that is found correct is evaluated by filtering it through the dc
program. The 6k
command sent to dc
before the expression tells it to perform calculations with a precision of 6 digits after the decimal point. The p
after the expression is the dc
's printing command. The expression itself is specified as the contents of the entire tks
array, ${tks[*]}
.
[ dc is an RPN calculator language and interpreter that can be used in a terminal interactively or by writing scripts in it. It is one of the oldest Unix utilities, in fact – ‘the senior language on Unix systems’ (in the words of M. D. McIlroy) – and has been ubiquitous on Unix-like systems ever since. ]
The expressions in double parentheses in the program are a means of bash to do arithmetics and numeric assignments, which otherwise would be written in a more cumbersome way. Similarly, [ ... ]
and [[ ... ]]
are a syntactic convenience for writing down conditions of all sorts. They are similar in function, but the latter is more powerful: some operators, such as =~
here, are allowed in [[ ... ]]
but not in [ ... ]
.
Throughout the program ;
is used for command separation. If not present, an end of line is required at the same place.
IFS=$' \t' set -f while read -r -a tks; do if [ ${#tks[*]} -eq 0 ]; then continue; fi ((n = 0)) for tk in ${tks[*]}; do if [[ $tk =~ ^[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)$ ]]; then ((++n)) elif [[ $tk =~ ^[-+*/]$ ]]; then ((--n)) if [ $n -le 0 ]; then break; fi else ((n = 0)) break fi done if [ $n -eq 1 ]; then echo '6k' ${tks[*]} 'p' | dc else echo 'error' fi done
That it is possible to pattern-match against regular expressions within bash is not essential. The grep utility, standard in a Unix environment, could have been used instead. In order to check whether tk
reads as a number, one would replace the corresponding [[ … ]]
expression with
echo $tk | grep -q -E -e '^[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)$'
,
which runs grep
with the contents of tk
piped into its standard input. Similarly for checking whether tk
is an opertaor.
Also inessential is that dc
which is used to evaluate an expression understands reverse Polish notation. As the expression is already syntax checked and stored token by token in an array, it is easy to carry on the computations by following any of the forward, backward, or reduction strategies. What is indeed essential is implementing real-valued arithmetic.
The following bash script is derived from the previous one. Instead of feeding all the expression into dc
, it evaluates each single operation, as it is encountered, with its two operands and puts back the result in place of the three respective entries in the tks
array. Thus it gradually reduces the expression to a single numeric result. If insufficient number of operands is found for some operation, evaluation is abandoned and eventually an error message is printed. Evaluating an expression is successful if tks
is processed entirely (i
equals n
) and there is a single entry, a number, left in it (n
equals 1). That number is then displayed. Otherwise, i.e. in case of a wrong token, or incorrectly constructed sequence of numbers and arithmetic operators, an error message gets printed.
The inner loop is nominally infinite – a while loop with an always true condition denoted :
– and is really a post-test loop terminating upon i
becoming equal to n
. bash has no construct for expressing post-test loops directly. Oddly, it has an until-do-done
loop which only differs from while-do-done
by inverting the meaning of the condition: the latter defines when to terminate the loop but is still tested before executing the loop's body.
Special care is taken to compact tks
after an operator is applied. As only one entry must be left where there have been three, two of the array elements are given null values, and then tks=(${tks[*]})
ensures that the so created holes are removed from the array by actually constructing it anew.
For nulling variables bash offers the unset
command, so tks[i]=
could be replaced by unset tks[i]
. The difference between assigning a null value and unsetting is that the latter makes bash believe that the array's length (${#tks[*]}
) has decreased. But that is only confusing, as the ‘unset’ indices will still point at null values, and all other indices will still point at where they pointed before unset
. In other words, unsetting array entries makes them void but still present, and compacting is still necessary in order to truly eliminate them.
In this script the use of dc
has been restricted to doing single operations, and of course any other program that can perform +
, -
, *
, and /
would do just as well. An external program will not be needed at all if a shell such as ksh
or zsh
with built-in support for fractional arithmetic is used.
IFS=$' \t' set -f while read -r -a tks; do ((n = ${#tks[*]})) if [ $n -eq 0 ]; then continue; fi ((i = 0)) while :; do tk=${tks[i]} if [[ $tk =~ ^[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)$ ]]; then ((++i)) elif [[ $tk =~ ^[-+*/]$ ]]; then if [ $i -lt 2 ]; then break; fi tks[i-2]=$(echo '6k' ${tks[*]:i-2:3} 'p' | dc) tks[i]= ((--i)) tks[i]= ((n -= 2)) tks=(${tks[*]}) else break fi if [ $i -eq $n ]; then break; fi done if [ $i -eq $n -a $n -eq 1 ]; then echo ${tks} else echo 'error' fi done
boykobbatgmaildotcom