Pure

Pure (for Pure Universal Rewriting Engine) is a strongly and dynamically typed functional programming language based on term rewriting. It is the much improved successor to the Q language by the same author – see the link below. (That Q is not to be confused with the current incarnation of K.)

Pure is strongly influenced by Haskell with respect to its syntax, some of its constructs, and its standard library, but in the rest is considerably different.

Overview

The basic datatypes of Pure include integers, floating-point and complex numbers, and symbols. Aggregate data is represented by strings, lists, matrices, tuples, records, and terms.

Lists are unidirectional (‘consed’) and heterogeneous. Strings are treated like lists, although single characters do not constitute a separate datatype – apparently assuming that in most cases of operating on individual characters one can make do with strings of length one.

Tuples are also heterogeneous but flat – no nesting is allowed – which is somewhat limiting. On the positive side, certain operations on tuples get simplified through this; for example, concatenation of tuples is no different than tuple construction. Tuple elements can be referenced by ordinal numbers just like list elements.

Records are similar to tuples in being flat and ordered (so that their elements, too, can be referenced by ordinal numbers), but consist of associative pairs. An associative pair has the form key => value where the key is either a symbol or a string. A property specific to records is that they can be subscripted by keys to produce the associated values. Thus records can be viewed as associative tables. A record can be produced by extending or reducing another record, or by means of converting a list or a tuple.

=> is not a record-related syntax but a regular operator. As such, it can be used in general expressions as well as in patterns, with all the implications regarding the interpretation and the uses of terms in the language.

In order to facilitate numeric applications, Pure directly supports computations with matrices. For example, it offers matrix comprehensions, analogous to list comprehensions. With respect to its vector and matrix features, Pure borrows much from the Octave language for numerical computing.

Symbolic matrices are also handled. Records are considered matrices, so matrix operations, matrix comprehensions etc. are available for manipulating records.

Terms are the data structure that is most characteristic of the language. Roughly speaking, a term is any syntactically correct expression which contains parts that, due to the presence of unbound symbols, cannot be evaluated. More is said on terms in relation to rewriting below.

Pure is fully equipped with the machinery that one is used to expect from a modern functional language: anonymous and higher-order functions, local definitions, closures, partial application, as well as syntactic conveniences such as pattern matching, guarded expressions, operator sections and list comprehensions. Evaluation is eager by default but lazy mode is possible through applying a deferring operation. There is a standard library with a ‘prelude’ part that is made available to the program by default, and other parts includable as needed. The prelude provides a set of operations, mostly on lists, whose semantics and even names closely follow those of Haskell.

Some functions and special forms are built-in. Among these there are facilities for metacircular interpretation and exception handling.

Other important features of the language are: hierarchical namespaces with public and private members; user-defined operator syntax (including precedence and associativity); a very simple module system (not unlike C's); compile-time macros.

Pure is not purely functional but includes certain constructs for imperative programming.

Despite its apparent syntactic similarity to Haskell, there is no offset rule in Pure: the language is free-syntax, with semicolons used as terminators or separators for many constructs.

The Pure interpreter offers an advanced working environment, with a REPL, a debugger, and other useful features. The programmer can interactively include modules, add definitions, and evaluate expressions. Definition levels are provided so that one can manipulate groups of definitions by covering and exposing them according to a LIFO discipline. This completeness of the programming environment is in contrast with the environments for Haskell and other statically typed functional languages.

Rewriting

A distinguishing feature of Pure is its ability to do symbolic computations, the most conspicuous manifestation of this being term rewriting.

Similar to other functional languages, a program is a collection of defining equations making use of pattern matching. Matching a pattern on the left-hand side of an equation yields the corresponding right-hand side. However, the matching-based evaluation in Pure is more general in two important respects. First, the equations, rather than being just function definitions, are general transformational rules: an expression is being matched against a l.h.s. through structural decomposition, as if all functions (and operators) in it are data constructors in a language with constructive types such as ML or Haskell. Second, the r.h.s. of an equation may contain undefined symbols, and consequently, undefined or partially defined terms of any complexity, so it does not necessarily get fully evaluated.

For example, having defined x^2-y^2 = (x-y)*(x+y), all expressions matching x^2-y^2 get transformed according to the rule; thus e.g. (a+3*b)^2-(u/v)^2 yields (a+3*b-u/v)*(a+3*b+u/v). Only as much is done to transform an expression as a rule prescribes, so, by the same rule, (a+b)^2-b^2 yields (a+b-b)*(a+b+b), and (a+3)^2-(3+b)^2 yields (a+3-(3+b))*(a+3+(3+b)) (assuming that a and b are undefined). Several rules can apply in sequence to transform an expression. In the presence of several rules, or when a single rule can be applied more than once, an expression gets transformed until none of the rules is applicable.

Undefined symbols can represent data items, data constructors, or both, the actual use depending on conventions implicitly set up by the programmer. For example, by choosing nil and cons as analogues to [] (the empty list or the constructor of it) and (:) (the non-empty list constructor), an expression such as cons "this" (cons 1 (cons here nil)) can be interpreted as representing a list, similar to the one in a standard form: "this":1:here:[] (["this",1,here]). One way to build the above expression is by evaluating foldr cons nil ["this",1,here] (foldr is a standard library function like the one in Haskell and other languages). In fact, the function foldr cons nil transforms any list to its cons & nil counterpart.

In general, symbols not bound to values represent themselves and thus are being preserved in evaluating an expression. An unbound (undefined) symbol that is syntactically in a function position can be regarded as a data constructor, as with cons in the given example. An unbound symbol in an ‘argument’ position can be thought of as either a zero-argument constructor or a constant.

Note that, as there are no type declarations in Pure, the just described notion of data constructor is the only one in the language, but it is effective due to the fact that expressions with unevaluated parts are legal data objects (terms) in the language. Moreover, Pure makes no distinction between functions (or operators) and constructors other than that the latter do not really compute anything. In all other respects functions and constructors are treated equally. For instance, a constructor can be passed as an argument to a function wherever that argument can be a function – as with cons in the call of foldr above.

A constructor can be thought of as a function name for which there are no defining equations. Conversely, a name of a function or an operator acts as a constructor when the expression of which it is a part cannot be transformed (no equations apply to its terms).

Rewriting combines with other language features for effectively building data structures that are not easily achievable within a conventional static type system. Adding the following equation to the program:
cons x::int (cons y::int rs) = cons y (cons x rs) if x>y;
has the consequence that all lists of integer values constructed by means of cons and nil automatically get arranged in ascending order: in result we have built a datatype of ordered lists of integers. Construction of lists with other kinds of elements is not affected. The type-restricted application of the rule is specified through attaching type qualifiers (::int) to the respective parts of the pattern.

An important point to stress here is that there is nothing specific to constructors in the above example. Functions or operators could be used just as well. With the rule
x::int : y::int : rs = y:x:rs if x>y;
all built-in lists of integers automatically get sorted in ascending order.

Constructors can be used as type qualifiers in just the same way as the names of the built-in types. Thus programmer-defined types, as embodied in constructors, are indistinguishable from those predefined in the language.

As new defining equations can be added anywhere in the program, it is possible to extend or refine any operator or function, including the built-in and library ones, and do that in arbitrary order. In particular, functions can be made highly (ad hoc) polymorphic.

Sometimes the applicability of a set of transforming rules must be restricted to a chosen part of the program. One way to do this is the function reduce, which transforms an expression by augmenting the set of currently effective rules. For example,
reduce u-v with u = a; v = a-b end
produces the result a-(a-b) of transforming u-v by substituting a and a-b respectively for u and v. Similarly, the call
reduce [k*u,u+v] with u = a-b; v = a/b end
produces [k*(a-b),a-b+a/b] by substituting u = a-b and v = a/b in [k*u,u+v].

As terms are data objects, the term to be transformed can be the value of a variable. The last example can be written equivalently
reduce t with u = a-b; v = a/b end,
provided that t's value is [k*u,u+v].

It is convenient to wrap the transforming rule(s) along with a call of reduce, and then apply the resulting function to whatever terms we need to. Having defined
dv = reduce with (p+q)/r = p/r+q/r end,
e.g. dv ((a+i*b+j*c+k*d)/2) yields (applying the rule thrice) a/2+i*b/2+j*c/2+k*d/2.
An effective (although inefficient) list sorting function can be defined by means of a local rule:
sort = reduce with x:y:rs = y:x:rs if x>y end

Links of Relevance

Home page of Pure

Home page of Q, the predecessor of Pure

boykobbatgmaildotcom