A Metaphor for Prolog Procedures using Difference Lists

Appeared in Volume 6/2, May 1993

Keywords: lists.

Philip R. Nachtsheim

Introduction

In list processing, certain applications of the technique of accumulator passing are related to applications of difference lists. However, because of the way that these tools are expressed the relationship is obscured.

Using Prolog's definite clause grammars to formulate procedures dealing with applications of difference lists provides a convenient programming structure to clarify the relation between the aforementioned tools.

A metaphor based on certain features of the grammar rules is proposed as an alternate formulation of program layout for procedures dealing with difference lists. This alternate formulation permits the development of efficient code by using the structured difference lists inherent in the grammar rules.

Formulation

Two useful features of the grammar rule expansion will be exploited to develop a strategy for programming with difference lists.

The first feature is that the expansion can be regarded as a transformation which allows us to partition a list as if it were an array or sequence.

Consider the resulting expansion of:

list --> [a,b,c]

to:

list(S0,S) :- c(S0,a,S1), c(S1,b,S2), c(S2,c,S).

The predicate c(S1,X,S2), read as "point S1 is connected by terminal X to point S2", is defined by the unit clause c([H|S],H,S).

Here list(S0,S) is regarded as a subsequence of the sequence starting at the point S0. A subsequence will include at least two arguments Start and End. The elements of the subsequence or segment will include the elements pointed to by Start up to but excluding the elements pointed to by End. The query "list(Start,End), End = []." will display the segment (S0,S) above as an ordinary list with Start being bound to [a,b,c].

The second useful feature is that the expansion connects one segment to another.

list --> [a],[b,c].
is equivalent to
list --> [a,b,c].

It is as if [b,c] was appended to [a].

The above features allow the following two procedures to be formulated as grammar rules. The first is 'reverse':

reverse([H|T]) --> reverse(T), [H].
reverse([]) --> [].

The expanded form which exhibits the difference lists is shown below.

reverse([H|T],S0,S) :- reverse(T,S0,S1), c(S1,H,S).
reverse([],S,S).

reverse(Rem,Rev,Sofar) is true if the elements of Rem (reversed) are contained in the segment between the sequence Rev and the sequence Sofar; or, equivalently the sequence Rev is the result of appending the sequence Sofar to the elements of Rem (reversed). The last two arguments of the predicate 'reverse' represent difference lists. The first of these arguments is the head and the second the tail of the difference list.

Unfolding the 'c' predicate yields for the first rule:

reverse([H|T],S0,S) :- reverse(T,S0,[H|S]).

reverse(Rem,Rev) where Rev is the reverse of Rem is defined by the relation:

reverse(Rem,Rev) :- reverse(Rem,Rev,[]).

A reading of the reverse rules as a grammar is possible.

Now for the metaphor: regard the additional arguments which appear when the grammar rules are expanded as pointers to an array of records of variable length. Each predicate term of the grammar rule corresponds to a record. Because the records are of variable length each term contains a pair of pointers Start and End say, one to indicate where it starts and another to indicate where it ends and possibly where the next record will start. The record contains the output of the term. The record or segment corresponding to the term is a subsequence of the array of records. It consists of those elements starting with and including the the sequence Start and up to but excluding the sequence End.

The analogy we are suggesting is that grammar rule term literally denotes the segment of the array which represents the output contribution of that term. In the case of 'reverse' above, regard reverse(T) as an identifier of the segment (S0,S1) where reverse(T,S0,S1) is the expansion of reverse(T). Likewise regard [H] as an identifier of the segment (S1,S) where c(S1,H,S) is the expansion of [H]. One term followed by another denotes the concatenation of the output of the individual terms. Hence, the segment (S0,S) is denoted by reverse([H|T]). It is obvious that this metaphor only applies to those procedures whose output consists of a single list. In the spirit of this metaphor, perhaps we should call the records additive arrays rather than difference lists.

As pointed out by O'Keefe (1) the advantage of the grammar rule notation is that the last two arguments of the predicates, which are being used in a very rigid and predictable manner, don't appear in the code, so it is easier to see the interesting part.

Even though the grammar rules provided a convenient programming structure which hides the last two arguments of the predicates, for each procedure, a "front-end" must be provided as ordinary Prolog code to display the arguments we have been hiding. In addition to the input arguments which appear in the grammar rules two additional arguments must be supplied to the "front-end": a pointer to indicate where the output will start, and a pointer to indicate where it will end. In order to terminate the output record simply supply [] as the second additional argument.

Accumulators and Difference Lists

As an application of the analogy, the difference list formulation of 'reverse' will be compared to the version of 'reverse' which uses an accumulator.

The technique called accumulator passing in Prolog is used to simulate a mutable variable. An accumulator pair consists of the accumulator as an input parameter which contains the current value and the output parameter which is called a result by O'Keefe (1). So a mutable variable is simulated by two Prolog parameters. A common example to illustrate accumulator passing is list reversal. In the procedure 'reverse' above Sofar serves as the accumulator and Rev as the result.

When building lists when the accumulation is taking place in the body of the procedure as in 'reverse' above, there is no difference between the accumulator formulation and the difference list formulation. Although this is a restricted use of of an accumulator, it is an important special case. Pedagogically, this result relates the two concepts. Prolog texts usually treat accumulators and difference lists separately. Cosmetic difference also obscure the relation. Difference lists are usually expressed as a compound term whereas an accumulator pair is usually expressed as two separate, adjacent parameters of a predicate. In addition, the versions of 'reverse' appearing in Prolog texts usually have the accumulator parameters transposed as compared to version of 'reverse' shown above so as to have "result" as the output parameter.

Examples

The 'reverse' prototype can be used as a model of more general procedures that return results in either order. Also these procedures in some cases can be used to build difference lists.

Some "one liner" examples will illustrate the use of the grammar rules. "One liner" means that the results of the terms to be concatenated will appear on one line.

Consider the very frequent occurrence of choosing an alternative based on the outcome of a test. For example consider the procedure for the intersection of two sets. That is intersection([a,b,c,d,e,f],[d,e,f,g,h,i],[d,e,f]) is true.

intersection([H|T],L) --> {member(H,L)},[H],intersection(T,L).
intersection([H|T],L) --> {not member(H,L)},intersection(T,L).
intersection([],L) --> [].
intersection(L1,L2,X) :- intersection(L1,L2,X,[]).

The test is performed inside the curly brackets and is not part of any sequence; if the test is successful the segment corresponding to the left hand side will consist of the segments corresponding to [H] and intersection(T,L). In effect we have added the element H to the elements in the segment corresponding to the term intersection(T,L).

No output variables appear in the first three grammar rules. The final clause acts like a function call. In the first rule the last two terms could be transposed resulting in the reverse of the previous output. Further, the components of a difference list could be returned by the query intersection(L1,L2,Head,Tail).

Another classic application of difference lists is flattening a list. That is flatten([[a],[b,[c,d]],e],[a,b,c,d,e]) is true.

flatten([H|T]) --> flatten(H),flatten(T).
flatten(H) --> {atomic(H),not H = []}, [H].
flatten([]) --> [].
flatten(X,Y) :- flatten(X,Y,[]).

An interesting aspect of this program is that if the two terms in the body of the first rule are transposed, the result is the reverse of the result of the original version. This is in contrast to the difference list version of 'flatten' in Sterling and Shaprio (2). There the same result is obtained when the terms are transposed. In their version the difference list structure is explicit in the code, and transposing the terms merely alters the order in which goals are executed.

Conclusions

An alternate formulation of program layout and style using difference lists is proposed. Usually the use of difference lists is motivated by avoiding appending lists and is treated as an advanced topic. This alternate formulation permits the development of efficient code by using the structured difference lists inherent in the grammar rules. Also the relation of accumulator pairs to difference lists is discussed.

References

1. O'Keefe. R.A. The Craft of Prolog, The MIT Press, Cambridge, Mass. 1990.

2. Sterling. L. and Shaprio. E. The Art of Prolog, The MIT Press, Cambridge, Mass. 1986.

Philip R. Nachtsheim
AI Research Branch
Ames Research Center, NASA
Moffett Field, CA 94035, USA
Email: nach@ptolemy.arc.nasa.gov