Efficiency Questions

Appeared in Volume 9/2, May 1996

Keywords: style.

ok@goanna.cs.rmit.edu.au
Richard A. O'Keefe
20th February 1996

Dave Sharpe writes:
There are 10 prove/12 clauses, 6 of them are deterministic.

The desision on which prove/12 predicate fires involves a test on the first argument against some other argument. The first argument is a list of lists. For example, one deterministic desision is like:

prove([[_,X,_,_,_]|As],[[Y,_,_]|Bs], /*10 other args */ ):-
    X =< Y,
    !,
    %rest of this clause

It looks as though you are using lists way too much. You should only use lists when you have a sequence or set of elements all of the same type to represent. Here it looks as though:

the first argument of prove/12 is a list of things. Let's call them Moths. Each Moth has 5 fields.
the second argument of prove/12 is a list of other things. Let's call them Wasps. Each Wasp has three fields.

So:

prove([moth(_,X,_,_,_)|Moths], [wasp(Y,_,_)|Wasps], /*other args*/) :
    X =< Y,
    !,
    ...

You'll see that this is a lot more readable than lots of brackets, because the words tell you what you are dealing with.

It would not be faster, but it might be even clearer, to write:

prove([Moth|Moths], [Wasp|Wasps], /*other args*/) :-
    moth_before_wasp(Moth, Wasp),
    !,
    ...

moth_before_wasp(moth(_,X,_,_,_), wasp(Y,_,_)) :-
    X =< Y.

You can turn this clearer form back into the second form by using macro processing. See the discussion in "The Craft of Prolog".

Dave Sharpe writes:
Would it be (a lot) more efficient to have a single if...then...else top level clause to avoid repeated argument instantiation?

As a rule of thumb, if-then-else is okay for arithmetic tests (like X =< Y) which cannot be expressed as pattern matching, but anything that can be expressed as pattern matching in the head of a clause should be.

He continues:
If so, how does one handle deterministic and nondeterministic choices correctly? Is there a more efficient data structure than a list of lists to have as the primary argument to decide on?

Yes. But it all depends on what your data structures mean.

bimbart@cs.kuleuven.ac.be
Bart Demoen
21st February 1996

Richard A. O'Keefe wrote:
As a rule of thumb, if-then-else is okay for arithmetic tests (like X =< Y) which cannot be expressed as pattern matching, but anything that can be expressed as pattern matching in the head of a clause should be.

The "should" in this rule of thumb is very annoying: many people prefer:

p(X) :- X = a ->
                end
        ;
                do-something.

(perhaps with different indentation :-)) instead of:

p(a) :- ! , end.
p(X) :- do-something.

ok@goanna.cs.rmit.edu.au
Richard A. O'Keefe
22nd February 1996

Bart Demoen writes:
The "should" in this rule of thumb is very annoying...

I did say that anything which could be expressed as pattern matching should be, and "does not unify with 'a'" in your example cannot be so expressed.

I should make it clear that modern compilers (e.g. Kliger's compiler for LOGIX, the Mercury compiler) generate exactly the same code for:

p(X) :- X = a ...

and

p(a) :- ...

So no "efficiency" argument arises. The question is how easy it is for people to understand. As for me, I have learned to fear and distrust:

p(a) :- !, end.
p(X) :- do-something.

not on the grounds of where the unification is done but on the grounds that it:

manipulates a defaulty representation ('a' versus not-'a'), and
does so by unification rather than pattern matching, without a mode or 'when' declaration, This makes it logically unsound.

If the code were:

?- p(X) when ground(X).
p(a) :- !, end.
p(_) :- /* not 'a' */ do-something.

that would be fine. Or in the presence of a Mercury 'mode' declaration (which has scheduling force, as opposed to DEC-10 Prolog mode declarations which expresses unenforced wishes).

Having said all that,

p(X) :- X = a ->
                end
        ;
                do-something.

does not appeal to me because:

(1) I have learned to loathe large indentation steps. 2 to 5 is the extreme tolerable range for indentations, with 3 or 4 being good. Long-time readers will note that I have changed my own Prolog style from:

p(...) :-
<8 cols>q(.....),
...
<8 cols>r(....).

to

p(...) :-
<4> q(......),
...
<4> r(....).

(2) I strongly dislike indentation schemes where the indentation level of some construct does not depend on the logical structure of its context but on the length of the spelling of something. Hence:

p(X) :- X = a
01234567

is bad because if we change to:

pee(X) :- X = a
0123456789

the clause body has to shift over, whereas:

p(X) :-
<4> X = a

is controlled by the logical structure, and renaming to:

pee(X) :-
<4> X = a

does not change the logical structure, so doesn't change the indentation either. Since I like long self-explanatory predicate names, this is also related to (1); I want to be able to have predicates with 30 characters in their names without having to indent by 32.

(3) I use a wide range of languages: C, Ada, Scheme, Prolog, shells. I want a layout style which can be consistently applied across all these languages. The "comb" style I use is consistent with the recommendations in Ledgard and Tauer:

Professional Software, Volume II: Programming Practice, chapter 8, Henry Ledgard with John Tauer, Addison-Wesley 1987, ISBN 0-201-12232-4

(Actually, the whole of that volume is good value, as is volume 1.)

That style means using:

<if> <condition> <arrow>
    <body>
<elseif> <condition> <arrow>
    <body>
<else>
    <body>
<endif>

where <if>, <arrow>, <elseif>, <else>, and <endif> are replaced by different tokens or token sequences depending on which language I'm using. That's why I use:

(   X = a ->
    end
;/* X /= a */
    do-something
).

I want to stress here that these three principles are:

not peculiar to me; they are derived from the general literature on how to make programs more readable
not peculiar to Prolog.

So now the choice is between:

?- p(X) when ground(X).

p(X) :-
    (	X = a ->
        end
    ;/* X /= a */
        do-something
    ).

and:

?- p(X) when ground(X).

p(a) :- !,     % I think of ":-!" as one construct
    end.
p(_) :-	       % not 'a'
    do-something.

Can we find grounds, other than taste, for preferring one to the other?

(a) If the term to be tested were large, and if it were used in more than one place, it would be easier to get the first version right. For example,

p(X) :-
    (   X = f(g(_),i(_)) ->
        q(X)
    ;   X = f(i(_),g(_)) ->
        r(X)
    ;/* neither of the above */
        s(X)
    ).

Now this can't be done by pattern matching in Prolog syntax. But it can in ML syntax, and the adoption of this in Prolog has often been urged in the comp.lang.prolog newsgroup. If Prolog syntax permitted:

p(f(g(_),i(_)) @X) :- !,
    q(X).

p(f(i(_),g(_)) @X) :- !,
    r(X).

p(X) :- % neither of the above
    s(X).

the contest would in my mind be tipped the other way.

(b) When you are looking for the controlling test, the second version puts what you are looking for as early as possible, while the second version delays it for several tokens and at least one line.

p(a
  ^ here it is!

p(X) :-
    (	X = a
            ^ here it is!

I find this especially helpful when there are several clauses.

In this particular instance, my original recommendation (do with pattern matching what can be done by pattern matching) stands, but when either an if or a cut is required, I've found grounds for preferring one or the other:

if the test involves doing arithmetic or matches against a complicated term or a variable here would have additional uses

then an if may increase readability.