Alias Variables

Appeared in Volume 9/1, February 1996

Keywords: variables.

thomasl@groucho.csd.uu.se
Thomas Lindgren
25th November 1995

S.Bharadwaj Yadavalli writes:
Can somebody kindly point me to (probably a set of ) references that explain in detail, alias variables in Prolog and flow analysis work that tries to deduce this information at compile-time ?

Okay, and I will also take the opportunity to tell the readers why this is important. Some references at the end of this message.

Detecting when variables are aliased is useful mainly in two contexts in Prolog: for mode analysis and for parallelization that relies on parallel goals not interfering with each other.

Aliasing appears when two program variables may or must refer to the same variable (i.e., may/must share unbound variables at the actual execution of the program).

For modes, consider the example:

main :- p(A,B), write(A), write(B).
p(X,Y) :- X = Y, X = a.

We want to detect the mode of Y on exiting p/2.

If we (wrongly) ignore the fact that X and Y are aliases, then we can reason that Y is unbound after p/2, since it is not bound to anything non-variable while executing p/2. Thus, we would claim the exit mode of p/2 to be p(+,-), which is wrong.

On the other hand, if we keep the fact that X and Y actually refer to the same value (a "must-alias"), then we see that when X is bound, so is Y.

Thus, we find that the exit mode is p(+,+), which is correct.

For correctness, we are required to trace MAY-aliases, however. Take the following:

main :- p(A,B), write(A), write(B).
p(X,Y) :- ( X = Y ; true ), X = a.

Once again, we want the mode of Y after p/2. But when X = a is executed, can we say Y is definitely bound? No, because we do not know which arm of the disjunction was executed (e.g., due to backtracking). So, to either say that the exit mode is p(+,-) or p(+,+) is wrong.

Instead, we say that X and Y MAY be aliased after the disjunction. When X is bound, Y MAY be bound (or it may not). So, we change the mode of Y into "?" when X is bound, and get the exit mode p(+,?).

To summarize: for mode analysis, aliasing is required for correctness (if we want to track unbound variables / "-" modes) and useful for precision.

For parallelization, such as in &-Prolog, the crucial issue is that goals that share unbound variables at runtime may influence each other in a way that diverges from sequential execution. To avoid this, &-Prolog parallelizes only goals that will not share variables at runtime. This is found by aliasing analysis (or 'sharing analysis' as it is sometimes known).

For instance, we have the program:

p(In,Out) :- 
    q(In,Q1,Q2), r(Q1,In1), 
    s(Q2,In2), t(In1,In2,Out).

How can this clause be run in parallel with the above restrictions? Let us assume that analysis of q/3 yields that Q1 and Q2 are independent, i.e., they can never share unbound variables at runtime. In that case, we can parallelize the clause as follows:

p(In,Out) :- 
    q(In,Q1,Q2), 
    ( r(Q1,In1) & s(Q2,In2) ), 
    t(In1,In2,Out).

How do we decide whether two program variables can be bound to terms that share variables at runtime? The simplest way is similar to the may-alias one I described above, but there are many more sophisticated analyses.

I recommend the following papers:

S.K. Debray, Efficient dataflow analysis of logic programs, Journal of the ACM, Vol 39, No. 4, October 1992.

S.K. Debray, Static inference of modes and data dependencies in logic programs, ACM Transactions on Programming Languages and Systems, Vol. 11, No. 3, July 1989, pp. 418-450.

S.K. Debray, D.S. Warren, Automatic mode inference for logic programs, Journal of Logic Programming, Vol. 5, No. 3, 1988.

K. Muthukumar, M.V. Hermenegildo, Compile-time derivation of variable dependency using abstract interpretation, Journal of Logic Programming, 1992:13:315-347.

There are many, many others, but I refer you to the collected volumes of JLP, MIT Press, etc., for those.