A Predictive Bottom-up Evaluator

Manuel Vilares Ferro and Miguel Angel Alonso Pardo
Email: vilares@dc.fi.udc.es, malonso@cirp.es

Appeared in Volume 8/4, November 1995

Keywords: bottom-up.

Abstract

This paper presents a recursive query processing strategy on Horn Clauses which plays, with respect to classic context-free parsing algorithms, a similar role to the well known LALR compilation schemes. Over a base structure equipped with an efficient context-free parsing algorithm based on dynamic programming techniques, we define a logic push-down automaton which breaks computations into combinable, sharable and storable sub-computations. The latter provides computation sharing and operational completeness and solves most of the problems posed by classic depth-first left-to-right traversals.

1. Introduction

Context-free parsing represents a fine starting point for understanding the computational aspects of syntactic phenomena in full first order logic based on Horn Clauses. Indeed the Prolog language that popularized them was initially intended as a powerful formalism for describing natural languages based on a context-free backbone [1]. Clear, concise and powerful, they have been largely used for implementing natural languages analyzers [2]. However, there seems to have been few attempts to see whether known theoretical and technical results on context-free parsing (CFP) could be generalized to Horn Clauses, specifically the use of push-down automata (PDA's) and the corresponding form of non-deterministic reasoning [3].

The aim of the work, which is partially reported here, is to devise general purpose algorithmic techniques to deal with formalisms within the continuum that ranges from propositional Horn logic to full first order Horn logic, in practical applications. To attain this goal, we have chosen to work in the context of logic push-down automata (LPDA's) [4], a formal logical engine for the execution of definite clause programs (DCP's) that generalizes the dynamic programming aspects of earlier evaluation strategies. A LPDA is essentially a PDA, in general non-deterministic, that stores logical atoms and substitutions on its stack, and uses unification to apply transitions.

We propose a very simple technique to avoid, in most practical cases, evaluation conflicts by adding look-ahead to items in LPDA's following a technique close to that applied by LALR machines in CFP, the lower end of the continuum of Horn-like formalisms. It is possible that not all of the material in this paper is new. We have, however, felt the need for a consistent and coherent exposition.

2. An Informal Overview

The abstract operational model represented by LPDA's allows most of the execution strategies of Definite Clause Programs (DCP's) to be unified in the same framework, permitting easy comparison and performance analysis. In relation to this, practical experience has shown that the most efficient methods [4] seem to be those bottom-up approaches including a predictive phase in order to restrict the computation to a useful part of the search space. This relies again on the CFP theory, where the consideration of efficient determinization techniques, such as the classic LR ones, differentiate the syntactic contexts in function of the state where the parsing process is, by virtue of the application of a predictive technique in the generation of the parser.

The use of bottom-up evaluation with top-down predictive control is not new. The Magic Set [5, 6, 7, 8] techniques reduce the number of useless intermediate results produced by using a goal-oriented bottom-up evaluation. The main drawback of this approach is the extra cost these methods require. In effect, the addition of predictive control is made by re-writing logic programs and introducing new predicates. In order to avoid this, we have extended a simple bottom-up evaluation strategy [3] on LPDA's by including a predictive control directly derived from unrestricted CFP [9] in dynamic programming. This control is given, in the simplest version, by a LALR(1) automata, possibly non-deterministic, directly computed from the context-free grammar Gf obtained by considering a parsing rule for each clause in the intensional database. Using the simplest technique this construction is made by keeping only the functors of the logic terms.

In order to describe the recursive query processing strategy, given a clause Yk: Ak,0 :- Ak,1, ..., Ak,nk, we introduce:

The vector Tk of the variables occurring in Yk.
The predicate symbol Vk,i. An instance of Vk,i(Tk) indicates that all literals from the ith in the body of the clause Yk have been proved.

We can now formalize the compilation scheme by the transitions:

for all Yk:

Ak,nk {Ek} --> Vk,nk(Tk) Ak,nk
Vk,i(Tk) Ak,i --> Vk,i-1(Tk), i a member of [1,nk]
Vk,0(Tk) --> Ak,0

for the reduction mode, and

4. for all Yk Ak,i {Ek,i} --> Ek,i Ak,i, i a member of [0,nk)

for the scanning mode. In each case, {Ek} makes reference to the extra control conditions established by the LALR(1) scheme associated to the grammar of functors Gf. So, in the reduction mode, {Ek} and {Ek,i} represent the conditions over the look-ahead in the production of Gf corresponding to Yk. In the scanning mode, this condition represents the verification of the existence of a term with the same functor as Ak,i+1 in order to avoid useless push actions. Briefly, we can interpret these transitions as follows:

1. Selection of a clause: When the extra condition {Ek} is verified and taking into account the literal Ak,nk on top of the stack, select the clause Yk whose head is to be proved; then push Vk,nk(Tk) on the stack to indicate that none of the body literals has yet been proved.

2. Reduction of one body literal: The position literal Vk,i(Tk) indicates that all body literals of Yk following the ith have been proved. Now, for all stacks having Ak,i just below the top, we can reduce them and in consequence decrement the position literal.

3. Termination of the proof of the head of clause Yk: The position literal Vk,0(Tk) indicates that all literals in the body of Yk have been proved. Hence, we can replace it on the stack by the head Ak,0 of the clause, since it is now proved.

4. Pushing literals from the database: The literal Ek,i is pushed onto the stack, assuming that literals will be needed in reverse order for the proof.

To shorten the description of transitions we have not detailed the set of possible simplifications, some of them have already been specified in previous work [4] and a lot of them have been derived from the particular strategy considered.

The technique described has also been generalized to a more complex schema by extending the concepts of firstn and follown from classic LR parsing to DCP's in a natural manner [9], extending context-free derivation by using unification in a depth-first study of the database. Practical tests have shown that, applying simple prediction schema as LALR(1), the percentage of useless intermediate results not generated is similar to that achieved in corresponding CFP analyzers.

References

[1] A. Colmeraeur, 'Metamorphosis Grammars', in Natural Language Communication with Computers.LNCS 63., L. Bolc ed., Ed., 1978.

[2] F.C.N. Pereira and D.H.D. Warren, 'Parsing as Deduction', in Proc. of the 21st Annual Metting of the Association for Computational Linguistics, 37 - 144, Ed., Cambridge, Mass., USA, 1984.

[3] B. Lang, 'Towards a Uniform Formal Framework for Parsing', in Current Issues in Parsing Technology, M. Tomita ed., Ed., pp. 153 - 171. Kluwer Academic Publishers, 1991.

[4] E. Villemonte de la Clergerie, Automates à Piles et Programmation Dynamique, PhD thesis, Univ. of Paris VII, France, 1993.

[5] F. Bancilhon, D. Maier, Y. Sagiv, and J. Ullman, 'Magic-set and Other Strange Ways to Implement Logic Programs', in Proc. of the 5th ACM Symp. on Principles of Database Systems, 1986.

[6] U. Nilsson, 'Abstract Interpretation: A Kind of Magic', in Proc. of PLILP'91, 1991.

[7] R. Ramakrishnan, 'Magic Templates: A Spellbinding Approach to Logic Programs', in Proc. of the 5th ICLP, 1988.

[8] H. Seki, 'On the Power of Alexander Templates', in Proc. of the 8th ACM Symp. on Principle of Database Systems, 1989.

[9] M. Vilares Ferro, Efficient Incremental Parsing for Context-Free Languages, PhD thesis, Univ. of Nice, France, 1992.

Manuel Vilares Ferro
Computer Science Dept
Univ. of Corunna
Campus de Elviña S/N, 15071 A Coruña, Spain
Email: vilares@dc.fi.udc.es
 
Miguel Angel Alonso Pardo
Ramón Piñeiro Linguistic Research Center
Santiago-Noia Road, 3 Km, A Barcia
15896 Santiago de Compostela, Spain 
Email: malonso@cirp.es

Partially supported by the Eureka Software Factory project, and by the Autonomous Government of Galicia under projects XUGA10501A93 and XUGA20403B95.