The Early Days of Logic Programming:
A Personal Perspective

Maarten van Emden
University of Victoria
Canada

Editor: Enrico Pontelli



Foreword

These reminiscences were prompted by the ones that appeared in the ALP Newsletter in the past year. Mine are not restricted to logic programming. As an abstract conceptual structure, logic programming is more self-contained than it is as a community of researchers. The people who are attracted to the field are motivated by the applications in which Prolog is useful or by the areas of computer science that interact with logic programming. In the story that follows, it seems natural to say something about AI in Edinburgh and about the methodological conflict that gave rise to logic programming. In my early experience, the idea of logic programming was intertwined with programming language research, program specification, program verification, programming method, and, when Prolog came along, even with programming.


The Story

One dark morning in October 1972 in Edinburgh, Bob Kowalski sat me down on the couch in his living room. "I'll show you something interesting. We call it programming in logic."

I had flown in from California the day before to take up my appointment at the Department of Machine Intelligence under the direction of Donald Michie. Bob and Danusia had put me up for the night. The last time I had seen them was in the early summer of 1971, when they passed through Holland and Bob gave a talk at the School of Architecture in Delft. Among logic programmers I'm unusual in that my wife and I had been friends of the family for a few years. So for me Bob came before logic programming rather than the other way around.

Though I was a mere student, and a rather undeserving one at that, the Mathematical Centre in Amsterdam had sent me to the 1968 IFIP conference in Edinburgh. In those days, when computer science was small, the triennial IFIP conferences were a big deal and covered all of the field.

I had gotten interested in Edinburgh because of an article in the New Scientist that reported how Rod Burstall and Robin Popplestone were building, all by themselves, an operating system for an Elliott 1100-series machine to support multiple terminals with users engaged in interactive sessions in POP-2, a language of their own design. They called it Project mini-MAC as a challenge to the huge Project MAC at MIT: to show that it could be rivalled by two brainy Brits and a standard computer. These were the days when Donald Michie catapulted Edinburgh into the top rank of Artificial Intelligence research.

If it hadn't been for the war, Michie would have been an undergraduate in the early 1940s. Instead, the military detailed him to work at Bletchley Park in a group that included people such as Max Newman, Jack Good, and Alan Turing. The latter was intensely interested in chess. As Michie recounts it, he was the only one in the group bad enough at the game to be able to bear playing with Turing. In addition, there were conversations that left Michie inspired by the notion of Machine Intelligence.

This would have seemed a flying start in computers and AI. However, AI existed only as a bee in the bonnets of Turing and Michie. For Turing there was an opportunity to continue in computers after the war; as it turned out, a frustrating half-opportunity. Michie was just a school-leaver caught out by the war. For him it was time to start a university education. At the time, the gap between any undergraduate programme and anything to do with computers was huge, even undefined. Michie went into medical research and on to a readership at the University of Edinburgh. By the mid-sixties, computers, and even AI, had appeared on the radars of university administrators and funding agencies. Michie took the opportunity to start AI in Edinburgh.

I had planned to do some detective work during the IFIP conference to ferret out this "project mini-MAC". I had not noticed the AI setting of the project. I was crazy about programming. I was used to punching my Algol programs on paper tape, which I left at a desk where I would pick up my output a few hours later. I was excited by the prospect of interactive computing in a language that went beyond Algol.

The detective work turned out to be unnecessary: an invitation to attend a demo fell out of the conference registration papers. Unfortunately, the demo was not a success. The telephone connection with the Elliott 11XX failed. As the only one who showed up, I got a lot of attention from Ray Dunn and the rest of the team. Michie himself made a cameo appearance. As a consolation Ray took me out to lunch and that is all I got to see of project mini-MAC.

The next spring I applied for a British Council summer visitorship where I specified as target the Experimental Programming Unit in Edinburgh, which is where project mini-MAC lived. The Council accepted. In addition to agreeing to host my visit, Michie asked me for a contribution to the 5th Machine Intelligence workshop to be held in Edinburgh in the summer of 1969. All the things that didn't work in 1968 did work in 1969: POP-2 was running dependably and I had my first experience of interactive computing. In retrospect I realize that this first summer I was just using the subset of POP-2 that corresponds to Algol 60.

Early in the visit Michie invited me for dinner at his home, where I met Alan Robinson. Alan's resolution inference for clausal logic had created a great deal of interest all over the world. The greatest concentration of that interest was in Edinburgh. Robinson had just accepted a position at Syracuse University, the first year of which was a sabbatical spent mainly in Edinburgh.

Given resolution, Kowalski's research programme seemed clear: to reduce, or even eliminate, redundancy in the resolution search space and, in this reduced search space, to employ effective search strategies. As logicians had hardly paid attention to search, it was natural to look to Operations Research. Accordingly, Kowalski collaborated with Bob and Leslie Daniels, who worked in that area. Applying these methods in logic added the excitement that logic is a more flexible modelling tool than Integer Programming, and promised to extend computer methods beyond the traditional commercial and military applications of Operations Research. To Design, for example.

The general idea of computation as deduction was very much in the air. In his paper in the 1968 Machine Intelligence workshop, Cordell Green had outlined various computational uses of logical inferences, modes that we now recognize as querying relational databases, program synthesis, and the use of a clausal sentence as a program.

I returned to Edinburgh in the summer of 1970 and the sixth Machine Intelligence workshop. I discovered the non-Algol parts of POP-2 and learned (from myself) to program with trees (if only I had had the good sense to read Knuth volume I; however, I had heard of Knuth only via people in Amsterdam of whom I had a low opinion, and the name was not heard in Edinburgh).

The run-up to the workshop was enlivened by telegrams from Seymour Papert at MIT announcing on alternating days that he was (was not) coming to deliver his paper entitled "The Irrelevance of Resolution", a situation that caused Michie to mutter something about the relevance of irresolution. The upshot was that a student named Gerry Sussman appeared at the appointed time. It looked as if this was going to be his first talk outside MIT. His nervousness was compounded by the fact that he had been instructed to go into the very bastion of resolution theorem proving and tell the assembled experts how totally misguided they were in trying to get anything relevant to AI with their chosen approach.

I had only the vaguest idea what all this was about. For me theorem proving was one of the things that some people (including Kowalski) did, and I was there for the programming. If Bob and I had anything in common, it was search. Accordingly I skipped the historic Sussman lecture and arrived late for the talk scheduled to come after Sussman's. Instead, I found an unknown gentleman lecturing from a seat in the audience in what I thought was a very English voice. It turned out that a taxi from the airport had delivered Seymour Papert after all, just in time for the end of Sussman's lecture, which was now being re-done properly by the man himself.

The effect on the resolution people in Edinburgh of this frontal assault was traumatic. For nobody more so than for Bob Kowalski. Of course there was no shortage of counter objections, and the ad hoc creations of MIT were not a pretty sight. But the occasion hit hard because there was a sense that these barbarians had a point.

Kowalski's apparent research programme narrowed to showing that the failings so far of resolution inference were not inherent in the basic mechanism. He took great pains to study PLANNER and CONNIVER carefully. And painful it was. One of the features of the MIT work was that it assumed the audience consisted of LISP programmers. For anybody outside this circle (Kowalski most definitely was not a LISP programmer), the flavour was repellent. But even these sensational events did not impel me to find out what this resolution thing was all about.

Back home in Amsterdam the most exciting news was the sabbatical in 1969 - 1970 of Dana Scott in Amsterdam and his work with J.W. de Bakker on the semantics of programs. Again, no contact with what I was doing, but something of intense interest for my friend W.P. de Roever.

In 1971 I spent another summer in Edinburgh, at the end of which I left to take up a post-doctoral fellowship at IBM research in Yorktown Heights. There I broadened my programming experience, learning APL and Lisp. IBM may seem an unlikely place to pick up Lisp, but there it was. There were even some advantages in that respect compared to bastions of Lisp such as MIT and Stanford.

In Yorktown Heights, Lisp led a peripheral, even somewhat underground existence. Its official role was as an implementation tool for a symbolic computation package. Fred Blair, the Lisp guru in the symbolic computation group, explained to me that McCarthy had gotten Lisp about 85% right, but that functions in Lisp were still not first-class citizens. He was going to make sure that IBM's Lisp would get it one hundred percent right. It is this difference that makes it appropriate to distinguish Lisp from the functional programming ("Lisp done 100% right") that arose in the 1970s. Though IBM's Lisp project may have bitten the dust soon after, Sussman's Scheme would have met with Blair's approval.

Functional programming got to Yorktown Heights in the person of Bill Burge, a reclusive Englishman who had worked on theoretical aspects of programming languages with Peter Landin and Rod Burstall in England. Thus I was exposed to this group's work via two routes: via Burstall and POP-2 in Edinburgh and via Burge and McG in Yorktown Heights. McG was the name of an experimental functional programming language based on Landin's SECD machine, implemented in Yorktown Heights by Burt Leavenworth and Burge. At that time one could not take it for granted that something like this would actually be usable. But in Yorktown Heights in 1972 every office had a terminal connected to the interactive CP/CMS system on an IBM 360-67. One of the things I learned from Leavenworth: the beautifully simple Quicksort program, more about which later.

Meanwhile my friends in Amsterdam had been sending me interesting things. Apparently Dana Scott had been coming back to Amsterdam. One of the results of this was a fat pack of lecture notes, all in beautiful handwriting, containing, among other things, the first model of the lambda calculus. Though it was completely over my head, I have reverentially carried it along for many years. The other thing that arrived from Amsterdam was a paper by de Bakker and de Roever applying the Scott - de Bakker fixpoint ideas to binary relations as a model for the execution of imperative programs. This was much easier reading. This brings us up to early summer 1972.

As a follow-up to our session on Bob's couch upon my arrival in Edinburgh in the fall of 1972, Bob arranged a few meetings as a sort of crash course in resolution and programming in predicate logic for new arrivals. These included David Warren and Austin Tate, two new students who were also to work under Michie.

David and I attended Bob's crash course. If there were any others, they couldn't have been more than one or two. By November it was clear that the fixpoint characterization applied to logic programs. In the spring of 1973 Bob and I made good progress with this work. I found it very exciting. I learned about the compactness theorem and we used it. Bob was not happy with it. He had the feeling it could be simpler. When we split for the summer, he suggested I leave it with him to see if he could find a more satisfactory exposition. By the time I returned, he had found the beautifully simple one-step inference characterization that can be found in our JACM paper.
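
For the reader who wants to see it: the characterization in question is what is now known as the immediate consequence operator. In a paraphrase of the notation of the JACM paper, the operator T associated with a program P maps a set I of ground atoms to

        T(I) = { A : A <- B1,...,Bn is a ground instance of a
                     clause of P, and B1,...,Bn all belong to I }

and the least Herbrand model of P, the declarative meaning of the program, is the least fixpoint of T, reached by iterating T starting from the empty set.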

In the course of 1973 Colmerauer and Roussel visited. They reported on their Prolog implementation, which had control annotations on the clauses. Our reaction was lukewarm: it seemed too far removed from logic.

I enjoyed programming in logic, though we couldn't run anything. We had learned how to append lists from the Marseilles people. When I told Bob my latest brainwave:

        member(x,y) <- append(u,x.v,y)
he was gratifyingly surprised.
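
In the notation of the time, x, y, u, and v are variables and the dot is the list constructor. In today's standard syntax the same clause reads as follows, runnable as-is given the standard append/3:

        % X is a member of Ys if some prefix of Ys is followed by X.
        member(X, Ys) :- append(_, [X|_], Ys).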

I circulated quicksort in logic, which made quite an impression. A real algorithm supposed to be executed by a resolution theorem-prover! That should shut up the MIT guys with their vague gripings about "uniform proof procedures". In the absence of a logic programming implementation, Michie urged me to run it on the Boyer-Moore theorem prover. I objected that it would not behave like a logic program. Donald insisted. He was proud and pleased that a theorem-prover had actually sorted a list. I was disappointed: though it had done the right thing almost everywhere, it slipped in a subsumption step.
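
For the curious, here is a sketch of quicksort as a logic program in today's syntax (my rendering; the 1973 original was written in the clausal notation of the time):

        % quicksort(Xs, Sorted): Sorted is a sorted permutation of Xs.
        quicksort([], []).
        quicksort([X|Xs], Sorted) :-
            partition(X, Xs, Smaller, Larger),
            quicksort(Smaller, SortedSmaller),
            quicksort(Larger, SortedLarger),
            append(SortedSmaller, [X|SortedLarger], Sorted).

        % partition(Pivot, Xs, Smaller, Larger): split Xs around Pivot.
        partition(_, [], [], []).
        partition(Pivot, [X|Xs], [X|Smaller], Larger) :-
            X =< Pivot,
            partition(Pivot, Xs, Smaller, Larger).
        partition(Pivot, [X|Xs], Smaller, [X|Larger]) :-
            X > Pivot,
            partition(Pivot, Xs, Smaller, Larger).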

In the summer of 1973 W.P. de Roever, who had sent me the fixpoint paper when I was in Yorktown Heights, visited Edinburgh. That paper had treated a fragment of an imperative language without procedure calls. W.P. was now pursuing an extension that included procedure calls. It turned out that call-by-name resulted in a fixpoint different from the one obtained when parameters are called by value.

This seemed the perfect opportunity for my first attempt at a conversion to logic programming. Not only was logic a programming language; its computation consisted of nothing but procedure calls! Plus, we had this beautiful fixpoint characterization. So I expected him to be jumping up and down in excitement. Instead, he asked whether parameters in logic-programming procedure calls were called by name or by value. When I could not answer this question, he instantly determined that whatever we had in mind, it was not programming. To this day I have not answered the question. This was the first of a long series of negative reactions to logic programming from the most advanced thinkers in programming. A few years later I was teaching logic programming with Prolog to undergraduates. They did not jump up and down in excitement either, but they did their assignments without any fuss.

W.P. and I attended the annual NATO Summer School in Marktoberdorf in Bavaria, where parallelism had been one of the topics. In the TEE train coming back, we spent hours thinking intensely, and, in my case fruitlessly, trying to catch the breakthrough in parallelism that we felt was just around the corner. I felt it was around the corner because of the commutativity of the conjunctions in the body of logic program clauses. It was frustrating because we did not even know what question we were trying to answer, and did not even realize that.

In December 1973 I visited Holland. I had sent an abstract to E.W. Dijkstra, who agreed to let me give a talk about programming in predicate logic, under one condition: that it be given on a Tuesday afternoon. I was encouraged to see EWD almost falling off his chair from nodding in agreement. After the talk he took me in great excitement to his office, where he showed me his new programming constructs: the if - fi and do - od constructs with the fat bars between them. Both are nondeterministic. The former is purely nondeterministic choice, and it was here that he recognized a similarity with the clauses of a logic program.

What little interest in Prolog there might have been that summer in Edinburgh continued to evaporate until David Warren returned from his visit to Marseilles in February 1974. He brought two things: a program and a foot-long box of cards. The program was WARPLAN, a one-page program that Bob, I, and others studied for years afterwards. The box contained the new Prolog. This was the second implementation, done in Fortran by Battani and Meloni.

The effect was electrifying. Actually being able to run programs (an opportunity obligingly arranged by David on the new PDP-10 computer in Edinburgh) made a huge difference, though it shouldn't have from a rational point of view. Moreover, the formerly ubiquitous control annotations had gone, replaced by an occasional cut, which was much less obtrusive. I wrote and tried lots of little programs, some of which ended up in Helder Coelho's project "How to Solve It in Prolog". Though printed much later, this collection circulated in samizdat form for a long time before that.

In 1974 the summer brought many distinguished AI visitors. They appreciated the AI research being done in Edinburgh: the robotics work and the theorem provers of Boyer and Moore and of Bundy. One thing puzzled them: all this work was programmed in POP-2, whereas everywhere else AI work was done in Lisp.

Some of the visitors were not swept up in the anti-theorem-proving campaign emanating from MIT. They believed that there was a lot of valuable potential in resolution theorem-proving. But when these people heard that we actually wanted to program in logic rather than in Lisp, the atmosphere would turn chilly. A prime example was Robinson himself. We thought he would be delighted to see resolution used to turn logic into an actual programming language complete with a beautiful fixpoint theory. He was not. His first and only love in programming was Lisp. Mechanize theorem-proving, by all means. But do it in Lisp. Later, Robinson was to give valuable support to logic programming in many ways, such as the founding of the Journal of Logic Programming. He was in favour of research in logic programming. But for him, programming was something you do in Lisp.

Some of the 1974 visitors were unreservedly interested not only in logic programming, but even in Prolog. These included Keith Clark from Queen Mary College, Luis Pereira from Lisbon, and Tomasz Pietrzykowski from the University of Waterloo in Canada.

In the fall of 1974 it was my turn to visit Marseilles. Colmerauer showed me the work on his natural-language question-answering system based on non-clausal full first-order logic. Alain had no use for logic programming. He wanted to build parsers for natural language and theorem provers. He needed a better programming language for that, and now he had one. He also showed me a little side exercise: a compiler for a simple Algol-like programming language, written in Prolog.

During my visit I noticed that the interest Pietrzykowski had shown in Prolog during his visit to Edinburgh had not evaporated upon returning to his home university. On Alain's desk there was a letter from Waterloo requesting a copy of the Prolog interpreter. A few days later I noticed the letter on another desk. A few days after that the letter was perched precariously on a window sill.

After my return to Edinburgh Pietrzykowski contacted me about a possible visit to Waterloo. As the arrangements firmed up, he told me he hadn't managed to get a copy of Marseilles Prolog and asked whether I could bring one. And so it was that I arrived in Waterloo in January 1975 with two presents: a plastic bag with cuttings of the Papyrus plant and a box with a foot-long stack of punched cards. Sometime in the spring of 1975 two students of Pietrzykowski, Lewis Baxter (later to invent a fast unification algorithm) and Larry Rendell, got the Fortran of the interpreter translated to the local dialect. Thus was established the first Prolog installation in North America.

This was not the first propagation of Marseilles Prolog via Edinburgh. In December 1974 the mathematicians and logicians Hajnal Andreka and Istvan Nemeti obtained their foot-long box of cards and took it back with them to Budapest. Without much delay it got into the hands of Peter Szeredi.

When I was getting ready to leave Edinburgh for good in the summer of 1975, it turned out that I had forgotten to show David Warren Colmerauer's little compiler program. David pounced on it. Within a year he was sending me successively more extensive compilers for Prolog, written in Prolog.

When I arrived in Waterloo, I was welcomed by Lewis and Larry's directions on how to run Prolog. The Mathematics and Computing building housed two big computers. The most conspicuous one was the computing centre's big IBM 370. The Mathematics Faculty couldn't stand the, what shall we say, IBM-ish way the computing centre was run, and had their own big, non-IBM, computer. It was on this one that Prolog was installed.

In 1976 Grant Roberts, a Master's student, was looking for a project. He wanted, if not to write a compiler, then at least to implement an interesting language. Prolog seemed suitable. As I was the only faculty member who was at least a user of Prolog, Grant was steered in my direction.

Grant was famous in Waterloo because, as an undergraduate, he had been on a Waterloo team that did very well in the Putnam competition. Grant did not share the computer scientists' disdain for the IBM 370. In fact, he loved this machine. Not because he liked programming in Fortran, or in Cobol, or in PL/I. No, he programmed in assembler. At least that was all that an outsider could make out of his activities. What he actually programmed in was his own programming language, a macro processor.

There are very few people who can do this. There are even fewer who tell about it. One of the exceptions is Mark Halpern, who in his memoirs in the 1991 Annals of the History of Computing says something about his XPOP macro processor. But these few tend to perform incredible programming feats. In the case of Roberts the feat was to complete in 1977 a beautifully stable Prolog implementation, the fastest in the world.

In the summer of 1977 I attended a meeting in Saint Andrews, New Brunswick, Canada, entitled "Formal description of programming concepts". The Saint Andrews meeting was organized by Jack Dennis of MIT. One of the ways in which he proved himself to be a good organizer was to bring along two MIT graduate students, Valdis Berzins and David Harel, whose only role seemed to be to record the questions asked after the lectures and the answers given. This was of course overkill for lecturers such as myself. But it was only a fair investment if you take into account that some of the other lecturers were Andrei Ershov, Corrado Boehm, and Edsger Dijkstra.

The paper I presented ("Computation and deductive information retrieval") pointed out that relational databases can be regarded as special cases of logic programs and wouldn't goal statements be a nice query language? In this paper I used "SLD resolution". I did not like Robert Hill's name "LUSH resolution". As I was listening to Hill's talk in 1975, I played around with alternatives, ending with SLD resolution, "SL resolution for definite clauses", where "definite clause" may again have been a nonstandard term for positive Horn clause. The idea behind "definite" was that clauses in general can have more than one conclusion, so are sort of indefinite. When there is only one, the clause is definite.
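
To illustrate the terminology with a made-up example in the clausal notation of the time, where commas to the left of the arrow read as "or" and those to the right as "and":

        a1, a2 <- b1, b2        (indefinite: concludes a1 or a2)
        a1 <- b1, b2            (definite: exactly one conclusion)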

Perhaps as a result of being distracted by the word "LUSH", I had not been able to follow Robert Hill's completeness proof. In 1975, when clinging to straps in a District Line train on the way to his home, Bob had pointed out certain similarities between logic programs and context-free grammars and suggested that parse trees might be a suitable analog for proving the completeness of SLD resolution for logic programs. I liked the idea, but was not confident I could do it by myself.

In Saint Andrews I proposed to Krzysztof Apt that we use this idea. I had never met him until the 1977 meeting in Saint Andrews, though he may well have been occupying the very same desk at the Mathematical Centre in Amsterdam that had been mine in 1970 - 1971. Apt had trained as a mathematician and a logician in Poland. He was wonderfully untainted by prejudices in programming and seemed to me an ideal recruit. We didn't get started in Saint Andrews, but managed to do the whole thing by snail mail. And when I found myself in Brazil in 1978, snail mail really was snail. Apt proved himself not only a clever mathematician, but also an ideal collaborator. Mainly through my own fault it was not until April 1980 that we submitted "Contributions to the theory of logic programming". The refereeing process was simple: one report, signed "David Harel". Harel said that he had been annoyed by all this talk about "logic programming" until he read this paper. He recommended publication. He could have, and we should have, read Clark's thesis, which had a more general completeness result.

Thus I assisted at two of the more important conversions to logic programming: of Keith Clark in 1974 and of Krzysztof Apt in 1977. While on the subject of conversions, it must have been around this time that Alan Robinson induced a mathematics professor in Syracuse by the name of Ken Bowen to join the computer science department and logic programming research. Research in logic programming was bolstered further as a result of visits by Kowalski, Clark, and Steve Gregory.

On one of Kowalski's visits to Syracuse, in the spring of 1980, a workshop was organized on Lake Cazenovia, south of Syracuse. This seems to have launched logic programming in the US beyond Syracuse.

During the Syracuse visit Bob came to Waterloo. I showed him how I used Prolog to keep track of the grades of the courses I was teaching. The point is not that it was something that could be done in Prolog, but that, whether sitting in my bedroom or in my office, this was the most convenient way I knew of getting the chore done. This may have made an impression on Bob: Prolog not as a flawed attempt at logic programming, but as a programming language. I asked him: "If you want a better logic programming language, what better language to implement it in than Prolog?" It was the blessing of Waterloo Prolog on a fast IBM 370 that allowed one to think this way.

Kowalski's procedural interpretation of resolution inference should have been a sufficient answer to Seymour Papert's harangues about resolution being doomed to ineffectiveness on the ground of being a "uniform proof procedure". Apparently Papert's voice was still echoing in my mind when I wanted to underline the point by extending this to a "flowchart interpretation" of logic, the idea being that flowcharts are lower-level programs than those with procedures.

Floyd's method for verification of flowcharts is a good starting point. His assertions become predicates and the verification conditions become (tail-recursive) clauses. Hence papers such as "Verification conditions as programs" in 1976 and "Consequence Verification of Flowcharts" with Keith Clark in 1981. I also applied the idea to programming in a conventional language ("Programming with Verification Conditions"). Before writing any code, you decide what assertions are going to be needed. Then you connect them by verification conditions. In this way the code is added. Just as with clauses in a logic program, a verification condition only gets added if it is true. In this way I wrote my only Fortran program, a non-recursive quicksort. It did not appear to need any debugging the first time I ran it.
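
A minimal sketch of the idea, with invented names rather than an example from those papers: take a flowchart loop that sums the integers 1 through N, let the Floyd assertion at the loop head ("S is the sum of 1..I") become the predicate loop, and write each verification condition as a clause. The result is a runnable, tail-recursive logic program:

        % Entry: I = 0 and S = 0 establish the loop invariant.
        sum(N, Sum) :- loop(0, 0, N, Sum).

        % Exit: the invariant together with I >= N yields the result.
        loop(I, S, N, S) :- I >= N.

        % Body: one pass through the loop preserves the invariant.
        loop(I, S, N, Sum) :-
            I < N,
            I1 is I + 1,
            S1 is S + I1,
            loop(I1, S1, N, Sum).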

The 1977 Saint Andrews meeting had introduced me to the idea of dataflow. This was to provide me with the handle on parallel execution of Prolog procedure calls that had eluded me in 1973 on the train back from Marktoberdorf. In December 1979 I was part of the invigilator squad with a thousand undergraduates writing various exams in the hangar-like gym of the University of Waterloo.

On such occasions the trick was to find an alternative to unmitigated boredom for three hours. Reading does not work: one has to look attentive. It turns out that writing is better, the more challenging, the better. By the end of the exam I had a logic program with goals executing in parallel and sharing terms acting like dataflow channels. This became the paper with Gentil de Lucena.
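
A minimal sketch of the flavour of such a program (my reconstruction, not the program from that paper): the two goals in run share the variable Xs. Under a parallel reading the producer instantiates Xs incrementally while the consumer follows along, so the shared term acts as a dataflow channel; in a sequential Prolog the goals simply run one after the other.

        % ints(I, N, Xs): Xs is the list of integers I..N (producer).
        ints(N, N, [N]).
        ints(I, N, [I|Xs]) :- I < N, I1 is I + 1, ints(I1, N, Xs).

        % squares(Xs, Ys): Ys are the squares of Xs (consumer).
        squares([], []).
        squares([X|Xs], [Y|Ys]) :- Y is X * X, squares(Xs, Ys).

        % run(N, Ys): Xs is the channel shared by the two goals.
        run(N, Ys) :- ints(1, N, Xs), squares(Xs, Ys).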


Afterword

Though I won't even try to list the ways in which logic programming has become important, I will pick some that come to mind right now.

Some workers, such as McCarthy and Hayes, consider AI as a kind of experimental philosophy. In the philosophy of science, it took almost a hundred years to put the final nails in the coffin of J.S. Mill's doctrine of induction. About halfway through, when Peirce tried to make sense of induction, he introduced abduction. AI has clarified the issues in two stages. The MIT school of default reasoning was the model-building stage. What logic programming provided was not only a logical analysis, but also an implementation.

"The Hundred Year Language", a web-published essay by Paul Graham, points out that, contrary to what is the case with computers, programming languages evolve slowly. This is because computers are technology, which changes fast. Programming languages represent ways of thinking, which change slowly, if at all. Languages that are not vehicles for thought, Graham says, are doomed anyway, though they may be widely used today.

According to Graham's criteria, the way of thinking represented by that mysterious combination of Prolog and logic programming is a good candidate for being around in a hundred years. Graham does not use it as an example, but it satisfies his criteria nonetheless.

A hint of an early step along the path such an evolution might take is given by what can be called "relational technologies". Current examples of these technologies are relational databases, constraint programming, spreadsheets, and logic programming. The time may be ripe for these to be unified into something one could call relational processing. Many applications are rich in data. At present they are tackled by relational databases cobbled together with SQL and various scripting languages. They cry out for something like relational processing.