Prolog Programming on the Web

Appeared in Volume 10/1, February 1997

Keywords: web programming.

das@nicklas.franken.de
David Sedlock
6th November 1996

At the moment, there is a lot of interest in making Prolog applications available over the Web. One obstacle is the relatively bulky executable that most Prolog compilers deliver. If you use the standard CGI architecture, you pay a big price at runtime to load up such an executable every time the application is called. As it stands, the response time of a Prolog application over the net is likely to reinforce the general opinion that Prolog is too slow for serious use.

In principle, I suppose it is possible to develop your own Web server with the Prolog application inside it. This would avoid the repeated load-up costs. However, writing a server is non-trivial, and is best left to experts. So we seem to be back at the CGI architecture.

Some servers offer a different solution. I am thinking primarily of Apache, but I believe the Netscape server provides something similar. Apache has a modular structure and allows you to link your own program into the executable. The program could replace the standard URL handler, so that requests for certain URLs invoke the Prolog program.

All of this is at the 'in principle' level. The devil is in the detail. I am very interested to know if anyone has done this.

tom@rahul.net
Tom Howland
7th November 1996

I think the clever way of doing this is to have a Prolog server that is interacted with via Java applets or CGI scripts. You don't need to link the Prolog application into the Web server.

ferguson@tc.pw.com
Don Ferguson
7th November 1996

I have done this. The Prolog process is a server that listens for requests on a fixed port (using Tom Howland's fine TCP Library, distributed with Quintus Prolog). The cgi-bin program is a small C executable that connects to the port, stuffs a goal down the wire, and writes the resulting HTML to stdout.
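
As an illustration of this arrangement, here is a minimal sketch of such a stub, written in Python rather than C for brevity. It is not the program described above: the host, port, the one-goal-per-line framing, and the convention that the server closes the connection when the reply is complete are all assumptions made for the example.

```python
import socket

# Assumed address of the long-running Prolog server (not from the original post).
PROLOG_HOST = "127.0.0.1"
PROLOG_PORT = 4000


def relay_goal(goal, host=PROLOG_HOST, port=PROLOG_PORT):
    """Send one goal to the server and return everything it writes back."""
    with socket.create_connection((host, port), timeout=10) as conn:
        # Assumed framing: the goal travels as one newline-terminated line.
        conn.sendall(goal.encode("utf-8") + b"\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def cgi_response(goal, host=PROLOG_HOST, port=PROLOG_PORT):
    """Build the full CGI output: header block, blank line, then the HTML."""
    return "Content-Type: text/html\r\n\r\n" + relay_goal(goal, host, port)
```

A stub built this way would end by writing `cgi_response(...)` to stdout for whatever goal it derives from the request. The stub itself stays tiny, so the cost of starting it on every request is small compared with loading a full Prolog executable.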

I have also tried making Java connect directly to the Prolog socket. This works, but some firewalls block non-HTTP traffic, so I switched to having the applet communicate with Prolog via the cgi-bin program (in Java, you can open and read from an arbitrary URL).

If anyone is curious, check out http://edgarscan.tc.pw.com, which provides access to the financial data of public US corporations with data (extracted by Prolog) from the SEC's EDGAR database. You'll be asked to register, but it doesn't cost anything.

tarau@clement.info.umoncton.ca
Paul Tarau
7th November 1996

Server-side Prolog code is important as a way of storing (possibly persistent) 'state', but the performance issue is tricky. One solution is to use Prologs which compile to C (e.g. BinProlog, WAMCC) to generate CGI scripts 'in C'. These compete quite well with interpreted Perl in terms of speed. Prologs which generate small stand-alone executables are also usable for writing CGI scripts. Mercury generates even faster and often smaller executables through compilation to C.

Also, if the CGI scripts use shared dynamically linked libraries then their invocation overheads can be reduced drastically.

Linking the Prolog system into the server is not the only way to go.

Think also about the usual maintenance nightmare: what if the server gets updated? What if your Prolog gets updated? Will they still work together?

Moreover, a Prolog built into the server will still have to parse and interpret/compile your source script. It might turn out that the overall cost of this is larger than a small CGI script generated by a Prolog which compiles to C.

If your Prolog compiles to Java (see jProlog) or is written in Java (e.g. Minerva) you can use both client and server side Java as a way of integrating your Prolog code into an Internet application. jProlog and Minerva are both in a prototype stage at the moment. jProlog (which is free) might be a good starting point for writing Internet Prolog code in a 'radical' fashion. It also allows you to attach Prolog actions to VRML 2.0 nodes.

Today, the quickest way to include a Prolog program as part of an Internet application is to use a CGI-friendly Prolog system and a Prolog CGI script. This is close to the way people working with Perl usually proceed.

With BinProlog 5.25 this is quite simple: you put the binary in your cgi-bin and add a line in your HTML file calling it with a *.pro script on the command line. Working with a source-level script makes debugging easier and shortens the development cycle. For fast 'production' versions, you simply compile the Prolog script to C and put the resulting executable in the cgi-bin directory.

Look at my home page at http://clement.info.umoncton.ca/~tarau for a trivial example (a counter). A more complex application, developed with CGI scripts and a multi-threaded Linda server (LogiMOO) can be retrieved from the same place.

oori@gilo.jlm.k12.il
Oori Hasson
8th November 1996

My own Prolog CGI script works quite fast. Also, if it does grow larger, it is easy to convert it to a server taking requests on a socket. Quite a few Prologs have predicates for manipulating TCP sockets.

My Prolog CGI script can be called via:
http://www.gilo.jlm.k12.il/~oori/har.html

It implements a semantic network in front of a search engine which answers Internet queries. Although the main Web page is in Hebrew, the technical information is in English.

mary@amzi.com
Mary Kroening
8th November 1996

We have an Amzi! Prolog shell called WebLS for giving advice and solving problems. It runs as a CGI program under BSDI Unix, Solaris (internally), and Windows. The executable for BSDI is 219kb for the Prolog engine and CGI interface. The executable Prolog code is another 39kb. The Windows version uses a DLL, which remains loaded across invocations by the Web server.

David Sedlock writes:
In principle, I suppose it is possible to develop your own Web server...

There are a couple of other approaches. Some of our Windows customers have used DDE to create a very small CGI invocation that issues requests to a Prolog engine that is always running. Also Netscape's NSAPI and Microsoft's ISAPI allow dynamic libraries to remain loaded. And Netscape's plugins and Microsoft's ActiveX make it possible to have a client-side Prolog engine and application. ActiveX also offers the promise of being able to use Prolog on Web pages as a scripting language. With Netscape, it is possible to have a Prolog engine as a Java class with native methods, making it accessible from JavaScript and Java Applets.

David Sedlock continues:
Apache has a modular structure and allows you to link your own program into the executable

A lot of people don't have this level of access to their servers because they use Internet Service Providers.

He continues:
I suppose a thousand things can go wrong...

WebLS is able to catch any runtime errors in Prolog and report them back through an error screen. That same mechanism is used to return 'expected' errors, like errors in the syntax of the WebLS rules. A catch/throw mechanism is very handy for this.
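
The general pattern can be sketched as follows. This is an illustration in Python of the catch/throw idea only, not WebLS code: the exception class, the function names, and the page layout are all hypothetical.

```python
import html


class RuleSyntaxError(Exception):
    """Hypothetical 'expected' error, e.g. a malformed rule in a knowledge base."""


def error_page(title, detail):
    # Escape the message so error text cannot inject markup into the page.
    return "<html><body><h1>%s</h1><pre>%s</pre></body></html>" % (
        html.escape(title), html.escape(detail))


def run_safely(handler, request):
    """Run handler(request); turn any exception into an HTML error screen."""
    try:
        return handler(request)
    except RuleSyntaxError as err:   # anticipated, user-correctable problem
        return error_page("Rule error", str(err))
    except Exception as err:         # anything else: unexpected runtime error
        return error_page("Runtime error",
                          "%s: %s" % (type(err).__name__, err))
```

Because both kinds of error leave through the same exit, the CGI program always returns a well-formed page, whatever happens inside the application.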

In general, connecting Amzi! to CGI was rather straightforward. However, the Netscape mechanisms for making plugins are much more complicated. The Java classes have to be properly installed, the Prolog engine DLL has to be in the right place, the plugin has to be installed properly, etc. Also the JDK from Sun provides a different interface to the Java runtime than the JRI from Netscape. But that seems normal for a technology that is under rapid development.

If you'd like to learn more, see our web site:
http://www.amzi.com

lee@cs.mu.oz.au
Lee Naish
11th November 1996

Isn't the network more important than how fast a Prolog process starts? I just pointed my browser at Paul Tarau's page to check out the response time. After several tries all I got was "A Network Error has Occurred".

If and when I manage to connect, I suspect the network bandwidth and latency will be more important than the time taken to start up a reasonable Prolog system on a modern machine. I'm not knocking the noble goal of efficient system software, but unless the network is very fast, and the machine/Prolog very slow, it's not a big deal.

pereira@research.att.com
Fernando Pereira
11th November 1996

You're looking at it from a single client's perspective, which is fine for measuring response latencies for single connections to a lightly loaded server. But if you look at it from the server's perspective, those relatively costly process start-ups soon add up to real load if the server is popular. And an overloaded server is one that eventually slows down to a crawl, times out, or runs out of resources, creating the kinds of problems you saw when accessing Paul's pages.

Don Ferguson and others have pointed out that the right way to serve the Web with complex software applications, both for performance and maintainability reasons, is to create a lightweight CGI stub that connects to a persistent version of the application. That's often how databases are used for Web service. For instance, Oracle's Web server tools do not run the database system directly, but instead connect to the standard Oracle server.

das@nicklas.franken.de
David Sedlock
12th November 1996

Webs are often internal (Intranets), where bandwidth and latency are not so crucial to response time. When Intranet users find that a slow CGI program is implemented in Prolog, they tend to think it's the fault of Prolog. That's why I'd like to eliminate the overhead of loading the executable at every request. On our system, that adds 1-3 seconds to each request.
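
A rough bound shows why a start-up cost of that size matters. The Python sketch below assumes, purely for illustration, that loading the executable occupies a processor exclusively for the whole start-up time; the function name and the `processors` parameter are hypothetical.

```python
def max_requests_per_minute(startup_seconds, processors=1):
    """Upper bound on throughput if start-up alone keeps a processor busy."""
    return processors * 60.0 / startup_seconds
```

Under that assumption, a 1-second start-up caps a single processor at 60 requests per minute and a 3-second start-up caps it at 20, before the application has done any useful work.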

Other people have suggested using a small CGI program to communicate with a permanently running Prolog engine. But think of having to service tens, hundreds, or thousands of requests per minute. This one engine is going to become a bottleneck. Web servers solve this problem by pre-forking request handlers. When a new request arrives, it is shunted to an idle request handler. Sure, you can implement all of that yourself, but it isn't fun, and is better left to people that are good at it. And why do it all again?
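
The dispatching idea can be sketched in a few lines of Python. This is a simplification under stated assumptions: threads stand in for the separately forked processes a real Web server would use, and `handle` is a hypothetical stand-in for a call into a Prolog engine. The point is only that every handler is started before any request arrives, and each request goes to whichever handler is idle.

```python
import queue
import threading


class WorkerPool:
    """Fixed pool of handlers started up-front; requests go to an idle one."""

    def __init__(self, size, handle):
        self._requests = queue.Queue()
        self._handle = handle
        self._workers = [threading.Thread(target=self._run, daemon=True)
                         for _ in range(size)]
        for worker in self._workers:  # start every handler before any request
            worker.start()

    def _run(self):
        while True:
            item = self._requests.get()  # blocks while this worker is idle
            if item is None:             # sentinel: shut this worker down
                break
            request, reply = item
            try:
                reply.put(self._handle(request))
            except Exception as err:
                reply.put(err)  # hand the failure back, keep the worker alive

    def submit(self, request):
        """Queue a request; returns a one-slot queue that receives the result."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((request, reply))
        return reply

    def close(self):
        for _ in self._workers:
            self._requests.put(None)
        for worker in self._workers:
            worker.join()
```

A caller would create the pool once, call `submit` for each incoming request, and read the result from the returned queue. Even in this reduced form there is bookkeeping for shutdown and failures, which gives some idea of why reusing the Web server's own machinery is attractive.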

euajanne@eua.ericsson.se
Jan Andersson
16th November 1996

David Sedlock writes:
This one engine is going to be a bottleneck...

Have you considered the Linda library that comes with SICStus? I believe your approach can be implemented quite easily with the help of Linda.
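
For readers unfamiliar with Linda, the following Python sketch shows the underlying idea: a shared tuple space to which processes post tuples and from which they remove matching ones. It is a toy illustration and not the SICStus library's interface; the class, the `in_` method name, and the use of None as a wildcard are assumptions made for the example.

```python
import threading


class TupleSpace:
    """Toy Linda-style blackboard: out() posts a tuple, in_() removes a match."""

    def __init__(self):
        self._tuples = []
        self._changed = threading.Condition()

    def out(self, tup):
        with self._changed:
            self._tuples.append(tuple(tup))
            self._changed.notify_all()  # wake anyone blocked in in_()

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern is a wildcard, standing in for an unbound variable.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup))

    def in_(self, pattern, timeout=None):
        """Block until a tuple matches, then remove and return it."""
        with self._changed:
            while True:
                for tup in self._tuples:
                    if self._matches(pattern, tup):
                        self._tuples.remove(tup)
                        return tup
                # Nothing matched yet: sleep until the next out() or the timeout.
                if not self._changed.wait(timeout):
                    raise TimeoutError("no matching tuple")
```

Several engines can each loop on `in_(("request", None, None))` and post a `("reply", id, result)` tuple when done. Whichever engine is free picks up the next request, so the tuple space does the job of shunting work to idle handlers.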

carp@research.bell-labs.com
Bob Carpenter
17th September 1996

If anyone is interested in code for CGI handling in Prolog (URL parsing, etc.), you can find it at:
http://macduff.andrew.cmu.edu/cgparser

This page has links to a theorem prover implemented in Prolog (frame based and non-frame based versions), as well as to Prolog code for handling CGI input/output.
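
To show what CGI input handling involves, here is a minimal Python sketch of decoding the parameters of a GET request. It is unrelated to the Prolog code at the address above; the function name is hypothetical, and only the standard CGI environment variable QUERY_STRING is relied on.

```python
import os
from urllib.parse import parse_qs


def cgi_params(environ=os.environ):
    """Return the GET parameters of a CGI request as {name: [values]}."""
    # The Web server passes the part of the URL after '?' in QUERY_STRING.
    query = environ.get("QUERY_STRING", "")
    # parse_qs decodes %XX escapes and '+' (space); repeated names accumulate.
    return parse_qs(query, keep_blank_values=True)
```

A Prolog version has to do the same work: split the string on '&' and '=', undo the URL encoding, and hand the result to the application as terms.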

fruehwir@informatik.uni-muenchen.de
Thom Fruehwirth
20th November 1996

The proceedings of the 1st Workshop on Logic Programming Tools for Internet Applications at JICSLP'96 are archived at the following sites:

http://clement.info.umoncton.ca/~lpnet/lp-internet/archive.html
http://www.elis.rug.ac.be/ELISgroups/paris/lp-internet/archive.html
http://www.clip.dia.fi.upm.es/miscdocs/lp-internet/archive.html
http://www.cs.mu.oz.au/~ad/lp-internet/archive.html

Also, on December 19th - 20th there is going to be an LP and Internet Workshop at Imperial College. Details can be found at:
http://www-lp.doc.ic.ac.uk/lp-internet.html

We have developed the Munich Rent Advisor, a forms-based expert system written entirely in the constraint LP language ECLiPSe. It uses a one-page Web server specialising in forms. Details can be found at:
http://www.pst.informatik.uni-muenchen.de/personen/fruehwir/cwg.html

ad@ratree.psu.ac.th
Andrew Davison
29th November 1996

There seem to be four ways of integrating the Web and LP/Prolog on the server side:

1. For each query, invoke an LP/Prolog script via CGI.
Advantage: it's easy, with plenty of support libraries. A possible problem may be the load caused by invoking all those scripts. But this must be balanced against the network/machine speeds, the expected usage, and the speed/size of the LP/Prolog executable. There are fast versions of CGI, and very fast compilers/generators of small executables for Prolog. The size of the executable may be misleading in any case, if the libraries stay in memory once loaded.

2. Invoke an interface script for each query which communicates with a single long-lasting LP/Prolog process.
Advantage: only one LP/Prolog executable using resources. The drawback of this approach is that the process must be able to deal with multiple queries at once, therefore suggesting some kind of parallel LP system. One way of viewing option (2) is as a partial migration of the server's functionality into the LP/Prolog component, which reaches its conclusion in option (3)...

3. Build your own LP/Prolog Web server.
Advantage: complete control over the client-server interaction, which means things can be tuned for the particular application. Drawback: lots of work, but there are the ECLiPSe server library and TCP libraries for Prolog.

4. Incorporate the LP/Prolog system into the server.
I've only heard of this being possible with Apache (?). Does anyone have some pointers to technical details? Obvious drawbacks: it requires system admin status, and it isn't portable across servers.

Let's look at client-side solutions.

Java has been mentioned, and Prolog can be integrated with it in quite a few ways:

You can compile Prolog to Java byte-codes (e.g. MINERVA, jProlog).
You can have a Prolog interpreter written in Java (WProlog).
You can call Prolog as a Java class (via native methods) (Amzi!).
You can talk to Prolog via sockets (Don Ferguson gave up on this one (?)).

Finally, to our solution (Seng Wai Loke's and mine) called LogicWeb.

Abstract of our forthcoming paper:
---
LogicWeb is a client-side logic programming tool for the Web, which is particularly suitable for coding important classes of Web applications. We have identified three domains so far: structured information processing, search, and parsing. The key reason for LogicWeb's utility is that it offers an abstract view of the Web, which replaces Web pages and hypertext links by logic programming modules and relationships between modules.

LogicWeb illustrates that logic programming possesses many advantages for writing Web applications, including the simple representation of information (e.g. as deductive databases or as logic grammars), the ability to write meta-level descriptions (e.g. of pages and the connections between pages), and the encoding of rules and heuristics necessary for 'intelligent' behaviour.

This paper introduces LogicWeb, and considers two application areas in some detail: Web search and the representation of Web information as databases. There is also a discussion of LogicWeb's implementation, and of related work.
---

There are a number of papers on LogicWeb, which can be accessed from:
http://www.cs.mu.oz.au/~swloke/logicweb.html

Anyone who would like a copy of the long paper can also contact me.

inaf@inaf.com
Nathan Finstein
30th November 1996

Andrew Davison wrote:
2. Invoke an interface script for each query which communicates with a single long-lasting LP/Prolog process.
...
The drawback of this approach is that the process must be able to deal with multiple queries at once, therefore suggesting some kind of parallel LP system...

If the interface "script" and Prolog process coordinate their communications so that the Prolog process finishes one transaction before receiving input for the next transaction, then Prolog handles transactions one at a time. Parallel would still be nice for some purposes.
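
One simple way to get that coordination is to let the operating system queue the waiting connections. The Python sketch below illustrates the idea under stated assumptions: one goal per connection sent as a single line, a reply that ends when the connection is closed, and a hypothetical `handle` function standing in for the Prolog engine.

```python
import socket


def read_line(conn):
    """Read bytes from the connection up to and including the first newline."""
    buf = b""
    while not buf.endswith(b"\n"):
        data = conn.recv(1024)
        if not data:  # peer closed early; use whatever arrived
            break
        buf += data
    return buf.decode("utf-8").rstrip("\n")


def serve_one_at_a_time(listener, handle, max_requests=None):
    """Accept connections in turn; finish each transaction before the next."""
    served = 0
    while max_requests is None or served < max_requests:
        # Clients that connect meanwhile wait in the listen backlog.
        conn, _addr = listener.accept()
        with conn:
            goal = read_line(conn)  # assumed framing: one goal per line
            conn.sendall(handle(goal).encode("utf-8"))
        served += 1  # connection closed: the transaction is complete
```

The engine never sees two goals at once, at the price that a slow query delays everyone queued behind it.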