How to Benchmark Prolog?

Appeared in Volume 8/1, February 1995

Keywords: benchmarks.

sby@cs.usask.ca
S.Bharadwaj Yadavalli
10th September 1994

Are there standard benchmarking programs somewhere on an FTP site that can be used to time the performance of a compiler? I did get hold of the benchmark programs used in Peter Van Roy's work on Aquarius (the published ones) via Beta Prolog sources.

In this context, how can I measure and compare the performance of the compiler on a host of "small" programs like nreverse/2 (other than by running it on a huge list of 200 elements) with something more accurate than the "nicely rounded off" user times in milliseconds? Further, these programs run so quickly (10 - 20 ms) in interpreter mode (on my machines at least) that I cannot see or measure any difference in execution speed after compiling them. How did the folks who did these manage? Can someone help me by clarifying this?

pereira@alta.research.att.com
Fernando Pereira
10th September 1994

The basic technique for timing short-running programs is to time a backtrack loop calling the benchmark N times, for reasonable N, and subtract from that total time the time for the same loop running N times but calling a no-op. This is done in the following code fragment from a benchmark suite that many people (myself, Richard O'Keefe, Paul Wilk, David H. D. Warren, various people at ICOT, probably others) contributed to:

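bench_mark(Name) :-
  % Look up the iteration count, the goal to time, and a no-op control goal.
  bench_mark(Name, Iterations, Action, Control),
  get_cpu_time(T0),
  % Failure-driven loop: run Action Iterations times, then read the clock.
  (  repeat(Iterations), call(Action), fail
  ;  get_cpu_time(T1)
  ),
  % Same loop with the no-op, to measure the loop overhead alone.
  (  repeat(Iterations), call(Control), fail
  ;  get_cpu_time(T2)
  ),
  write(Name), write(' took '),
  report(Iterations, T0, T1, T2).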
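
The fragment above relies on several predicates that are not shown. What follows is only a sketch of how they might be filled in. It is not the code from the original suite. It assumes a Prolog that provides between/3 and statistics(runtime, ...) reporting milliseconds (SWI-Prolog, for example). The names net_time/4, nrev/2, app/3, dummy/2 and the nrev30 table entry are all hypothetical, chosen for illustration.

```prolog
% Hypothetical helpers for the bench_mark/1 fragment; these are
% assumptions, not the definitions used in the original suite.

% get_cpu_time(-T): CPU time used so far, in milliseconds.
get_cpu_time(T) :-
    statistics(runtime, [T, _]).

% repeat(+N): succeeds exactly N times on backtracking.
repeat(N) :-
    between(1, N, _).

% net_time(+T0, +T1, +T2, -Net): benchmark loop time (T1-T0)
% minus control loop time (T2-T1).
net_time(T0, T1, T2, Net) :-
    Net is (T1 - T0) - (T2 - T1).

% report(+N, +T0, +T1, +T2): print the net time per iteration.
report(N, T0, T1, T2) :-
    net_time(T0, T1, T2, Net),
    PerIteration is Net / N,
    write(PerIteration), write(' ms per iteration'), nl.

% One illustrative table entry: naive reverse of a 30-element list,
% with dummy/2 as the no-op control taking the same arguments.
bench_mark(nrev30, 10000, nrev(L, _), dummy(L, _)) :-
    findall(X, between(1, 30, X), L).

nrev([], []).
nrev([H|T], R) :- nrev(T, RT), app(RT, [H], R).

app([], L, L).
app([H|T], L, [H|R]) :- app(T, L, R).

dummy(_, _).
```

Loaded together with the fragment, the query bench_mark(nrev30) would print the net time per call. Because the clock typically has millisecond resolution, the iteration count should be large enough that both loops run for much longer than one clock tick.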

mcovingt@ai.uga.edu
Michael Covington
10th September 1994

You will probably get a comprehensive reply from Fernando Pereira, who wrote a suite of benchmarks for 'Computer Language' magazine a while back. They are available by FTP from somewhere.

A further caution: Some benchmarks, such as "naive reverse," leave some of the most important properties of Prolog (such as backtracking) unused.

conway@mundil.cs.mu.oz.au
Thomas Charles Conway
15th September 1994

Unfortunately, it doesn't necessarily mean much to look at the benchmarks of language implementations carried out on any machine but your own. Here at Melbourne, we have been benchmarking our new compiler, and we get some very weird anomalies between machines. These strange behaviours (such as things getting slower when we use machine registers for storage instead of global variables) are probably due to caching effects, and possibly in some cases to pipelining and super-scalarity effects. The only way to see which language implementation is going to run your program the fastest is to port it to all the contenders and try them.