Session 4: Learning

Exercise: Bayesian learning

In this example, we want to learn the probability that a given coin is fair. We can do this with conditional inference: we condition on an observed sequence of flips and query the posterior probability of each hypothesis.

0.999::fair_coin.
trick_coin :- \+fair_coin.

0.5::coin_flip(h,N); 0.5::coin_flip(t,N) :- fair_coin.
0.95::coin_flip(h,N); 0.05::coin_flip(t,N) :- \+fair_coin.

observe([],0).
observe([Face|T], N) :- coin_flip(Face,N), N1 is N-1, observe(T, N1).

query(fair_coin).
query(trick_coin).

evidence(observe(S, N)) :- observed_sequence(S), length(S,N).

observed_sequence([h,h,h,h,h,h,h,h,h,h,h,h,h]).

Play around with the prior at the top and the length of the observed sequence.
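To see what conditional inference computes here, the posterior can also be worked out by hand with Bayes' rule. The following is a minimal Python sketch, not part of ProbLog; the function name posterior_fair and its parameters are made up for illustration, and it assumes that every observed flip is heads, as in the sequence above.

```python
def posterior_fair(prior_fair, n_heads, p_heads_fair=0.5, p_heads_trick=0.95):
    """Return P(fair_coin | n_heads heads in a row) using Bayes' rule."""
    # Joint probability of each hypothesis together with the observed flips.
    joint_fair = prior_fair * p_heads_fair ** n_heads
    joint_trick = (1.0 - prior_fair) * p_heads_trick ** n_heads
    # Normalise over the two mutually exclusive hypotheses.
    return joint_fair / (joint_fair + joint_trick)


# Vary the sequence length for the prior used in the program above.
for n in (0, 5, 13, 20):
    print(n, round(posterior_fair(0.999, n), 4))
```

With the prior 0.999 and thirteen heads, the fair-coin hypothesis is already less likely than the trick-coin hypothesis, even though the prior strongly favours a fair coin.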

Exercise: Parameter learning using EM

Below we learn the parameters of the Alarm program.

When learning, the probability annotation in a probabilistic fact can be one of three possible forms.

  • Of the form t(_), for instance t(_)::alarm. This indicates that the probability of this fact is to be learned from data. ProbLog2 uses an EM algorithm to perform parameter learning, and in the first iteration of EM each such probability is initialized to a random value. Note that t is an abbreviation of ‘tunable’.
  • Of the form t(p), with p a probability, for instance t(0.5)::burglary. This again indicates that the probability of this fact is to be learned from data, but instead of being initialized randomly, it is set to the value p in the first iteration of EM.
  • Of the form p, with p a probability, for instance 0.2::earthquake. This indicates that the probability of this fact is fixed (not learned).

Apart from the input field for the program, the learning interface has a second input field that is used for supplying the data. The data is a set of partial interpretations separated by lines of dashes (“-----”, see below). Each partial interpretation is given in the “evidence” format that is also used for inference. Below we use a dataset consisting of three partial interpretations. Not every atom needs to be present in an interpretation, as the example shows: the third interpretation, for instance, only mentions burglary. This is thus an instance of learning from incomplete data.

%%% The program:
t(0.5)::burglary.
0.2::earthquake.
t(_)::alarm :- burglary, earthquake.
t(_)::alarm :- burglary, \+earthquake.
t(_)::alarm :- \+burglary, earthquake.

%%% The data:
evidence(burglary,false).
evidence(alarm,false).
-----
evidence(earthquake,false).
evidence(alarm,true).
evidence(burglary,true).
-----
evidence(burglary,false).
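To make the structure of the data field concrete, here is a small Python sketch that splits such a text into partial interpretations. It is not ProbLog's own parser: the helper parse_examples is hypothetical, it only handles simple evidence(atom,true/false) statements, and it assumes that a separator is a line consisting of three or more dashes.

```python
import re

# Matches statements such as "evidence(alarm,true)." (simple atoms only).
EVIDENCE_RE = re.compile(r"evidence\(\s*(.+?)\s*,\s*(true|false)\s*\)\s*\.")


def parse_examples(text):
    """Split a data field into a list of {atom: bool} partial interpretations."""
    examples = []
    # Assumption: a separator is a line made of three or more dashes.
    for chunk in re.split(r"^\s*-{3,}\s*$", text, flags=re.MULTILINE):
        example = {atom: value == "true"
                   for atom, value in EVIDENCE_RE.findall(chunk)}
        if example:
            examples.append(example)
    return examples


DATA = """
evidence(burglary,false).
evidence(alarm,false).
-----
evidence(earthquake,false).
evidence(alarm,true).
evidence(burglary,true).
-----
evidence(burglary,false).
"""

for interpretation in parse_examples(DATA):
    print(interpretation)
```

Atoms that are missing from a dictionary are simply unobserved in that interpretation, which is what makes the data incomplete.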

When pressing ‘Evaluate’, ProbLog2 starts learning and when finished returns the learned probabilities. Note that these probabilities might vary from run to run because they depend on the random initialization used in the first EM iteration.
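The idea behind EM on this example can be sketched in plain Python by enumerating all possible worlds. This is a simplified illustration under stated assumptions, not ProbLog2's actual implementation: the names a_be, a_b and a_e are hypothetical auxiliary facts standing for the three t(_)::alarm rules, and each parameter is updated to the average posterior probability of its fact over the examples.

```python
import itertools
import random

# Hypothetical auxiliary facts a_be, a_b, a_e stand for the three alarm rules.
FACTS = ["burglary", "earthquake", "a_be", "a_b", "a_e"]
LEARNABLE = ["burglary", "a_be", "a_b", "a_e"]  # earthquake stays fixed at 0.2

EXAMPLES = [
    {"burglary": False, "alarm": False},
    {"earthquake": False, "alarm": True, "burglary": True},
    {"burglary": False},
]


def alarm(world):
    """Alarm holds if some rule body is true and its auxiliary fact is true."""
    b, e = world["burglary"], world["earthquake"]
    return ((b and e and world["a_be"])
            or (b and not e and world["a_b"])
            or (not b and e and world["a_e"]))


def world_prob(world, params):
    """Probability of one truth assignment to the independent facts."""
    p = 1.0
    for fact in FACTS:
        p *= params[fact] if world[fact] else 1.0 - params[fact]
    return p


def consistent(world, example):
    """Check that a world agrees with every observed atom of an example."""
    for atom, value in example.items():
        actual = alarm(world) if atom == "alarm" else world[atom]
        if actual != value:
            return False
    return True


def initial_params(rng):
    """Initial values following the three annotation forms of the program."""
    return {
        "burglary": 0.5,       # t(0.5): learned, starts at 0.5
        "earthquake": 0.2,     # 0.2: fixed
        "a_be": rng.random(),  # t(_): learned, random start
        "a_b": rng.random(),
        "a_e": rng.random(),
    }


def em(examples, init, learnable, iterations=200):
    params = dict(init)
    worlds = [dict(zip(FACTS, bits))
              for bits in itertools.product([False, True], repeat=len(FACTS))]
    for _ in range(iterations):
        expected = {fact: 0.0 for fact in learnable}
        for example in examples:
            # E-step: posterior over the worlds consistent with the evidence.
            weighted = [(w, world_prob(w, params))
                        for w in worlds if consistent(w, example)]
            total = sum(p for _, p in weighted)
            for fact in learnable:
                expected[fact] += sum(p for w, p in weighted if w[fact]) / total
        # M-step: average posterior probability of each learnable fact.
        for fact in learnable:
            params[fact] = expected[fact] / len(examples)
    return params


for seed in (0, 1):
    learned = em(EXAMPLES, initial_params(random.Random(seed)), LEARNABLE)
    print(seed, {fact: round(p, 3) for fact, p in learned.items()})
```

In this sketch, none of the three examples carries information about the rule with body burglary, earthquake, so its parameter stays at its random starting value. This is one way in which results can differ between runs.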

You can play around with the data, and see the effect of adding more data.

If you want to perform learning using a locally installed version of ProbLog you can use:

$ problog lfi <model_file> <data_file> -O <output_file>