Probabilistic-Programming Datalog (PPDL) is an extension of Datalog with probabilities proposed in Barani et al., Declarative Statistical Modeling with Datalog, arxiv 2015. We’ll model some of their examples.
Introduction: Clients and Branches
Here’s a ProbLog variant of their motivating example about clients visiting branches of a service provider, replacing the Poisson with a finite distribution to keep support finite.
% input relations (filled arbitrarily)
% client(id, branch, avg_num_of_visits)
client(a,b1,2).
client(b,b2,2).
client(c,b2,3).
% preferred_client(id, branch, avg_num_of_visits)
preferred_client(a,b1,2).
preferred_client(b,b2,2).
% active(branch)
active(b1).
% fixed distribution centered around the average M (assumes M >=2)
0.1::dist(C,B,M,MM2) ; 0.2::dist(C,B,M,MM1) ; 0.4::dist(C,B,M,M) ; 0.2::dist(C,B,M,MP1) ; 0.1::dist(C,B,M,MP2) :- MM2 is M-2, MM1 is M-1, MP1 is M+1, MP2 is M+2.
% directly modeling the rules; note that they all share the same distribution (which is what the paper wants to achieve)
visits(C,B,X) :- client(C,B,L), dist(C,B,L,X).
visits(C,B,X) :- preferred_client(C,B,L), dist(C,B,L,X).
visits(C,B,X) :- client(C,B,L),active(B), dist(C,B,L,X).
query(visits(_,_,_)).
% this is a ProbLog variant of the transformed program in the paper
% that explicitly makes intermediate steps
% 1. define the shared cause for sampling
ex_y_visits_flip(C,B,L) :- client(C,B,L).
ex_y_visits_flip(C,B,L) :- preferred_client(C,B,L).
ex_y_visits_flip(C,B,L) :- client(C,B,L),active(B).
% 2. sample (this is implicit in their translation)
visits_flip(C,B,X,L) :- ex_y_visits_flip(C,B,L), dist(C,B,L,X).
% 3. project out the parameter
visits_hat(C,B,X) :- visits_flip(C,B,X,L).
query(visits_hat(_,_,_)).
Here’s a variant that illustrates the noisy-or effect in the case where each rule uses its own distribution.
% input relations (filled arbitrarily)
% client(id, branch, avg_num_of_visits)
client(a,b1,2).
client(b,b2,2).
client(c,b2,3).
% preferred_client(id, branch, avg_num_of_visits)
preferred_client(a,b1,2).
preferred_client(b,b2,2).
% active(branch)
active(b1).
% fixed distribution centered around the average M (assumes M >=2)
0.1::basic(C,B,M,MM2) ; 0.2::basic(C,B,M,MM1) ; 0.4::basic(C,B,M,M) ; 0.2::basic(C,B,M,MP1) ; 0.1::basic(C,B,M,MP2) :- MM2 is M-2, MM1 is M-1, MP1 is M+1, MP2 is M+2.
0.1::pref(C,B,M,MM2) ; 0.2::pref(C,B,M,MM1) ; 0.4::pref(C,B,M,M) ; 0.2::pref(C,B,M,MP1) ; 0.1::pref(C,B,M,MP2) :- MM2 is M-2, MM1 is M-1, MP1 is M+1, MP2 is M+2.
0.1::active(C,B,M,MM2) ; 0.2::active(C,B,M,MM1) ; 0.4::active(C,B,M,M) ; 0.2::active(C,B,M,MP1) ; 0.1::active(C,B,M,MP2) :- MM2 is M-2, MM1 is M-1, MP1 is M+1, MP2 is M+2.
% directly modeling the rules; note that they all share the same distribution (which is what the paper wants to achieve)
visits(C,B,X) :- client(C,B,L), basic(C,B,L,X).
visits(C,B,X) :- preferred_client(C,B,L), pref(C,B,L,X).
visits(C,B,X) :- client(C,B,L),active(B), active(C,B,L,X).
query(visits(_,_,_)).
Earthquakes
% the edb from figure 3: different types of properties in cities with different burglary rates
home(np1,napa).
home(np2,napa).
home(yc1,yucaipa).
business(np3,napa).
business(yc1,yucaipa).
city(napa,0.03).
city(yucaipa,0.01).
% ProbLog encoding of rules in figure 1
0.01::earthquake(C,true); 0.99::earthquake(C,false) :- city(C,R).
unit(H,C) :- home(H,C).
unit(B,C) :- business(B,C).
R::burglary(X,C,true) ; RR::burglary(X,C,false) :- unit(X,C), city(C,R), RR is 1-R.
0.6::trig(X,true) ; 0.4::trig(X,false) :- unit(X,C),earthquake(C,true).
0.9::trig(X,true) ; 0.1::trig(X,false) :- burglary(X,C,true).
alarm(X) :- trig(X,true).
query(earthquake(_,_)).
query(burglary(_,_,_)).
query(trig(_,_)).
query(alarm(_)).
% ProbLog encoding of transformed rules in figure 2
exY_earthquake_flip_2(C,y,0.01) :- city(C,R).
unit_hat(H,C) :- home(H,C).
unit_hat(B,C) :- business(B,C).
exY_burglary_flip_3(X,C,y,R) :- unit_hat(X,C),city(C,R).
exY_trig_flip_2(X,y,0.6) :- unit_hat(X,C), earthquake_hat(C,true).
exY_trig_flip_2(X,y,0.9) :- burglary_hat(X,C,true).
alarm_hat(X) :- trig_hat(X,true).
earthquake_hat(C,D) :- earthquake_flip_2(C,D,P).
burglary_hat(X,C,Y) :- burglary_flip_3(X,C,Y,R).
trig_hat(X,Y) :- trig_flip_2(X,Y,P).
P::earthquake_flip_2(C,true,P) ; PP::earthquake_flip_2(C,false,P) :- exY_earthquake_flip_2(C,y,P), PP is 1-P.
R::burglary_flip_3(X,C,true,R); RR::burglary_flip_3(X,C,false,R) :- exY_burglary_flip_3(X,C,y,R), RR is 1-R.
P::trig_flip_2(X,true,P) ; PP::trig_flip_2(X,false,P) :- exY_trig_flip_2(X,y,P), PP is 1-P.
query(earthquake_hat(_,_)).
query(burglary_hat(_,_,_)).
query(trig_hat(_,_)).
query(alarm_hat(_)).
Infinite examples
% use typing to restrict to finite set of queries
dom(X) :- between(0,5,X).
% program 1 (single infinite model)
% r(X,f(X)) :- r(Y,X) with f(X) = 2X with prob 1
% r(0,1)
% ProbLog style
r(0,1).
1.0::r(X,X2) :- r(_,X), X2 is 2*X, dom(X).
% closer to delta-notation in paper
s(0,1).
s(Y,DY) :- s(_,Y), delta(Y,DY).
1.0::delta(Y,DY) :- dom(Y), DY is 2*Y.
query(r(_,_)).
query(s(_,_)).
% use typing to restrict to finite set of queries
dom(X) :- between(0,5,X).
% program 2 (one finite, one infinite model)
% same as program 1, but start with (0,1) or (0,0)
% ProbLog style
0.5::r(0,0); 0.5::r(0,1).
1.0::r(X,X2) :- r(_,X), X2 is 2*X, dom(X).
% closer to delta-notation in paper
s(0,F) :- flip(0.5,F).
s(Y,DY) :- s(_,Y), delta(Y,DY).
1.0::delta(Y,DY) :- dom(Y), DY is 2*Y.
0.5::flip(0.5,0); 0.5::flip(0.5,1).
query(r(_,_)).
query(s(_,_)).
% use typing to restrict to finite set of queries
dom(X) :- between(0,5,X).
% program 3 (all models infinite)
% as program 1, but splitting on 2X / 2X+1
% ProbLog style
r(0,1).
0.5::r(X,X2); 0.5::r(X,X3) :- r(_,X), X2 is 2*X, X3 is X2+1, dom(X).
% closer to delta-notation in paper
s(0,1).
s(Y,DY) :- s(_,Y), delta(Y,DY).
0.5::delta(Y,DY); 0.5::delta(Y,DY1) :- dom(Y), DY is 2*Y, DY1 is DY+1.
query(r(_,_)).
query(s(_,_)).
% use typing to restrict to finite set of queries
dom(X) :- between(0,5,X).
% program 4 (all models infinite except for one)
% combining programs 2 and 3
% ProbLog style
0.5::r(0,0); 0.5::r(0,1).
0.5::r(X,X2); 0.5::r(X,X3) :- r(_,X), X2 is 2*X, X3 is X2+1, dom(X).
% closer to delta-notation in paper
s(0,F) :- flip(0.5,F).
0.5::flip(0.5,0); 0.5::flip(0.5,1).
s(Y,DY) :- s(_,Y), delta(Y,DY).
0.5::delta(Y,DY); 0.5::delta(Y,DY1) :- dom(Y), DY is 2*Y, DY1 is DY+1.
query(r(_,_)).
query(s(_,_)).