[deptcw, bongard, registration]
begin(background).
ml_system(claudien).
ml_system(icl).
end(background).

begin(model(wimv)).
male.
assistent.
implements(icl).
end(model(wimv)).

begin(model(ldh)).
male.
assistent.
implements(claudien).
end(model(ldh)).

begin(model(wiske)).
female.
student.
implements(chess).
end(model(wiske)).

begin(model(suske)).
male.
student.
implements(minesweeper).
end(model(suske)).
dlab_template('0-len:[male,female, implements(Program),ml_system(Program)] <-- 0-len:[male,female, implements(Program),ml_system(Program)]').
classes([student, assistent]).
talking(4).
significance_level(0).
(after running icl and write_theory)
theory( assistent , complete , dnf ,
  [rule('(implements(Program),ml_system(Program))',
        [type(dnf),cpu(0.17),heur(0.75),local(2,0,0,2),total(2,0,0,2)])
  ]).

theory( student , complete , dnf ,
  [rule('(implements(Program),\+ml_system(Program))',
        [type(dnf),cpu(0.15),heur(0.75),local(2,0,0,2),total(2,0,0,2)])
  ]).
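The two learned rules can be checked by hand against the four models. The following is a minimal Python sketch of that check, not part of ICL itself: the dictionary layout and the function name `covered_classes` are assumptions made for illustration, and only the facts come from the knowledge base above.

```python
# Facts transcribed from the knowledge base above
# (the class name "assistent" is spelled as in the source files).
ML_SYSTEMS = {"claudien", "icl"}  # background facts ml_system/1

# model name -> (class label, programs the person implements)
MODELS = {
    "wimv": ("assistent", {"icl"}),
    "ldh": ("assistent", {"claudien"}),
    "wiske": ("student", {"chess"}),
    "suske": ("student", {"minesweeper"}),
}


def covered_classes(implements):
    """Return the classes whose learned rule body succeeds for one model."""
    classes = []
    # assistent rule: implements(Program), ml_system(Program)
    if any(p in ML_SYSTEMS for p in implements):
        classes.append("assistent")
    # student rule: implements(Program), \+ml_system(Program)
    if any(p not in ML_SYSTEMS for p in implements):
        classes.append("student")
    return classes


RESULTS = {name: covered_classes(progs) for name, (_, progs) in MODELS.items()}

for name, (label, _) in MODELS.items():
    print(name, "actual:", label, "covered by:", RESULTS[name])
```

Each model is covered by exactly one rule, which is consistent with the `local(2,0,0,2)` counts reported for both theories if those are read as two covered and two uncovered examples.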
Bongard problem number 47, one of the problems devised by the Russian scientist M. Bongard, is used as an example in most ICL papers. It consists of 12 figures (see below), six of class pos and six of class neg. The goal is to discriminate between the two classes.
One possible way to encode these figures in a KB file is given below (others exist!).
begin(model(1)). pos. triangle(f1). small(f1). up(f1). circle(f2). large(f2). in(f1,f2). end(model(1)).

begin(model(2)). pos. triangle(f1). small(f1). up(f1). circle(f2). medium(f2). circle(f3). small(f3). triangle(f4). large(f4). up(f4). in(f1,f2). end(model(2)).

begin(model(3)). pos. triangle(f1). small(f1). down(f1). circle(f2). large(f2). small(f3). triangle(f3). up(f3). in(f3, f2). end(model(3)).

begin(model(4)). pos. circle(f1). medium(f1). triangle(f2). medium(f2). down(f2). circle(f3). large(f3). circle(f4). small(f4). triangle(f5). small(f5). down(f5). circle(f6). small(f6). in(f5, f3). end(model(4)).

begin(model(5)). pos. triangle(f1). medium(f1). down(f1). triangle(f2). small(f2). down(f2). triangle(f3). large(f3). right(f3). circle(f4). medium(f4). triangle(f5). small(f5). left(f5). in(f5,f4). end(model(5)).

begin(model(6)). pos. triangle(f1). large(f1). up(f1). circle(f2). medium(f2). triangle(f3). small(f3). up(f3). in(f3,f2). end(model(6)).

begin(model(7)). neg. circle(f1). small(f1). triangle(f2). large(f2). up(f2). in(f1,f2). end(model(7)).

begin(model(8)). neg. triangle(f1). large(f1). left(f1). circle(f2). medium(f2). circle(f3). small(f3). triangle(f4). medium(f4). up(f4). in(f3,f4). end(model(8)).

begin(model(9)). neg. circle(f1). large(f1). triangle(f2). medium(f2). up(f2). circle(f3). small(f3). in(f3,f2). end(model(9)).

begin(model(10)). neg. triangle(f1). small(f1). down(f1). triangle(f2). medium(f2). down(f2). triangle(f3). small(f3). up(f3). circle(f4). small(f4). in(f4,f2). end(model(10)).

begin(model(11)). neg. circle(f1). small(f1). circle(f2). large(f2). circle(f3). medium(f3). circle(f4). small(f4). triangle(f5). medium(f5). up(f5). in(f4,f5). end(model(11)).

begin(model(12)). neg. circle(f1). small(f1). circle(f2). small(f2). circle(f3). small(f3). circle(f4). small(f4). triangle(f5). small(f5). up(f5). triangle(f6). medium(f6). up(f6). triangle(f7). small(f7). right(f7). in(f3, f6). end(model(12)).
dlab_template('
  len-len:[0-len:[sort(X), sort(Y), size(X), size(Y), direction(X), direction(Y), in(X,Y), in(Y,X)] ]
  <--
  len-len:[1-len:[sort(X), sort(Y)],
           0-len:[size(X), size(Y), direction(X), direction(Y), in(X,Y), in(Y,X)]]
').
dlab_variable(sort,1-1,[circle,triangle]).
dlab_variable(size,1-1,[large,medium,small]).
dlab_variable(direction,1-1,[down,left,right,up]).
significance_level(0).
theory( neg , complete , dnf ,
  [rule('(triangle(Y),in(X,Y))',
        [type(dnf),cpu(1.53),heur(0.875),local(6,0,0,6),total(6,0,0,6)])
  ]).

theory( pos , complete , dnf ,
  [rule('(triangle(Y),in(Y,X))',
        [type(dnf),cpu(1.4),heur(0.875),local(6,0,0,6),total(6,0,0,6)])
  ]).
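To see why these two rules separate the classes, they can be evaluated against the twelve models by hand. The Python sketch below does this outside ICL. It assumes that `in(X,Y)` reads as "X is inside Y", and it keeps only the `triangle/1` and `in/2` facts, since the rules use nothing else; the data layout and the function names are illustrative assumptions.

```python
# model id -> (class, objects that are triangles, list of in(Inner, Outer) pairs)
# Transcribed from the KB above; all other facts are omitted because
# the learned rules only mention triangle/1 and in/2.
MODELS = {
    1: ("pos", {"f1"}, [("f1", "f2")]),
    2: ("pos", {"f1", "f4"}, [("f1", "f2")]),
    3: ("pos", {"f1", "f3"}, [("f3", "f2")]),
    4: ("pos", {"f2", "f5"}, [("f5", "f3")]),
    5: ("pos", {"f1", "f2", "f3", "f5"}, [("f5", "f4")]),
    6: ("pos", {"f1", "f3"}, [("f3", "f2")]),
    7: ("neg", {"f2"}, [("f1", "f2")]),
    8: ("neg", {"f1", "f4"}, [("f3", "f4")]),
    9: ("neg", {"f2"}, [("f3", "f2")]),
    10: ("neg", {"f1", "f2", "f3"}, [("f4", "f2")]),
    11: ("neg", {"f5"}, [("f4", "f5")]),
    12: ("neg", {"f5", "f6", "f7"}, [("f3", "f6")]),
}


def pos_rule(triangles, inside):
    """triangle(Y), in(Y,X): some triangle lies inside another object."""
    return any(inner in triangles for inner, _outer in inside)


def neg_rule(triangles, inside):
    """triangle(Y), in(X,Y): some object lies inside a triangle."""
    return any(outer in triangles for _inner, outer in inside)


MISCLASSIFIED = [
    model_id
    for model_id, (label, triangles, inside) in MODELS.items()
    if pos_rule(triangles, inside) != (label == "pos")
    or neg_rule(triangles, inside) != (label == "neg")
]

print("misclassified models:", MISCLASSIFIED)
```

Under these assumptions no model is misclassified: every pos figure has a triangle inside a circle, and every neg figure has a circle inside a triangle.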
We also have an automatically generated Bongard problem with more than 300 figures (see figure below). The notation in its KB file differs slightly from that of the example above. The KB file is provided: write an appropriate language file for it, and see whether you can find some regularities.
This example has been used in a tutorial (Three companions for first order data mining). Imagine that you have just been hired by a professional seminar organizer (PSO) to discover new knowledge about its activities that can be used for commercial purposes. The PSO also informs you that it has a database about past activities. Part of this database is listed below.
It contains information about participants in a recent Seminar on Data Mining. Note that information about each person is spread over multiple tables of the database. To obtain a set of examples for ICL, we partition the database into examples, one per participant. Global information (like the course table) is put in the background.
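The partitioning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flat table layout, the column names, and the helper `to_models` are hypothetical, and the rows shown are two of the participants from the input files of this example. The meaning of the fourth `participant` argument is not stated in the source, so it is carried through unchanged.

```python
# Hypothetical flat tables; layout and column names are assumptions.
# (name, job, company, party, number) -- the meaning of the last column
# is not given in the source and is simply copied.
PARTICIPANT = [
    ("adams", "researcher", "scuf", "no", 23),
    ("blake", "president", "jvt", "yes", 5),
]

# (name, course)
SUBSCRIPTION = [
    ("adams", "erm"),
    ("adams", "so2"),
    ("adams", "srw"),
    ("blake", "cso"),
    ("blake", "erm"),
]


def to_models(participants, subscriptions):
    """Group all rows belonging to one participant into one ICL model block."""
    blocks = []
    for name, job, company, party, number in participants:
        lines = [
            f"begin(model({name})).",
            f"participant({job},{company},{party},{number}).",
        ]
        # Collect the subscription rows of this participant only.
        lines += [f"subscription({course})." for who, course in subscriptions if who == name]
        lines.append(f"end(model({name})).")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


KB_TEXT = to_models(PARTICIPANT, SUBSCRIPTION)
print(KB_TEXT)
```

Tables that are not tied to a single participant, such as the course table, are left out of the models and would go into the background block instead.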
We would like to find out what type of people attend the parties at our seminar. This can be useful for setting the price of the party as well as for deciding on the activities at parties.
The input files are given below. Try running ICL on them!
begin(background).
company(jvt,commercial).
company(scuf,university).
company(ucro,university).
course(cso,2,introductory).
course(erm,3,introductory).
course(so2,4,introductory).
course(srw,3,advanced).
job(_J):- participant(_J, _, _, _).
company(_C):- participant(_, _C, _, _).
party(_P):- participant(_, _, _P, _).
company_type(_T):- company(_C), company(_C, _T).
course_len(_C, _L):- course(_C, _L, _).
course_type(_C, _T):- course(_C, _, _T).
end(background).

begin(model(adams)). participant(researcher,scuf,no,23). subscription(erm). subscription(so2). subscription(srw). end(model(adams)).

begin(model(blake)). participant(president,jvt,yes,5). subscription(cso). subscription(erm). end(model(blake)).

begin(model(king)). participant(manager,ucro,no,78). subscription(cso). subscription(erm). subscription(so2). subscription(srw). end(model(king)).

begin(model(miller)). participant(manager,jvt,yes,14). subscription(so2). end(model(miller)).

begin(model(scott)). participant(researcher,scuf,yes,94). subscription(erm). subscription(srw). end(model(scott)).

begin(model(turner)). participant(researcher,ucro,no,81). subscription(so2). subscription(srw). end(model(turner)).
dlab_template('
  0-2: [ job(c_job), company_type(c_company_type), subscription(_S),
         course_len(_S,c_course_len), course_type(_S,c_course_type) ]
  <--
  0-len: [ job(c_job), company_type(c_company_type), subscription(_S),
           course_len(_S,c_course_len), course_type(_S,c_course_type) ]
').
dlab_variable(c_job, 1-1, [researcher,manager,president]).
dlab_variable(c_company_type, 1-1, [commercial,university]).
dlab_variable(c_course_len, 1-1, [2,3,4]).
dlab_variable(c_course_type, 1-1, [introductory,advanced]).
classes([party(no), party(yes)]).
theory( party(no) , complete , dnf ,
  [rule('(company_type(university), subscription(_S), course_len(_S,4))',
        [type(dnf),cpu(1.2),heur(0.8),local(3,0,0,3),total(3,0,0,3)])
  ]).
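The party(no) rule above can be checked against the six participants outside ICL. The Python sketch below hard-codes the relevant facts from the input files; the dictionary layout and the function name `party_no_rule` are assumptions for illustration. The `company_type` condition is resolved through the participant's own company, following the background clauses.

```python
# Background facts transcribed from the input files above.
COMPANY_TYPE = {"jvt": "commercial", "scuf": "university", "ucro": "university"}
COURSE_LEN = {"cso": 2, "erm": 3, "so2": 4, "srw": 3}

# model name -> (company, party attendance, subscribed courses)
MODELS = {
    "adams": ("scuf", "no", {"erm", "so2", "srw"}),
    "blake": ("jvt", "yes", {"cso", "erm"}),
    "king": ("ucro", "no", {"cso", "erm", "so2", "srw"}),
    "miller": ("jvt", "yes", {"so2"}),
    "scott": ("scuf", "yes", {"erm", "srw"}),
    "turner": ("ucro", "no", {"so2", "srw"}),
}


def party_no_rule(company, subscriptions):
    """company_type(university), subscription(S), course_len(S,4)."""
    works_at_university = COMPANY_TYPE[company] == "university"
    has_length_4_course = any(COURSE_LEN[s] == 4 for s in subscriptions)
    return works_at_university and has_length_4_course


COVERED = sorted(
    name for name, (company, _party, subs) in MODELS.items()
    if party_no_rule(company, subs)
)

print("covered by the party(no) rule:", COVERED)
```

With these facts the rule covers adams, king and turner, which are exactly the three participants recorded with party value no.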
Try setting the significance level to 0.8 and see whether you can find a theory for party(yes).
Copyright 1998, Katholieke Universiteit Leuven, dept. Computerwetenschappen
Information provider: KULeuven dept. Computerwetenschappen
Comments for the authors: Wim Van Laer
Page design: Wim Van Laer
URL: http://www.cs.kuleuven.ac.be/example.html