The FIM_CP System, constraint-based itemset mining.
FIM_CP is a Frequent Itemset Mining system that uses Constraint Programming. It is a very expressive system in which many itemset mining constraints can be formulated with off-the-shelf CP constraints. The system comes with several models that show how to do this.
It uses Gecode, the Generic Constraint Development Environment, as its underlying constraint solver. Gecode is an open and efficient solver written in C++.
Download latest version: FIM_CP 2.7
Read the Download and Installation instructions.
Questions and bug reports can be sent to tias.guns@cs.kuleuven.be
When writing a publication that uses this software, please cite the A.I. Journal paper:
T. Guns, S. Nijssen, L. De Raedt. Itemset mining: A constraint programming perspective, Artificial Intelligence 175(12-13), 2011.
Or alternatively:
L. De Raedt, T. Guns, S. Nijssen. Constraint programming for itemset mining, KDD 2008.
The Models
The models are divided into several categories, depending on their constraints:
- Interestingness: Models that use different interestingness measures. The standard model uses frequency. The discriminating model uses frequency and infrequency on positive and negative transactions. The emerging model uses the fraction of positive and negative transactions covered.
- Condensed representations: Models that do not output all frequent patterns, but a condensed representation. We included models for closed, delta-closed and maximal itemsets.
- Item constraints: Models with an extra constraint on a property of the itemset itself. Examples include constraints on the size, cost or average cost of an itemset. Discriminating constraints can be found as part of the cimcp system; they are fully compatible with fimcp.
- Constraint combinations: Models that combine some of the above-mentioned constraints. We included some, but the possibilities are endless!
- Modeling variants: One constraint can be modeled in different ways, so we also present some alternative modelings.
Every model has a link to its source file. Ignore the rest of the Doxygen documentation, as it does not capture the structure that most of the models use.
Standard | (source code preview: fimcp_standard.cpp) |
Description: This model finds standard frequent patterns (i.e., patterns having a minimum frequency)
Usage example: Specific options for fimcp_standard:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
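To make the -freq convention concrete, here is a minimal brute-force sketch in plain C++. It is not the Gecode model that FIM_CP ships; the names `absoluteThreshold`, `support` and `isFrequent` are hypothetical, and the handling of a value of exactly 1 and the rounding of fractional thresholds are assumptions, since the option text only covers values above and below 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Itemset = std::vector<int>;      // sorted item identifiers
using Dataset = std::vector<Itemset>;  // one sorted itemset per transaction

// Turn a -freq style value into an absolute transaction count.
// Assumption: values below 1 are fractions of the dataset (rounded up),
// anything else is treated as an absolute count.
std::size_t absoluteThreshold(double freq, std::size_t numTransactions) {
    assert(freq >= 0.0);
    if (freq < 1.0) {
        return static_cast<std::size_t>(
            std::ceil(freq * static_cast<double>(numTransactions)));
    }
    return static_cast<std::size_t>(freq);
}

// Number of transactions that contain every item of `pattern`.
std::size_t support(const Dataset& data, const Itemset& pattern) {
    assert(std::is_sorted(pattern.begin(), pattern.end()));
    std::size_t count = 0;
    for (const Itemset& t : data) {
        if (std::includes(t.begin(), t.end(), pattern.begin(), pattern.end())) {
            ++count;
        }
    }
    return count;
}

// A pattern is frequent when its support reaches the threshold.
bool isFrequent(const Dataset& data, const Itemset& pattern, double freq) {
    return support(data, pattern) >= absoluteThreshold(freq, data.size());
}
```

With four transactions, a value of 0.5 maps to a threshold of 2 transactions, while a value of 3 is used directly as a count.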
Discriminating | (source code preview: fimcp_discriminating.cpp) |
Description: This model finds discriminating frequent patterns (frequent on pos, infrequent on neg)
Usage example: Specific options for fimcp_discriminating:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -infreq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
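The discriminating condition (frequent on the positive transactions, infrequent on the negative ones) can be sketched as a plain check on two transaction sets. This is an illustration only, not the FIM_CP model: `isDiscriminating`, `minPos` and `maxNeg` are hypothetical names, thresholds are taken as absolute counts, and treating "infrequent" as "at most `maxNeg`" is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Itemset = std::vector<int>;      // sorted item identifiers
using Dataset = std::vector<Itemset>;  // one sorted itemset per transaction

// Number of transactions that contain every item of `pattern`.
std::size_t support(const Dataset& data, const Itemset& pattern) {
    assert(std::is_sorted(pattern.begin(), pattern.end()));
    std::size_t count = 0;
    for (const Itemset& t : data) {
        if (std::includes(t.begin(), t.end(), pattern.begin(), pattern.end())) {
            ++count;
        }
    }
    return count;
}

// Frequent on the positive transactions and infrequent on the negative ones.
// Assumption: "infrequent" means covering at most `maxNeg` negatives.
bool isDiscriminating(const Dataset& pos, const Dataset& neg,
                      const Itemset& pattern,
                      std::size_t minPos, std::size_t maxNeg) {
    return support(pos, pattern) >= minPos && support(neg, pattern) <= maxNeg;
}
```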
Emerging | (source code preview: more/fimcp_emerging.cpp) |
Description: This model finds emerging frequent patterns (emerging from neg to pos and freq on pos)
Usage example: Specific options for fimcp_emerging:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -delta (floating point value), default: 2. Delta parameter.
Closed | (source code preview: fimcp_closed.cpp) |
Description: This model finds closed frequent patterns (no pattern has a superset with the same frequency)
Usage example: Specific options for fimcp_closed:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
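The closedness condition can be checked by brute force: because adding items never increases support, it is enough to test every single-item extension of a pattern. The sketch below is a plain C++ illustration of the definition, not the constraint used in fimcp_closed.cpp; `isClosed` and the helper `support` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Itemset = std::vector<int>;      // sorted item identifiers
using Dataset = std::vector<Itemset>;  // one sorted itemset per transaction

// Number of transactions that contain every item of `pattern`.
std::size_t support(const Dataset& data, const Itemset& pattern) {
    assert(std::is_sorted(pattern.begin(), pattern.end()));
    std::size_t count = 0;
    for (const Itemset& t : data) {
        if (std::includes(t.begin(), t.end(), pattern.begin(), pattern.end())) {
            ++count;
        }
    }
    return count;
}

// Closed: no superset has the same support. Checking single-item
// extensions suffices, since larger supersets cannot have more support.
bool isClosed(const Dataset& data, const Itemset& pattern) {
    const std::size_t base = support(data, pattern);
    std::set<int> items;
    for (const Itemset& t : data) items.insert(t.begin(), t.end());
    for (int item : items) {
        if (std::binary_search(pattern.begin(), pattern.end(), item)) continue;
        Itemset extended = pattern;
        extended.insert(
            std::upper_bound(extended.begin(), extended.end(), item), item);
        if (support(data, extended) == base) return false;
    }
    return true;
}
```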
Deltaclosed | (source code preview: fimcp_deltaclosed.cpp) |
Description: This model finds delta-closed frequent patterns (no pattern has a superset with a frequency higher than `delta` times its frequency)
Usage example: Specific options for fimcp_deltaclosed:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -delta (floating point value), default: 0.8. Delta parameter.
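Read literally, the delta-closed condition says that no superset may have a support above `delta` times the support of the pattern itself. The sketch below implements that literal reading by brute force; it is an assumption-laden illustration rather than the code in fimcp_deltaclosed.cpp, and `isDeltaClosed` and `support` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Itemset = std::vector<int>;      // sorted item identifiers
using Dataset = std::vector<Itemset>;  // one sorted itemset per transaction

// Number of transactions that contain every item of `pattern`.
std::size_t support(const Dataset& data, const Itemset& pattern) {
    assert(std::is_sorted(pattern.begin(), pattern.end()));
    std::size_t count = 0;
    for (const Itemset& t : data) {
        if (std::includes(t.begin(), t.end(), pattern.begin(), pattern.end())) {
            ++count;
        }
    }
    return count;
}

// Delta-closed (literal reading of the description): no superset has a
// support higher than delta * support(pattern). Single-item extensions
// have the highest support among supersets, so checking them suffices.
bool isDeltaClosed(const Dataset& data, const Itemset& pattern, double delta) {
    const double limit = delta * static_cast<double>(support(data, pattern));
    std::set<int> items;
    for (const Itemset& t : data) items.insert(t.begin(), t.end());
    for (int item : items) {
        if (std::binary_search(pattern.begin(), pattern.end(), item)) continue;
        Itemset extended = pattern;
        extended.insert(
            std::upper_bound(extended.begin(), extended.end(), item), item);
        if (static_cast<double>(support(data, extended)) > limit) return false;
    }
    return true;
}
```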
Maximal | (source code preview: fimcp_maximal.cpp) |
Description: This model finds maximal frequent patterns (no pattern has a superset that is frequent)
Usage example: Specific options for fimcp_maximal:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
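Maximality can be illustrated the same way: a frequent pattern is maximal when none of its single-item extensions is still frequent (larger supersets can only have lower support). The following is a brute-force sketch of the definition, not the model in fimcp_maximal.cpp; `isMaximal`, `support` and the absolute threshold `minSupport` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Itemset = std::vector<int>;      // sorted item identifiers
using Dataset = std::vector<Itemset>;  // one sorted itemset per transaction

// Number of transactions that contain every item of `pattern`.
std::size_t support(const Dataset& data, const Itemset& pattern) {
    assert(std::is_sorted(pattern.begin(), pattern.end()));
    std::size_t count = 0;
    for (const Itemset& t : data) {
        if (std::includes(t.begin(), t.end(), pattern.begin(), pattern.end())) {
            ++count;
        }
    }
    return count;
}

// Maximal: the pattern is frequent and no superset is frequent.
bool isMaximal(const Dataset& data, const Itemset& pattern,
               std::size_t minSupport) {
    if (support(data, pattern) < minSupport) return false;
    std::set<int> items;
    for (const Itemset& t : data) items.insert(t.begin(), t.end());
    for (int item : items) {
        if (std::binary_search(pattern.begin(), pattern.end(), item)) continue;
        Itemset extended = pattern;
        extended.insert(
            std::upper_bound(extended.begin(), extended.end(), item), item);
        if (support(data, extended) >= minSupport) return false;
    }
    return true;
}
```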
Size | (source code preview: fimcp_size.cpp) |
Description: This model finds standard frequent patterns that satisfy the size constraint
Usage example: Specific options for fimcp_size:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
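The six bound operators listed for -bound1 map onto ordinary comparisons. The sketch below shows one way to represent them in plain C++; the enum `Bound` and the functions `parseBound` and `satisfies` are hypothetical names chosen for illustration and are not taken from the FIM_CP sources.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// The six operators from the option text.
enum class Bound { EQ, NQ, LQ, LE, GQ, GR };

// Map the two-letter code from the command line to an operator.
Bound parseBound(const std::string& name) {
    if (name == "EQ") return Bound::EQ;  // EQual
    if (name == "NQ") return Bound::NQ;  // Not Equal
    if (name == "LQ") return Bound::LQ;  // Less or eQual
    if (name == "LE") return Bound::LE;  // LEss
    if (name == "GQ") return Bound::GQ;  // Greater or eQual
    if (name == "GR") return Bound::GR;  // GReater
    throw std::invalid_argument("unknown bound operator: " + name);
}

// Check `value op limit`, e.g. an itemset size against "GQ 2".
bool satisfies(std::size_t value, Bound op, std::size_t limit) {
    switch (op) {
        case Bound::EQ: return value == limit;
        case Bound::NQ: return value != limit;
        case Bound::LQ: return value <= limit;
        case Bound::LE: return value < limit;
        case Bound::GQ: return value >= limit;
        case Bound::GR: return value > limit;
    }
    assert(false && "unhandled bound operator");
    return false;
}
```

For a size constraint, `value` would be the number of items in the pattern, so the default "GQ 0" accepts every itemset.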
Cost | (source code preview: fimcp_cost.cpp) |
Description: This model finds standard frequent patterns that satisfy the cost constraint
Usage example: Specific options for fimcp_cost:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -attrfile (filename with extension), default: ../data/example.attr. Filename of attributes to use (any name).
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
Avgcost | (source code preview: fimcp_avgcost.cpp) |
Description: This model finds standard frequent patterns that satisfy the average cost constraint
Usage example: Specific options for fimcp_avgcost:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -attrfile (filename with extension), default: ../data/example.attr. Filename of attributes to use (any name).
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
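The cost and average cost of an itemset, as constrained by the Cost and Avgcost models, can be pictured as a sum and a mean over per-item values. The sketch below assumes each item has one integer cost held in a `CostTable` map; that layout, like the names `totalCost` and `averageCost`, is hypothetical, and the actual format of the attribute file is not described here.

```cpp
#include <cassert>
#include <map>
#include <vector>

using Itemset = std::vector<int>;
using CostTable = std::map<int, int>;  // item identifier -> cost (assumed layout)

// Sum of the costs of all items in the pattern.
// Throws std::out_of_range if an item has no cost entry.
int totalCost(const Itemset& pattern, const CostTable& costs) {
    int sum = 0;
    for (int item : pattern) sum += costs.at(item);
    return sum;
}

// Mean cost per item; only defined for non-empty patterns.
double averageCost(const Itemset& pattern, const CostTable& costs) {
    assert(!pattern.empty());
    return static_cast<double>(totalCost(pattern, costs)) /
           static_cast<double>(pattern.size());
}
```

If all costs are non-negative, the total cost can only grow when items are added, so an upper bound on it is anti-monotone. The average cost has no such property, since adding a cheap item can lower it.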
Maximal+Closed | (source code preview: more/fimcp_maximal+closed.cpp) |
Description: This model finds maximal frequent patterns (no pattern has a superset that is frequent); it additionally uses the redundant closed constraint
Usage example: Specific options for fimcp_maximal+closed:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
Closed+Cost | (source code preview: fimcp_closed+cost.cpp) |
Description: This model finds closed frequent patterns that satisfy the cost constraint (constraints are modelled as if independent)
Usage example: Specific options for fimcp_closed+cost:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -attrfile (filename with extension), default: ../data/example.attr. Filename of attributes to use (any name).
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
Closed+Cost_Dependent | (source code preview: more/fimcp_closed+cost_dependent.cpp) |
Description: This model finds closed frequent patterns that satisfy the cost constraint. The constraints are modelled as dependent; this is only correct for an anti-monotone cost constraint, e.g. cost <= 100
Usage example: Specific options for fimcp_closed+cost_dependent:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -attrfile (filename with extension), default: ../data/example.attr. Filename of attributes to use (any name).
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
Discriminating+Deltaclosed | (source code preview: fimcp_discriminating+deltaclosed.cpp) |
Description: This model finds discriminating delta-closed frequent patterns (frequent on pos, infrequent on neg and delta-closed on pos)
Usage example: Specific options for fimcp_discriminating+deltaclosed:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -infreq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -delta (floating point value), default: 0.8. Delta parameter.
Discriminating+Deltaclosed+Size_Independent | (source code preview: more/fimcp_discriminating+deltaclosed+size_independent.cpp) |
Description: This model finds discriminating delta-closed frequent patterns of a certain size (frequent on one partition, infrequent on the other and delta-closed on the first partition). The constraints are modelled as if independent
Usage example: Specific options for fimcp_discriminating+deltaclosed+size_independent:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -infreq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -delta (floating point value), default: 0.8. Delta parameter.
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
- -bound2 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: LQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
Discriminating+Deltaclosed+Size_Dependent | (source code preview: more/fimcp_discriminating+deltaclosed+size_dependent.cpp) |
Description: This model finds discriminating delta-closed frequent patterns of a certain size (frequent on one partition, infrequent on the other and delta-closed on the first partition). The constraints are modelled as dependent; this is only correct for an anti-monotone cost constraint, e.g. cost <= 100
Usage example: Specific options for fimcp_discriminating+deltaclosed+size_dependent:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -infreq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -delta (floating point value), default: 0.8. Delta parameter.
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
- -bound2 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: LQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.
Standardnoreif | (source code preview: more/fimcp_standardNoreif.cpp) |
Description: This model finds standard frequent patterns (i.e., patterns having a minimum frequency) using a non-reified formulation of the frequency constraint (slower)
Usage example: Specific options for fimcp_standardNoreif:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
Standardplus | (source code preview: more/fimcp_standardPlus.cpp) |
Description: This model finds standard frequent patterns (i.e., patterns having a minimum frequency). It is faster than the normal formulation because it uses a linear implication constraint implemented to avoid auxiliary variables
Usage example: Specific options for fimcp_standardPlus:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
Closedplus | (source code preview: more/fimcp_closedPlus.cpp) |
Description: This model finds closed frequent patterns (no pattern has a superset with the same frequency). It is faster than the normal formulation because it uses a linear implication constraint implemented to avoid auxiliary variables
Usage example: Specific options for fimcp_closedPlus:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
Costreif | (source code preview: more/fimcp_costReif.cpp) |
Description: This model finds standard frequent patterns that satisfy the cost constraint using a reified formulation of the cost constraint
Usage example: Specific options for fimcp_costReif:
- -output (none, normal, cpvars), default: normal. Type of output of solutions: none = do not output solutions; normal = print solutions (FIMI-style); cpvars = print the CP variables of the solutions.
- -cclause (unsigned int), default: 1. Coverage constraint using clause?
- -datafile (filename with extension), default: ../data/example.txt. Filename of dataset to use (any name).
- -solfile (filename with extension), default: (empty). Filename to write solutions to (any name).
- -freq (floating point value), default: 0.1. Frequency (>1 is absolute, <1 is percentage, e.g. 0.10 is 10%).
- -attrfile (filename with extension), default: ../data/example.attr. Filename of attributes to use (any name).
- -bound1 (EQ, NQ, LQ, LE, GQ, GR) (unsigned int value), default: GQ. Bound parameters, e.g. GQ 0. EQ: EQual; NQ: Not Equal; LQ: Less or eQual; LE: LEss; GQ: Greater or eQual; GR: GReater.