Publications:
Nectar paper: Constraint Programming for Data Mining and Machine Learning (AAAI 2010)
A 'New Scientific and Technical Advances in Research (Nectar)' paper summarising our work up to now.
- Abstract
- Paper: cp4im_aaai10_nectar.pdf
- Slides: slides_aaai10_nectar.pdf
- Bibtex:
@inproceedings{cp4im_aaai10_nectar,
  author = {Luc {De Raedt} and Tias Guns and Siegfried Nijssen},
  title = {Constraint Programming for Data Mining and Machine Learning},
  booktitle = {AAAI},
  year = {2010},
  pages = {1671-1675},
}
Machine learning and data mining have become aware that using constraints when learning patterns and rules can be very useful. To this end, a large number of special purpose systems and techniques have been developed for solving such constraint-based mining and learning problems. These techniques have, so far, been developed independently of the general purpose tools and principles of constraint programming known within the field of artificial intelligence. This paper shows that off-the-shelf constraint programming techniques can be applied to various pattern mining and rule learning problems (cf. also (De Raedt, Guns, and Nijssen 2008; Nijssen, Guns, and De Raedt 2009)). This does not only lead to methodologies that are more general and flexible, but also provides new insights into the underlying mining problems that allow us to improve the state-of-the-art in data mining. Such a combination of constraint programming and data mining raises a number of interesting new questions and challenges.
Seminar: Constraint Programming for Itemset Mining (AOP 2009)
A seminar by Tias Guns at the 'Analysis of Patterns' (AOP) summer school.
- Video: on videolectures.net
Paper: Correlated Itemset Mining in ROC Space: A Constraint Programming Approach (KDD 2009)
This paper uses principles of CP to explain and significantly improve the pruning mechanisms for discriminative itemset mining.
- Abstract
- Paper: cp4im_corr_kdd2009.pdf
- Video: on videolectures.net
- Bibtex:
@inproceedings{cp4im_corr_kdd2009,
  author = {Siegfried Nijssen and Tias Guns and Luc {De Raedt}},
  title = {Correlated itemset mining in ROC space: a constraint programming approach},
  booktitle = {KDD},
  year = {2009},
  pages = {647-656},
  ee = {http://doi.acm.org/10.1145/1557019.1557092},
}
Correlated or discriminative pattern mining is concerned with finding the highest scoring patterns w.r.t. a correlation measure (such as information gain). By reinterpreting correlation measures in ROC space and formulating correlated itemset mining as a constraint programming problem, we obtain new theoretical insights with practical benefits. More specifically, we contribute 1) an improved bound for correlated itemset miners, 2) a novel iterative pruning algorithm to exploit the bound, and 3) an adaptation of this algorithm to mine all itemsets on the convex hull in ROC space. The algorithm does not depend on a minimal frequency threshold and is shown to outperform several alternative approaches by orders of magnitude, both in runtime and in memory requirements.
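As a rough illustration of what "a correlation measure in ROC space" means, the sketch below scores a pattern by the number of positive and negative examples it covers and computes information gain from those two counts. The function names and the toy counts are assumptions made for illustration. The bound shown is the classical vertex-based bound for convex measures, not the improved bound contributed by the paper, which is not reproduced here.

```python
from math import log2


def entropy(pos, neg):
    """Binary entropy (in bits) of a set with pos positive and neg negative examples."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            q = count / total
            h -= q * log2(q)
    return h


def info_gain(p, n, P, N):
    """Information gain of a pattern covering p of P positives and n of N negatives.

    The pair (n, p) is the pattern's position in (unnormalised) ROC space.
    """
    total = P + N
    covered = p + n
    uncovered = total - covered
    return (entropy(P, N)
            - covered / total * entropy(p, n)
            - uncovered / total * entropy(P - p, N - n))


def superset_bound(p, n, P, N):
    """Classical upper bound on the gain of any superset of a pattern at (p, n).

    A superset can only cover fewer examples, so it lies in the rectangle
    [0, p] x [0, n]; a convex measure is maximised at a corner of that rectangle.
    This is the baseline bound, NOT the paper's improved one.
    """
    return max(info_gain(p, 0, P, N), info_gain(0, n, P, N))
```

A miner can discard a pattern, together with all its supersets, whenever `superset_bound` falls below the best score found so far; this check needs no minimum frequency threshold.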
Invited talk: Constraint Programming for Data Mining (EGC 2009)
An invited talk by Prof. De Raedt on this work.
- Video: on canalc2.tv (skip the French introduction; the talk itself is in English)
Paper: Constraint Programming for Itemset Mining (KDD 2008)
This is the original paper showing how itemset mining can be formulated and solved using constraint programming.
- Abstract
- Paper: cp4im_kdd2008.pdf
- Bibtex:
@inproceedings{cp4im_kdd2008,
  author = {Luc {De Raedt} and Tias Guns and Siegfried Nijssen},
  title = {Constraint programming for itemset mining},
  booktitle = {KDD},
  year = {2008},
  pages = {204-212},
  ee = {http://doi.acm.org/10.1145/1401890.1401919},
}
The relationship between constraint-based mining and constraint programming is explored by showing how the typical constraints used in pattern mining can be formulated for use in constraint programming environments. The resulting framework is surprisingly flexible and allows us to combine a wide range of mining constraints in different ways. We implement this approach in off-the-shelf constraint programming systems and evaluate it empirically. The results show that the approach is not only very expressive, but also works well on complex benchmark problems.
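To make the idea of "formulating mining constraints for a constraint programming environment" more concrete, here is a minimal sketch of frequent itemset mining stated as two constraints over Boolean item and transaction variables: a coverage constraint and a frequency constraint. It is not the authors' implementation. The toy database, the threshold, and all function names are assumptions, and plain enumeration stands in for the propagation and search that a real CP system would perform.

```python
from itertools import product

# Toy transaction database (hypothetical): rows are transactions, columns are items.
D = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 1],
]
THETA = 2  # minimum frequency threshold (assumed value)


def cover(items, db):
    """Coverage constraint: T_t = 1 iff transaction t contains every selected item."""
    return [int(all(row[i] for i, sel in enumerate(items) if sel)) for row in db]


def is_frequent(items, trans, db, theta):
    """Frequency constraint: every selected item occurs in >= theta covered transactions."""
    return all(
        sum(t * row[i] for t, row in zip(trans, db)) >= theta
        for i, sel in enumerate(items) if sel
    )


def frequent_itemsets(db, theta):
    """Enumerate all assignments to the item variables and keep those that satisfy
    both constraints. Returns (item assignment, support) pairs."""
    n_items = len(db[0])
    solutions = []
    for items in product([0, 1], repeat=n_items):
        if not any(items):
            continue  # skip the empty itemset, which is trivially frequent
        trans = cover(items, db)
        if is_frequent(items, trans, db, theta):
            solutions.append((items, sum(trans)))
    return solutions


if __name__ == "__main__":
    for items, support in frequent_itemsets(D, THETA):
        print(items, support)
```

Because the mining task is expressed as separate constraints, a further condition (for example a maximum itemset size) can be added as one more check without touching the others, which is the kind of flexibility the abstract refers to.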