We are continuously looking for PhD students on several research topics including but not limited to:
- Data Mining, Machine Learning and Constraint Programming
(As advertised on KDNuggets, the Pascal network and the CP newsletter.)
We currently (as of Jan 1, 2012) have three open positions (at the PhD or post-doc level) in the area of data mining and its combination with constraint satisfaction and constraint programming.
These positions are part of two projects funded by the Research Foundation -- Flanders, and of the European FET Project ICON (Inductive Constraint Programming, 2012-2014, coordinated by KULeuven, in collaboration with Cork Constraint Computation Centre, University of Pisa, and CNRS-LIRMM Montpellier). These projects run for 3 to 4 years.
The successful applicants will work under the direction of Prof. Luc De Raedt and Dr. Siegfried Nijssen. The two projects funded by the Research Foundation -- Flanders are in collaboration with Prof. Bart Goethals (University of Antwerp).
The research questions that we are trying to tackle in these projects are related to:
- how to combine / integrate constraint programming with data mining? (cf. Dagstuhl Seminar on Constraint programming meets Data Mining and Machine Learning)
- how to learn constraints from data?
- how to adapt constraint programming solvers for use in data mining? (cf. Guns, Nijssen, De Raedt, Artificial Intelligence, 2011 and http://dtai.cs.kuleuven.be/CP4IM/)
- how to mine for pattern sets? (pattern set mining aims to discover global models, whereas pattern mining looks for local regularities)
- how to mine for patterns interactively and instantly?
The Katholieke Universiteit Leuven is an equal opportunity employer.
- Learning in large graphs
In large graphs and networks, nodes are not independent due to the links between them. Therefore, classical learning theory does not apply. Moreover, due to the network structure and size of the data, constructing algorithms with tractable computational cost is challenging. In this project, we will explore new theory and algorithms for learning from network-structured data.
- Learning from data originating from evolution
When performing statistics and machine learning in bio-medical application domains, an important challenge is that individuals are not independent but originate from common ancestors. Therefore, in many cases the training set and test set will not be independent and identically distributed (i.i.d.). In this project, we study evolutionary populations from a machine learning point of view, and develop new learning models that take evolutionary dependencies into account.
Example publication: http://www.cs.kuleuven.ac.be/cgi-bin-dtai/publ_info.pl?id=42687
- Decision support in intensive care medicine
This project aims at improving decision support in intensive care using relational data mining and probabilistic reasoning and planning methods, areas in which the DTAI group has extensive expertise.
Example publication: https://lirias.kuleuven.be/handle/123456789/124512
- Learning the Structure of Probabilistic Logical Models
In this project we will explore different algorithms for learning a statistical model from a relational database. We are particularly interested in exploring efficient learning techniques as well as applying the developed algorithms to real-world biological and medical databases.
Example publication: http://pages.cs.wisc.edu/~jdavis/davis.pdf
- Graphical Model Structure Learning
It is also possible to explore learning the structure of a propositional model such as a Bayesian network or Markov network.
Example publication: http://pages.cs.wisc.edu/~jdavis/davisICML10.pdf
- Efficient Inference for Probabilistic Logical Models
Probabilistic logical languages provide powerful formalisms for knowledge representation and learning. Yet performing inference in these languages is extremely costly. In this project we will explore various techniques for improving the efficiency of inference in these languages.
Example publication: http://www.cs.washington.edu/homes/pedrod/papers/aaai08a.pdf
- Learning from processes in relational domains
Processes in the real world are often very complex, and the factors influencing individual actions are only accessible through several connected relations. Also, related actions cannot be considered independent training examples, and the rules a process must satisfy are hard to learn directly. In this project we want to research interactive models for learning verifiable process rules.
- Probabilistic logical models for computational biology
There is a huge literature on modeling individual relations or processes in biology. Still, little work successfully combines information from different scales (genomics, transcriptomics, proteomics, ...) in a statistically clear and accurate way. In this project we want to apply probabilistic logical models to construct multi-scale models for computational biology.
- Flexible high-throughput data mining
Many sciences, including biology, are becoming increasingly data-rich. Massive amounts of data are generated in high-throughput processes, making the use of advanced data analysis tools both possible and necessary. In this project, we aim at developing a platform for high-throughput analysis of images generated by time-lapse microscopy. This involves the transformation of video data into a structured format that allows efficient storage and data mining, and the development of flexible, query-oriented data mining methods for the resulting database.
Project: Elaboration of the CellPhInDER platform
If you are interested in any of these topics, please check our page about how to apply.
Not ready for a job yet, but attracted to Machine Learning? One of our master thesis topics might interest you...