Modeling Music in PRISM

Jon Sneyers
K.U.Leuven
Belgium

Editor: Agostino Dovier




Introduction

Music composers are bound by certain, mostly unwritten, rules that define the musical genre of their work. In [7] we constructed a general model that captures such musical rules.

Many formalisms have been proposed to express strict, logical rules. However, most musical genres are hard or impossible to describe as a set of strict rules. It seems to be inherent to music to be somewhat 'random' and 'organic'. Because of this inherent randomness, we have represented musical rules as probabilistic experiments with some unknown probability distribution. We used a variant of Hidden Markov Models [1], for which the probability distributions can be computed in an automated way from a set of examples.

PRISM.

Probabilistic-logical programming is an extension of logic programming that allows programmers to express both statistical and relational knowledge in a natural way. PRISM [4,5] is a probabilistic-logical programming language based on Prolog. We use this formalism to implement a general model for polyphonic music. Its probability parameters are learned automatically by running the built-in EM algorithm of PRISM on training examples.

Modeling of music.

Traditionally, two types of models for music are distinguished. Synthetic models are used to generate music (automatic composition). Analytic models are designed to analyze (e.g. to classify) music. Conklin [2] pointed out that a general model can be applied to both tasks: a statistical analytic model can be sampled to generate music. In [7], we show how our model can be used as a classifier for music that guesses the composer of unknown fragments of music, while it can also be used to automatically compose new music.

Implementation

Music Representation.

For simplicity, we consider only three aspects of music: melody (note pitch and octave), rhythm (note duration) and polyphony (different voices sounding together). To represent a music score in a form that suits both Prolog-like environments and Markov-like models, we use a simple representation called Interwoven Voices List (IVL). An IVL is a sequence (list) of phases. Each phase is a triple consisting of a duration (an integer) and two lists: the first list indicates for every voice whether or not its note is new, and the second list contains the note pitch (encoded as a number) for every voice. The notes in Figure 1 are encoded as:
    [ (16,[new,new],[55,r]),  (16,[old,new],[55,36]),
      (16,[new,new],[60,38]), (16,[old,new],[60,40]),
      (16,[new,new],[59,41]), (16,[old,new],[59,38]),
      (16,[new,new],[60,40]), (16,[old,new],[60,36]),
      (32,[new,new],[62,43]), (32,[old,new],[62,31]) ]

Figure 1: The Interwoven Voices List (IVL) notation.
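The IVL encoding can be illustrated with a small Python sketch (the model itself is written in PRISM, so this is only an illustration, not the code from [7]). It assumes each voice is given as a list of (pitch, duration) pairs, that all voices have the same total length, and that `r` stands for a rest; the function name `to_ivl` and the input format are hypothetical. A new phase starts whenever any voice begins a note, and a voice that is still holding its note is marked `old`.

```python
def to_ivl(voices):
    """Convert per-voice note lists into a list of IVL phases.

    voices: list of voices, each a list of (pitch, duration) pairs.
    Assumes all voices have the same total duration (hypothetical input format).
    """
    onsets = []            # per voice: {start_time: pitch}
    end = 0
    for notes in voices:
        t, starts = 0, {}
        for pitch, duration in notes:
            starts[t] = pitch
            t += duration
        onsets.append(starts)
        end = max(end, t)

    # A phase boundary occurs wherever at least one voice starts a note.
    times = sorted({t for starts in onsets for t in starts})
    current = [None] * len(voices)
    phases = []
    for i, t in enumerate(times):
        next_t = times[i + 1] if i + 1 < len(times) else end
        flags = []
        for v, starts in enumerate(onsets):
            if t in starts:
                current[v] = starts[t]
                flags.append("new")
            else:
                flags.append("old")   # voice is still holding its note
        phases.append((next_t - t, flags, list(current)))
    return phases


# The two voices of the example above; "r" is read as a rest (an assumption).
upper = [(55, 32), (60, 32), (59, 32), (60, 32), (62, 64)]
lower = [("r", 16), (36, 16), (38, 16), (40, 16), (41, 16),
         (38, 16), (40, 16), (36, 16), (43, 32), (31, 32)]
phases = to_ivl([upper, lower])
# phases[0] is (16, ["new", "new"], [55, "r"])
```

Applied to these two voices, the sketch reproduces the ten phases of the encoding shown above.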
Figure 2: Nested HMM structure

Modeling IVL-music in PRISM.

We use a nested Hidden Markov Model, based on the intuition of global and local music structure. The global structure of a piece of music is captured by transitions between various Song States (SS). Within every such song state, there is one HMM for each voice, which captures the local structure for that particular voice by transitions between various Voice States (VS). This principle is illustrated in Figure 2. The implementation of a nested HMM in PRISM is discussed in detail in [7].
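To make the nesting concrete, here is a minimal generative sketch in Python. It is one possible reading of the description above, not the PRISM implementation from [7]: the state names, the probability tables and the function `sample_piece` are all hypothetical toy values, and the sketch emits only pitches, ignoring durations and the new/old flags. At every phase the song state makes a transition; each voice then makes a voice-state transition that depends on the current song state, and emits a pitch.

```python
import random

# Toy parameters, all hypothetical: two song states, two voice states.
SONG_TRANS = {
    "ss1": {"ss1": 0.8, "ss2": 0.2},
    "ss2": {"ss1": 0.3, "ss2": 0.7},
}
# Voice-state transitions, one table per song state.
VOICE_TRANS = {
    "ss1": {"vs1": {"vs1": 0.6, "vs2": 0.4}, "vs2": {"vs1": 0.5, "vs2": 0.5}},
    "ss2": {"vs1": {"vs1": 0.2, "vs2": 0.8}, "vs2": {"vs1": 0.1, "vs2": 0.9}},
}
# Pitch emission, conditioned on (song state, voice state).
EMIT = {
    ("ss1", "vs1"): {60: 0.7, 62: 0.3},
    ("ss1", "vs2"): {62: 0.5, 64: 0.5},
    ("ss2", "vs1"): {64: 0.4, 67: 0.6},
    ("ss2", "vs2"): {60: 0.2, 67: 0.8},
}


def draw(rng, dist):
    """Draw one outcome from a {outcome: probability} table."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights)[0]


def sample_piece(n_phases, n_voices, rng):
    """Sample a list of phases; each phase is a list with one pitch per voice."""
    song_state = "ss1"                      # arbitrary initial states
    voice_states = ["vs1"] * n_voices
    piece = []
    for _ in range(n_phases):
        song_state = draw(rng, SONG_TRANS[song_state])          # global step
        pitches = []
        for v in range(n_voices):
            # local step: the voice HMM belonging to the current song state
            voice_states[v] = draw(rng, VOICE_TRANS[song_state][voice_states[v]])
            pitches.append(draw(rng, EMIT[(song_state, voice_states[v])]))
        piece.append(pitches)
    return piece


piece = sample_piece(n_phases=8, n_voices=2, rng=random.Random(0))
```

Because every voice uses the same tables, the same parameters can be sampled with any number of voices, for example `sample_piece(8, 3, random.Random(1))`.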

Experimental results


Classification.

Pollastri and Simoncelli [3] have used HMMs for classification of melodies by composer. In a similar way, we have used our model to classify fragments of polyphonic music. To keep things simple, we have considered only two composers: Bach and Mozart.

Using the built-in EM algorithm of PRISM, we train two instances of the model, $M_B$ and $M_M$, the former using fragments from works of Bach, the latter using fragments from works of Mozart. To classify a new, unknown fragment, we compute the probability of it being the output of $M_B$ and of $M_M$. The fragment is classified as the work of the composer whose model assigns it the highest probability.
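The decision rule can be sketched in Python as follows. For brevity the sketch replaces the nested model with an ordinary discrete HMM and computes the probability of an observation sequence with the standard forward algorithm; the two toy models and their parameters are made up for illustration and are not the trained $M_B$ and $M_M$.

```python
def forward_likelihood(obs, init, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = {s: init[s] * emit[s].get(obs[0], 0.0) for s in init}
    for o in obs[1:]:
        alpha = {
            s: emit[s].get(o, 0.0) * sum(alpha[p] * trans[p][s] for p in alpha)
            for s in init
        }
    return sum(alpha.values())


def classify(obs, models):
    """Return the name of the model that assigns obs the highest probability."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical toy models: (initial, transition, emission) tables.
uniform = {"h": {"h": 0.5, "l": 0.5}, "l": {"h": 0.5, "l": 0.5}}
models = {
    "M_B": ({"h": 0.5, "l": 0.5}, uniform,
            {"h": {"x": 0.8, "y": 0.2}, "l": {"x": 0.6, "y": 0.4}}),
    "M_M": ({"h": 0.5, "l": 0.5}, uniform,
            {"h": {"x": 0.2, "y": 0.8}, "l": {"x": 0.4, "y": 0.6}}),
}

fragment = ["x", "x", "y"]
label = classify(fragment, models)   # "M_B": 0.147 versus 0.063 for "M_M"
```

For long fragments the raw probabilities become very small, so a practical version would compare log-probabilities instead.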

For training $M_B$, 72 fragments from Bach's Inventions (BWV 772, 773, 775, 779) were used. We used 50 fragments from Mozart's duets for horn (KV 487, numbers 1, 2, 5, 8, 9, 10 and 12) to train $M_M$. All training examples have two voices. We then classified 30 other fragments: 10 fragments with two voices (5 from Bach, BWV 784, and 5 from Mozart, KV 487 number 7) and 20 fragments with three voices (10 from Bach, BWV 798, and 10 from Mozart, KV 229). For all 30 fragments, the classification was correct.

Music generation.

We have also used our model to generate music: we first train the model using one of the training sets, then sample from it using the sample/1 built-in of PRISM. Two typical examples of the resulting output are given in Figure 3, and can be downloaded at [6].

Although the output does not even come close to a human-composed piece, it seems to contain musical style elements of the training examples. It cannot directly be used to generate acceptable full songs, but it might be an interesting source of inspiration for human composers.

We expect that with more refined models (e.g. with more HMM states and with the concept of time signature added to avoid excessive syncopation) and larger sets of training examples, the style of the generated music would become increasingly similar to the style of the original training examples.

Figure 3: Sample of the model trained with fragments of Bach and Mozart
(a) Trained with fragments of Bach
(b) Trained with fragments of Mozart

Conclusion and future work

We have presented a simple yet general model for music. It can handle any number of voices. The probability parameters can be adjusted automatically for any musical genre, given sufficiently many training examples of that genre. Our classification method seems promising given the outcome of the experiments. Larger-scale experiments, involving more composers and larger datasets, are needed to get an accurate idea of the scope of its practical applicability. The main contribution of [7] lies in demonstrating the feasibility of using probabilistic-logical programming as an elegant tool for developing applications in the computer music domain.

Bibliography

1
Yoshua Bengio.
Markovian models for sequential data.
Neural Computing Surveys, 2:129-162, 1999.

2
Darrell Conklin.
Music Generation from Statistical Models.
In Proceedings of the AISB 2003 Symposium on Artificial Intelligence and Creativity in the Arts and Sciences, pages 30-35, Aberystwyth, Wales, 2003.

3
Emanuele Pollastri and Giuliano Simoncelli.
Classification of Melodies by Composer with Hidden Markov Models.
In Proceedings of the First International Conference on WEB Delivering of Music, pages 88-95, Firenze, Italy, 2001.

4
Taisuke Sato and Yoshitaka Kameya.
PRISM: A symbolic-statistical modeling language.
In Proceedings of the 15th International Joint Conference on Artificial Intelligence, pages 1330-1335, Nagoya, Japan, 1997.

5
Taisuke Sato and Yoshitaka Kameya.
Parameter learning of logic programs for symbolic-statistical modeling.
Journal of Artificial Intelligence Research (JAIR), 15:391-454, 2001.

6
Jon Sneyers.
MIDI files of automatically generated music.
Available at http://www.cs.kuleuven.be/~jon/automatic_composition/.

7
Jon Sneyers, Joost Vennekens, and Danny De Schreye.
Probabilistic-logical modeling of music.
In Proceedings of the 8th International Symposium on Practical Aspects of Declarative Languages (PADL'06), pages 60-72, Charleston, South Carolina, January 2006.