Prev Next Up Home Keys Figs Search New

Mercury Type System

Appeared in Volume 9/2, May 1996

Keywords: mercury, typing.

[This news thread appeared in the mercury-users mailing list. Contact Fergus Henderson (fjh@cs.mu.oz.au) for details on joining. My thanks to Mike Lawley for passing it to me. Ed.]


michael.lawley@cit.gu.edu.au
Michael Lawley
15th February 1996

I have a quick question about the type system. What I would like to be able to do is define two types:

:- type foo_a ---> foo(string).
:- type foo_b ---> bar(string).

then define a type which is the union of these, something like:

:- type foo == (foo_a ; foo_b).

This is so I can write code like:

:- pred p_spec(foo_a::in, list(foo)::in) is det.
:- pred p_gen(list(foo)::in) is det.

p_spec(FOO_A, FOOs) :-
    p_gen([FOO_A |FOOs]).

I can't work out how to do this and suspect it isn't possible.


fjh@cs.mu.oz.au
Fergus Henderson
15th February 1996

This feature, undiscriminated unions, has definitely been considered. The main reason why this feature is not part of Mercury is that it can't be implemented, but it turns out that the feature is not really needed anyway.

The obvious ways of implementing undiscriminated unions would cause type checking to take exponential time in the common case, which is not tolerable. Finding an efficient algorithm to infer the types of variables within a clause in a type/mode system that includes parametric polymorphism, that allows subtypes or undiscriminated unions, and that allows more than one mode per predicate, is an as yet unsolved - and I suspect perhaps unsolvable - research problem.

Gert Smolka did his PhD thesis on a type system that combined parametric polymorphism with support for subtypes, and implemented his ideas in a little language called TEL. Unfortunately in the search for a feasible algorithm for type inference of variables within a clause, he was forced to choose one that actually inferred the wrong type in certain situations. If I remember correctly, for a goal such as 'X = [FOO_A|FOOs]', it would infer the type 'list(foo_a)' for X, not 'list(foo)'; as a result, the unification would fail at runtime if FOOs contained any 'foo_b' elements.

Karl Stroetmann and some other people working at Siemens on a system called PAN have recently published work on a new approach to the problem which allows undiscriminated unions and infers types of variables in a reasonable amount of time. However, their approach only allows a single mode per predicate. Disallowing multiple modes in order to allow undiscriminated unions would be throwing the baby out with the bath water.

Anyhow, as I said at the start, it turns out that the feature is not really needed, because it is always possible to use Mercury's discriminated unions instead:

:- type foo ---> a(foo_a) ; b(foo_b).

:- pred p_spec(foo_a::in, list(foo)::in) is det.

p_spec(FOO_A, FOOs) :-
   p_gen([a(FOO_A) | Foos]).

:- pred p_gen(list(foo)::in) is det.

Undiscriminated unions are an essential feature if you are trying to retro-fit a type checker onto existing Prolog code without modifying that code. But if you are willing to modify your code slightly, or if you are writing code from scratch, they are not needed.

One interesting point is that in a Prolog context, people are often unwilling to modify their code in this way because it would make it less efficient. However, because Mercury optimizes the representation of each data type (which is overall a big win), the trade-offs are different in Mercury, and indeed the code has about the same efficiency as your original code would have if Mercury supported subtypes.


michael.lawley@cit.gu.edu.au
Michael Lawley
15th February 1996

Thanks for the detailed reply Fergus. I still have a couple of questions though. Since, in my example, foo_a and foo_b have different functors (foo and bar resp.), I'm not clear why an undiscriminated union results. Is it related to the fact that structurally equivalent types are treated as being different, and/or is it the interaction of modes and types?

Fergus writes:
:- type foo --->a(foo_a) ; b(foo_b).
:- pred p_spec(foo_a::in, list(foo)::in) is det.
p_spec(FOO_A, FOOs) :-
p_gen([a(FOO_A) | Foos]).
:- pred p_gen(list(foo)::in) is det.

An alternative solution is the following:

:- type foo ---> foo(string) ; bar(string).
:- pred p_spec(foo_a::in, list(foo)::in) is det.

p_spec(foo(FOO_A), FOOs) :- 
    p_gen([foo(FOO_A) | FOOs]).

I find this much nicer since explicitly deconstructing the foo_a type, then reconstructing a foo type avoids having to wrap all the occurrences of foos with a/1, b/1.


fjh@cs.mu.oz.au
Fergus Henderson
15th February 1996

Michael Lawley wrote:
Since, in my example, foo_a and foo_b have different functors (foo and bar resp.), I'm not clear why an undiscriminated union results.

I guess this is just a matter of terminology. Undiscriminated unions aren't the real problem - what causes the problem for type inference is the presence of subtyping. If your declaration:

:- type foo == foo_a ; foo_b.

were allowed, then for each unification such as 'X = bar(_)', the type of 'X' may be either 'foo_b' or 'foo'; inferring which of these it actually is turns out to be expensive.

Lawley continues:
Is it related to the fact that structurally equivalent types are treated as being different, and/or is it the interaction of modes and types?

The difficulty is partly related to the fact that Mercury specializes the representation for each type. For example, the functors 'foo/1' of type 'foo_a' and 'bar/1' of type 'foo_b' have different types, so they may both be represented by the tag zero. Conversion of these functors from type 'foo_a' or 'foo_b' to the hypothetical union type 'foo' must necessarily involve a representation change, because otherwise both functors have the same representation. This means that when the Mercury compiler generates code for 'X = bar(_)', it would need to know the type of 'X' so that it can construct the appropriate representation of the functor.

But the difficulty is mainly related to the interaction between modes and types. If you allow subtypes, then mode checking and type checking become intertwined - you need a "directed" type system. This is because types are covariant on input arguments but contravariant on output arguments. Suppose 'number' is defined as the union of 'int' and 'float'. For input arguments, passing an int where a number is expected is fine, but passing a number (e.g. a float) where an int is expected is not OK. For output arguments, receiving a number (e.g. a float) where an int is expected is not OK, but receiving an int where a number is expected is fine.

Lawley continues:
:- type foo ---> foo(string) ; bar(string).
:- pred p_spec(foo_a::in, list(foo)::in) is det.
p_spec(foo(FOO_A), FOOs) :- p_gen([foo(FOO_A) | FOOs]).


I find this much nicer since explicitly deconstructing the foo_a type, then reconstructing a foo type avoids having to wrap all the occurrences of foos with a/1, b/1.

Yes, this uses Mercury's overloading of functors to achieve a similar effect to subtyping. One thing to be careful of with this approach is that in the worst case, overloading of functors can lead to the same performance problem during type checking as subtypes would cause. If you do write code for which the compiler has trouble resolving ambiguous overloading, it will issue a warning:

foo.m:123: In clause for predicate 'foo/2':
foo.m:123: warning: highly ambiguous overloading.
foo.m:123: This may cause type-checking to be very slow.
foo.m:123: It may also make your code difficult to understand.

However, I have yet to hear of anyone actually encountering this warning in real code ;-)

A little-known fact is that Mercury's mode system is in fact expressive enough that you can use the mode system as a directed type system! So another approach is the following:

:- type foo ---> foo(string) ; bar(string).
:- inst foo = bound( foo(ground) ; bar(ground) ).
:- inst foo_a = bound( foo(ground) ).
:- inst foo_b = bound( bar(ground) ).

:- pred p_spec(foo::in(foo_a), list(foo)::in) is det.

p_spec(FOO_A, FOOs) :- p_gen([FOO_A | FOOs]).

Here the type of the first argument of 'p_spec' is just 'foo', not 'foo_a', so you wouldn't get a type error if you pass say 'bar("")'. Instead, the subtype restriction is stated in the mode, 'in(foo_a)', so you would get a mode error.

Prev Next Up Home Keys Figs Search New