**
Extensions to Haskell type system
**

A reader should have realized by now that the current Haskell type system is not the final design. It is a very fine and sophisticated system, but it still has some shortcomings. As Haskell evolves from the stable Haskell 98 specification to (what is a current working name) Haskell 2, more and more features will be added and refined to overcome those shortcomings.

There are already several interesting features, which has been implemented in particular implementations of Haskell compilers and interpreters. For example, HBC compiler has had the existential types for several years. The idea has been so fruitful and useful that other compilers, such as Hugs and GHC, implement it now too -- as an extension to standard Haskell 98 specification. It seems that this concept will be accepted in specification of Haskell 2, and hence we will document it here as well.

*
Warning: The original notation has been adopted to match the Haskell
style. But be aware that current language implementations
do not use exists a notation for existential quantification,
but rather rely on few clever tricks to achieve similar results
shown below. They will be discussed later, but for now exists
will be used for clarity.
*

Type specifications for variables of a universally quantified type have the following form, for any type expression T a:

p :: forall a. T a (e.g. id :: forall a. (a -> a))By analogy with universal quantification, we can try to give meaning to existentially quantified types. In general, for any type expression T a,

p :: exists a. T ahas the property "For some type

`a`

, `p`

has the type
`T a`

".
For example,
(3,4):: exists a. (a, a) (3,4):: exists a. awhere

`a = Int`

in the first case, and
`a = (Int,Int)`

in the second.
[..cut]

Every value has type `exists a. a`

because for every value there exists
a type such that that value has that type.
Thus the type `exists a. a`

denotes the set of all values, which we
shall sometime call `Top`

(the biggest type):

data Top = exists a. a -- the type of any value whatsoeverThe set of all ordered pairs may be denoted by the following existential type.

exists a. exists b. (a, b) -- the type of any pair whatsoeverThis is the type of any pair

`(p,q)`

because, for some
type `a`

(take a type
of `p`

) and some type `b`

(take a type of
`q`

), `(p,q)`

has type `(a,b)`

.
The type of any object together with an integer-valued operation that can be applied to it may be denoted by the following existential type.

exists a. (a, a -> Int)The pair

`(3,succ)`

has this type, if we take
`a = Int`

. Similarly the pair
`([1,2,3],length)`

has this type, if we take
`a = [Int]`

.
Because the set of types includes not only simple types but also
universal types and the type `Top`

, existentially quantified types
have some properties that may at first appear counterintuitive.
The type

exists a. (a, a)is not simply the type of pairs of equal type (e.g.

`(3,4)`

), as one might
expect. In fact even `(3,True)`

has this type. We know
that both `3`

and `True`

have type `Top`

; hence there is a type
`a = Top`

such that
(3, True) :: (a, a).Therefore,

exists a. (a, a)is the type of all pairs whatsoever, and is the same as

exists a. exists b. (a, b).Similarly, any function whatsoever has type

exists a. a -> a,if we take

`a = Top`

.
However,
exists a. (a, a -> Int)forces a relation between the type of an object and the type of an associated integer-valued function. For example,

`(3,length)`

does
not have this type (if we consider `3`

as having type
`Top`

,
then we would have to show that `length`

has type
`Top -> Int`

, but we
only know that
length :: forall a. [a] -> Intmaps any lists to integers, and we cannot assume that any arbitrary object of type

`Top`

will be mapped to integer).
Not all existential types turn out to be useful. For example, if we
have an (unknown) object of type `exists a. a`

,
we have absolutely no way of manipulating it (except passing it around)
because we have no information about it. If we have an (unknown) object
of type `exists a. (a, a)`

, we can assume that it is
a pair and apply `fst`

and `snd`

to it, but
then we are stuck because we have no information
about `a`

.

Existentially typed objects can be useful, however, if they are sufficiently structured. For example,

x :: exists a. (a, a -> Int)provides sufficient structure to allow us to compute with it. We can execute:

(snd x) (fst x)and obtain an integer.

Hence, there are useful existential types which hide some of the structure of the objects they represent but show enough structure to allow manipulations of the objects through operations the objects themselves provide.

These existential types can be used, for example, in forming apparently heterogeneous lists:

[(3,succ), ([1,2,3],length)] :: [exists a. (a, a -> Int)]We can later extract an element of this list and manipulate it, although we may not know which particular element we are using and what its exact type is. Of course, we can also form totally heterogeneous lists of type

`[exists a. a]`

, but these
are quite unusable.
Source: [CARDELLI + WEGNER]

The real usefulness of existential types becomes apparent only when we realize that

exists a. (a, a -> Int)is a simple example of an abstract type packaged with its set of operations. The variable

`a`

is the abstract type itself, which hides
a representation. The representation was `Int`

and
`[Int]`

in the
previous examples. Then
`(a, a ->Int)`

is the set of operators on that abstract type: a constant of type
`a`

and an operator of type `a -> Int`

. These operators are
unnamed, but
we can have a named version by using record types instead of Cartesian
products:
x:: exists a. {const: a, op: a -> Int} x.op(x.const)As we do not know what the representation

`a`

really is (we only know
that there is one), we cannot make assumptions about it, and users
of `x`

will be unable to take advantage of any particular implementation
of `a`

.
An ordinary object `(3,succ)`

may be converted to an
abstract object having type
```
exists a. (a, a->Int)
```

by packaging it so that some of its structure is hidden.

Source: [CARDELLI + WEGNER]

*
Cardelli & Wegner proceed here with the explanation of
the details of packaging. Particularly, ADA packages are
subjected to analysis. We are stopping right here -- delaying this
discussion to the sections describing Haskell implementation
of these ideas.
*

Following other researchers (including [CARDELLI + WEGNER] quoted in our introduction to the topic of existential types), [LAUFER] suggested one possible solution to extending the Haskell type system with the concept of existential quantification. The idea has been first implemented in HBI interpreter, and now existential types are also present (as extention to Haskell 98) in GHC compiler and Hugs interpreter.

Note, however, that - although HBI reserves special syntax for
annotation of existential types (where existential variables are
preceded by "?") - Hugs and GHC implementations do not explicitely
use the construct `exists`

, but instead require
the explicit form of `forall`

quantification. On the
surface, this might seem quite counter-intuitive, until one realizes
that `exists`

and `forall`

quantifiers are
related via De Morgan's rules. The current
experimental implementations of Hugs and GHC existentiality provide,
in effect, isomorphism between universally and existentially
quantified data structures. But I cannot provide any simple
description of the Laufer's concept. Perhaps someone else can.

Whether it was solely chosen for the accomodation
of existential types or for some other reasons, the Hugs and GHC
extensions to Haskell 98 drop the restriction that only type variables
that are bound as formal type parameters may appear in the
component types of algebraic data type definition; that is,
only those that appear on both sides of 'data' declaration
in Haskell. But both extensions still require
the explicit universal quantification `forall`

(as a hint
to compilers?). Too bad - it confuses the issue, I think.

As a consequence, any free type variables in a data type declaration is considered local to and existentially quantified in the component types of the data constructor in which it appears.

Additionally, just as universally quantified type variables can
be constrained by a context `context`

, existentially
quantified type variables can be constrained by local
contexts `c,c',c''...`

for the data constructors
in which they appear. Thus the general form
of a data type declaration (in extensions to Haskell 98) is as follows:

data [context=>] T u1..uk = [c=>]K t1..tk | [c'=>] K' t1'...tk' |....As a simple example of the above, consider the following algebraic data type declaration:

data KEY = forall a. MakeKey a (a -> Int)Notice that the type variable

`a`

does not appear
in the data type itself, which is just plain KEY. The left side
does not contain a context `context`

, nor it depends
on any other type variable. (This is just the simplest example).
But the right hand side depends on the free, universally quantified
variable `a`

. This supposes to signal to us that
this variable is, in fact, existentially quantified.
Whatever the confusion due to lack of sugaring, we can see that in order to make the above definition consistent, the data constructor MakeKey must have the type:

MakeKey :: a -> (a -> Int) -> KEYto produce just plain KEY. The interpretation is such that there are

`a`

and `a ->Int`

that, when acted upon by the data constructor `MakeKey`

will produce a result of the abstract type `KEY`

.
This brings us very close to the
topic of existential types, outlined previously. According
to Cardelli & Wegner the truly existential constructor, acting
on the useful pair of variable `a`

and a function
`a -> Int`

would have to be in a form:
MakeKey :: (exists a . (a, a -> Int)) -> KEYThe two forms are (supposedly) isomorphic; compilers are happy, stylists -- less, but what counts its the usefulness of this approach, which will be demonstrated by the examples that follow.

Source: [LAUFER]

*
A word about a name of this section: a truly heterogeneous
list is not possible. What is usually meant by this is
a list of heterogeneous elements - each wrapped in a common
type before being put in such a list.
*

Consider the following data declaration

data KEY = forall a. MakeKey a (a -> Int)It introduces a new (parameterless) type KEY and a value constructor

`MakeKey`

of type
MakeKey :: a -> (a -> Int) -> KEYSince all applications of MakeKey have the same result type, KEY, we can construct the following list:

hetList = [MakeKey 5 id, MakeKey [1,2,3] length, MakeKey True (\x -> if x then 1 else 0), MakeKey 9 (+1)]Although

`hetList`

is a homogeneous list of type [KEY],
each of its elements have a different representation type a.
We use pattern matching to extract the two components of a value
of type KEY. The type variable `a`

in the component type
of the constructor MakeKey is existentially quantified. Thus the
argument type of `f`

and the type of `x`

in
the following function definition are identical,
giving the function the type `KEY -> Int`

:

whatkey :: KEY -> Int whatkey (MakeKey x f) = f xWe observe that KEY is an abstract data type whose implementations explicitely bundle together a value of some type and a method on that type returning an

`Int`

. Each element of `hetList`

may be viewed as a different implementation of the same abstract
data type. Two implementations of KEY differ either in representation
types, for example the second and third elements, or in methods,
such as the first and last elements of `hetList`

.
Given a value of type KEY, we do not know its representation type, but it is guaranteed that we can safely apply the second component (the method) to the first component (the value).

Compare simplicity of this approach with the traditional approach that does not use existential types.

A datatype whose definition involves existentially quantified variables cannot use the standard Haskell mechanisms for deriving instances of standard classes like Eq and Show. If instances of these classes are required, then they must be provided explicitly by the programmer. For example, you could use Hugs (in -98 mode) to evaluate the following (where 'hetList' was defined in previous example):

length hetList ==> 3 :t hetList ==> [KEY] map whatkey hetList ==> [5, 3, 1, 10]but you could not display:

hetList ==> Hugs will complain until you provide show method for each of four cases.It is possible, however, to attach type class constraints to existentially quantified variables in a datatype definition. For example, we could improve on our 'data' definition from the previous example by constraining the existential type variable

`a`

to instances of the type class `Show`

.
This would mean that all of the operations of the class `Show`

,
are available when a value `a`

of this type
is unpacked during pattern matching. This can be put
to good use to define a simple instance of type class `Show`

for the KEY datatype, describing the way we want to "show" our abstract
KEYs. The entire module, handling this example is
shown below:
module C where data KEY = forall a. Show a => MakeKey a (a->Int) instance Show KEY where showsPrec _ (MakeKey x f) = showString "(" . showsPrec 0 x . showString ", " . showsPrec 0 x . showString " ==> " . showsPrec 0 (f x) . showString ")" -- our fancy way of showing the KEYs hetList = [MakeKey 5 id, MakeKey [1,2,3] length, MakeKey True (\x -> if x then 1 else 0), MakeKey 9 (+1)] whatkey :: KEY -> Int whatkey (MakeKey x f) = f x -- end of C moduleNow when you simply enter

`hetList`

, Hugs will respond with:
[(5, 5 ==> 5),([1,2,3], [1,2,3] ==> 3),(True, True ==> 1),(9, 9 ==> 10)]

**
Traditional heterogeneous lists
**

Let us start with the valid data declaration of a form,
which packs together a variable, and a function operating
on that variable and yielding an integer `Int`

.

data AnyKey a = MakeAny a (a -> Int)Looks promissing, but this would not let us put

`MakeAny 5 id`

and
`MakeAny [1,2,3] length`

on the same list since
they have in fact different types: `AnyKey Int`

and `AnyKey [Int]`

,
respectively. Values of different types cannot be put on the
same list.
The traditional approach to dealing with heterogeneous lists in Haskell is to introduce an algebraic datatype with a separate constructor for each type allowed in the list, as in:

data KEY = MakeIdKey Int (Int->Int) | MakeListKey [Int] ([Int]->Int) | MakeBoolKey Bool (Bool->Int) | MakeSuccKey Int (Int->Int)This is valid, since every option here evaluates to the same data type

`KEY`

.
Having this decleared, we can now create a list of values - all
wrapped in a common type ` KEY `

.

hetList = [MakeIdKey 5 id, MakeListKey [1,2,3] length, MakeBoolKey True (\x -> if x then 1 else 0), MakeSuccKey 9 (+1)]

Simple enough! A more delicate question is to examine an individual component of this list to find out what key does it hold. This can be only done by pattern matching, which cannot be generalized even though the right-hand sides look identical: all you need is to apply the second element (a function) to the first (a variable).

whatkey :: KEY -> Int whatkey (MakeIdKey x f) = f x whatkey (MakeListKey x f) = f x whatkey (MakeBoolKey x f) = f x whatkey (MakeSuccKey x f) = f xThere are several disadvantages to this approach: First, one has to remember and use a number of different constructors. Second, the component types of the algebraic data type are not abstract; consequently, any operation can be applied to the components as long as it is type correct. Finally and most importantly, algebraic types of this form do not support extensions; when a new case is added to an algebraic type, every function that operates on the type has to be changed to include new case.

Compare it with the existential approach to this problem.

To hide some information from a client you can use the traditional Haskell way of restricting imports/exports from Haskell modules, or you can use existential types instead. You can also combine the two - as shown in attached examples.

An example of combined information hiding, which uses both: module mechanism and existential types, is shown below. The module D imports four functions from module C, but not much information is passed here from C.

module D where import C (questions, answers, secret1, secret2) -- End of module DBut we may try to examine it a bit:

:type secret1 ==> secret1 :: Secret Aha, some unknown object! :type secret2 ==> secret2 :: Secret A similar object :type questions ==> questions :: Secret -> ([String], [String]) Takes secret object and provides a pair of two lists of strings. :type answers ==> answers :: Secret -> ([Bool], [String]) Takes secret object and provides a pair with list of booleans and list of strings :info Secret ==> Uknown reference `Secret' Ouch!But the operations on the secret objects look safe. Let's try:

questions secret1 ==> (["has size?","has mass?"],["name?"]) answers secret1 ==> ([False, True], ["Guess!"]) It has no size, it has some mass, but it won't tell us its name. questions secret2 ==> (["has size?"],[]) answers secret2 ==> ([True], []) It has some size, but you even cannot ask about its mass and name.That's all we can do here. Not much!

**
Existential information hiding
**

But module C exports the datatype Secret. So let us import it to module D:

module D where import C (Secret(S), questions, answers, secret1, secret2)Now we can inquire about the type "Secret":

:info SecretHugs will respond with:

-- type constructor data Secret -- constructors: S :: Show a => a -> [a -> (String,Bool)] -> [a -> (String,String)] -> Secret -- instances: instance Show SecretNow the secret is being partially revealed. It looks like data

`Secret`

is of existential type depending on some type variable
`a`

. It provides unknown object `x`

and two lists of functions [f] and [g], operating on the same
object `x`

, but returning different answers:
f :: a->(String, Bool) g :: a->(String, String)We know the pattern and we are provided with constructor

`S`

,
so we may try to create our own dummy little secret:
our_secret = S 5 [f1,f2] [g1] where f1 x = ("has property 1?", True) f2 x = ("has property 2?", False) g1 x = ("name?", "dummy")Now we can even manipulate our own secret. It appears that module C already provided us with convenient ways of disclosing the secret:

questions our_secret ==> (["has property 1?", "has property 2?"], ["name?"]) answers our_secret ==> ([True, False], ["dummy"])But still, we cannot extract our original value

`5`

.
And both secrets imported from the module C are still the mysteries
to us! We were just passed two pieces of data with a set of
operations, we could even write our specialized functions
to replace `questions`

and `answers`

,
we were given some hints by attached textual descriptions,
but the secrets still hold.

So what are the secrets of module C, which could not be broken by module D? Here is the source code of C:

-- We start with list of exports: module C (Secret(S), questions, answers, secret1, secret2) where -- Here is our existential data type data Secret = forall a. (Show a) => S a [a->(String,Bool)] [a->(String,String)] -- We will make our secret an instance of Show class -- just in case users wish to display the secrets. -- We could be nicer than given below, of course! instance Show Secret where show (S x fs gs) = show "Secret" -- Here are our two little secrets secret1 = S Electron [hasSize, hasMass] [name] secret2 = S Ball [hasSize] [] -- Any ordinary object can be converted to an abstract type Secret -- by packaging it in a standard way. We could use it instead -- the above, as in: -- secret1 = pack Electron -- secret2 = pack Ball pack :: (Show a, Object a) => a -> Secret pack x = S x [hasSize, hasMass] [name] -- Here are standard ways to reveal our questions and -- answers. We will export them for user convenience. questions :: Secret -> ([String],[String]) questions (S x fs gs) = (map fst (apply x fs), map fst (apply x gs)) answers :: Secret -> ([Bool],[String]) answers (S x fs gs) = (map snd (apply x fs), map snd (apply x gs)) -- This is an auxiliary function. No need to export it. apply :: a -> [a->b] -> [b] apply x [] = [] apply x (f:fs) = (f x) : (apply x fs) -- We define two secret objects: Electron and Ball -- and for our conveniennce make them the instances -- of a class Object. This way both of our objects -- have the same interface. data Electron = Electron deriving Show data Ball = Ball deriving Show class Object a where hasSize :: a->(String,Bool) hasMass :: a->(String,Bool) hasColor :: a->(String,Bool) name :: a->(String,String) instance Object Electron where hasSize x = ("has size?", False) hasMass x = ("has mass?", True) hasColor x = ("has color?",False) name x = ("name?", "Guess!") instance Object Ball where hasSize x = ("has size?",True) hasMass x = ("has mass?",True) hasColor x = ("has color?", True) name x = ("name?", "Ball")

The concept of overlapping comes from set theory and can be a bit confusing - as demonstrated by the following puzzle:

We are given 87 tibbs. All 34 gibbs and 49 pibbs are tibbs. If exactly 9 tibbs are gibbs and pibbs, then how many tibbs are neither pibbs nor gibbs?

This problem is about overlapping sets. The words "tibbs" and "gibbs" and "pibbs" are nonsense words made up for the problem and used because they are cute. We'll cal them T, G and P.

We are talking about Ts. Everything is a T. Think of T, G, and P as properties or attributes of these things, so things are allowed to be more than one. Perhaps T means things and G means good and P means purple.

So we start with 87 things. They are all Things. 34 of them are Good and 49 of them are Purple. But that doesn't make 83 all together, because there's some overlap. Some of them are both Good AND Purple. In fact, the next sentence tells you that the overlap is exactly 9 things that are both Good and Purple. So we have 25 that are Good but not Purple, 40 that are Purple but not Good, and 9 that are both Good and Purple. That accounts for 25+40+9=74.

And, since we started with 87 Things all together, how many does that leave un-accounted for? 13! These are the Things that are neither Good nor Purple.

Source: Message from Math Forum

Class constraints define a set of admissible instances of a given
class. Taking as an example a simplified declaration of class
`Num`

and of its instances `Int`

and
`Float`

:

class Num a where (==), (/=) :: a -> a -> Bool (+), (-), (*) :: a -> a -> a negate :: a -> a ... instance Num Int where x == y = integer_eq x y x + y = integer_add x y ... instance Num Float where ...we can see here several class constraints, which say:

Num a -- any type a is admissible to class Num -- (if it supports following functions ...) Num Int -- Int is admitted as an instance of Num class Num Float -- Float is also admitted as an instance of Num classHaskell 98 does not allow for overlapping instances; that is, instances whose class constraints overlap. Here

`Num Int`

and `Num Float`

do not overlap: integers and floats
are distinct types.
Some compilers, such as GHC or Hugs, introduce extensions to Haskell 98 standard and allow for some restricted form of overlapping.

As it says in the Hugs manual:

Overlapping instances are allowed, providing that one of the instances in each overlapping pair is strictly more specific than the other.

To understand the above we need to introduce several definitions.

**
Substitution:
**
A mapping from type variable to type expression

**
Overlapping constraints:
**
If `pi`

and `pi'`

are two class constraints,
then we say that `pi`

and `pi'`

are overlapping
if ` S(pi) = S'(pi') `

for
some substitutions `S`

and `S'`

.

For example, `C Int`

and `C [a]`

do not overlap.
But `C (a,Int)`

and `C (Bool, a)`

do overlap:
if you substitute `C (a,Int)`

by `C (Bool,Int)`

and `C (Bool, a)`

by `C (Bool, Int)`

then both substitutions
lead to identical results.

**
More specific class constraint:
**
Class constraint `pi`

is more specific than `pi'`

if `S(pi) = pi'`

for some substitution `S`

.
Let us write this as `pi <= pi' `

. For example:

C (Bool, Int) <= C (a, Int) <= C (a, b) <= C a.

`pi`

to be strictly more specific than
`pi'`

, written `pi < pi'`

,
we require that ` pi <= pi' `

, but that
` not (pi' <= pi) `

. For example, all
the inequalities above are strict:
C (Bool, Int) < C (a, Int) < C (a, b) < C a.For an example of class constraints that satisfy

`pi <= pi'`

but where `pi`

is not strictly more specific that
`pi'`

, note that:
C (a,b) <= C (c,d) <= C (a,b).Thus a program containing three following instance declarations:

class C a where .. instance C (a, b) where .. instance C (a, Int) where .. instance C (Bool, Int) where ..is acceptable since all class constraints are strictly ordered:

C (Bool, Int) < C (a, Int) < C (a, b) < C a.The strict inequality gives us a simple rule for choosing which instance to use in any given situation without ambiguity: use whichever is the closest match.

A pair of overlapping instances in which neither is more specific than
the other leaves an **ambiguity**: for example, given two
instances `C (a,Int)`

and `C (Bool,a)`

, which
should we choose to deal with an instance of the
form `C (Bool,Int)`

? Either alternative could apply,
with potentially different results depending on the definitions
in each instance.

One way to allow overlapping pairs like this **without ambiguity**,
would be to require that the programmer provide a further instance
declaration to cover precisely the overlapping cases. For the time
being, Hugs/GHC implementations take the simpler step of banning such
overlaps altogether.

Source:
Message from Mark P Jones to Haskell list
*
*

*
Excerpt from Hugs 98 manual follows:
*

The command line option +o can be used to enable support for overlapping instance declarations, provided that one of each overlapping pair is strictly more specific than the other.

This facility has been introduced in a way that does not compromise the coherence of the type system. However, its semantics differs slightly from the semantics of overlapping instances in Gofer, so users may sometimes be surprised with the results. This is why we have decided to allow this feature to be turned on or off by a command line option (the default is off). If practical experience with overlapping instances is positive then we may change the current default, or even remove the option.