My implementation of the Hindley-Milner algorithm is heavily based on
*Cardelli*'s paper "Basic Polymorphic Typechecking"; if you find
that the code structure resembles the code in his Appendix, don't be
surprised. Having found a bug in that code, I feel reasonably confident
that I understand the algorithm well enough to reproduce it,
at least to a first order.

The implementation consists of a Perl program that performs type
inference on Abstract Syntax Trees. I felt that implementing a parser
for the sample language was unnecessary, seeing as it's a task
orthogonal to type inference. The language processed by the program is
a subset of the one defined in the *Cardelli* paper. I only
support single-variable `let`, and
`letrec` is an explicit statement. Also, the conditional
construct is not part of the language, but is instead defined via
the `cond` function. My belief is that the language has
equivalent expressiveness. The following grammar is used:

    Exp ::= Identifier
          | Exp Exp                           [function application]
          | fun Identifier Exp
          | let Identifier = Exp in Exp
          | letrec Identifier = Exp in Exp
          | ( Exp )

The main function, `tryexp`, takes an AST and prints either
the inferred type of the expression, or an error.

The ast package has constructors for AST expressions. It has one
constructor for each of the productions in the above grammar, with the
exception of the "(" one, since it doesn't correspond to an AST node.
It also has a `print` routine which recursively prints out a
fully parenthesized expression corresponding to the AST.
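
To make the shape of the AST concrete, here is a minimal sketch in Python (not the Perl of the actual program). The class names `Ident`, `Apply`, `Lambda`, `Let`, `Letrec` and the function `show` are hypothetical stand-ins for the ast package's constructors and `print` routine; only the one-constructor-per-production layout and the fully parenthesized output come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Ident:          # Exp ::= Identifier
    name: str


@dataclass
class Apply:          # Exp ::= Exp Exp
    fn: object
    arg: object


@dataclass
class Lambda:         # Exp ::= fun Identifier Exp
    param: str
    body: object


@dataclass
class Let:            # Exp ::= let Identifier = Exp in Exp
    name: str
    defn: object
    body: object


@dataclass
class Letrec:         # Exp ::= letrec Identifier = Exp in Exp
    name: str
    defn: object
    body: object


def show(e):
    """Return a fully parenthesized rendering of an AST (hypothetical helper)."""
    if isinstance(e, Ident):
        return e.name
    if isinstance(e, Apply):
        return f"({show(e.fn)} {show(e.arg)})"
    if isinstance(e, Lambda):
        return f"(fun {e.param} {show(e.body)})"
    if isinstance(e, (Let, Letrec)):
        keyword = "letrec" if isinstance(e, Letrec) else "let"
        return f"({keyword} {e.name} = {show(e.defn)} in {show(e.body)})"
    raise ValueError(f"not an AST node: {e!r}")
```

For instance, `show(Let("id", Lambda("x", Ident("x")), Apply(Ident("id"), Ident("id"))))` yields `(let id = (fun x x) in (id id))`. There is no node for the "(" production because grouping is already captured by the tree structure.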

A second package has constructors for types. The basic type classes are variables and operators (composite types); the latter take zero or more types as arguments. In general, the type system uses the following types:

    type ::= Identifier                       [type variable]
           | Operator type ...
           | Operator

Commonly defined operators are "->" and "X" (for pair). The atomic
types, such as `int` and `bool`, are represented as
operators with no arguments (a la *Cardelli*).

The `new_var` procedure defines a new type variable. Due to
Perl's "magical" ++ operator, there is little worry of ever running out
of names.
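
As an illustration of why names never run out, the following Python sketch imitates Perl's string increment for lowercase names (`a`, `b`, ..., `z`, `aa`, `ab`, ...). The helper `increment_name`, the starting name `a`, and the bare-bones `TypeVar` class are assumptions made for this sketch; the Perl program simply applies `++` to a string.

```python
class TypeVar:
    """A type variable with a printable name (minimal stand-in)."""

    def __init__(self, name):
        self.name = name
        self.instance = None


def increment_name(name):
    """Imitate Perl's string ++ for all-lowercase names."""
    chars = list(name)
    i = len(chars) - 1
    while i >= 0:
        if chars[i] != "z":
            chars[i] = chr(ord(chars[i]) + 1)
            return "".join(chars)
        chars[i] = "a"      # wrap this position and carry to the left
        i -= 1
    return "a" + "".join(chars)  # every position wrapped: grow by one letter


_next_name = "a"  # assumed starting point for this sketch


def new_var():
    """Create a type variable with the next unused name."""
    global _next_name
    var = TypeVar(_next_name)
    _next_name = increment_name(_next_name)
    return var
```

Because the name simply grows by a letter whenever every position wraps around, the supply of fresh names is limited only by memory.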

A type variable may have an "instance" field, which points to the
type that the variable has been bound to.
The `print` procedure recursively prints a fully parenthesized
type expression, following the instance pointers to get to the best
currently known type.
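
The interplay between instance pointers and printing can be sketched as follows, again in Python rather than Perl. The names `TypeVar`, `TypeOp`, `prune`, and `show_type` are hypothetical, and the infix layout for two-argument operators is an assumption about what "fully parenthesized" looks like.

```python
class TypeVar:
    def __init__(self, name):
        self.name = name
        self.instance = None   # set once the variable is bound to a type


class TypeOp:
    def __init__(self, name, *args):
        self.name = name       # e.g. "->", "X", "int", "bool"
        self.args = list(args) # zero arguments for atomic types


def prune(t):
    """Follow instance pointers to the best currently known type."""
    while isinstance(t, TypeVar) and t.instance is not None:
        t = t.instance
    return t


def show_type(t):
    """Render a type fully parenthesized, looking through bound variables."""
    t = prune(t)
    if isinstance(t, TypeVar):
        return t.name
    if not t.args:
        return t.name
    if len(t.args) == 2:       # assumed infix layout for binary operators
        left, right = (show_type(a) for a in t.args)
        return f"({left} {t.name} {right})"
    return "(" + " ".join([t.name] + [show_type(a) for a in t.args]) + ")"
```

With `a` and `b` unbound, the type `TypeOp("->", a, TypeOp("X", a, b))` prints as `(a -> (a X b))`; after setting `a.instance = TypeOp("int")` the same object prints as `(int -> (int X b))`, which is the sense in which printing reports the best currently known type.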

The type-checking functions mirror those in Cardelli's paper, so I won't
reiterate the comments describing each function. `analyze` and `unify`
are perhaps the most interesting part of this work.
`unify` has a "fix" relative to the *Cardelli* version: a
generic variable won't be unified with a non-generic variable (instead,
the reverse unification is performed). This allows us to "taint" the
generic variable, and partially implement the rule that:

In unifying a non-generic variable to a term, all the type variables contained in that term become non-generic.

It is my belief that this rule isn't fully implemented even in my code, but the "fix" enforces it for all the examples that I've tried (which was not the case in Cardelli's code).
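
The following Python sketch shows one possible reading of the "fix"; it is not a transcription of the Perl code. It assumes that `unify` is handed the list of non-generic types (here `non_generic`) and that a variable counts as generic when it occurs in none of them. All names, including `InferenceError`, are hypothetical.

```python
class InferenceError(Exception):
    pass


class TypeVar:
    def __init__(self, name):
        self.name = name
        self.instance = None


class TypeOp:
    def __init__(self, name, *args):
        self.name = name
        self.args = list(args)


def prune(t):
    while isinstance(t, TypeVar) and t.instance is not None:
        t = t.instance
    return t


def occurs_in(var, t):
    """True if the unbound variable var appears anywhere inside type t."""
    t = prune(t)
    if t is var:
        return True
    return isinstance(t, TypeOp) and any(occurs_in(var, a) for a in t.args)


def is_generic(var, non_generic):
    """Assumed test: generic means not reachable from any non-generic type."""
    return not any(occurs_in(var, t) for t in non_generic)


def unify(t1, t2, non_generic):
    t1, t2 = prune(t1), prune(t2)
    if isinstance(t1, TypeVar):
        if t1 is t2:
            return
        # Reading of the "fix": never bind a generic variable to a
        # non-generic one; bind the non-generic variable instead, so the
        # generic variable becomes reachable from it ("tainted").
        if (isinstance(t2, TypeVar)
                and is_generic(t1, non_generic)
                and not is_generic(t2, non_generic)):
            t1, t2 = t2, t1
        if occurs_in(t1, t2):
            raise InferenceError("recursive unification")
        t1.instance = t2
    elif isinstance(t2, TypeVar):
        unify(t2, t1, non_generic)
    else:
        if t1.name != t2.name or len(t1.args) != len(t2.args):
            raise InferenceError(f"type mismatch: {t1.name} vs {t2.name}")
        for a, b in zip(t1.args, t2.args):
            unify(a, b, non_generic)
```

In this sketch, unifying a generic variable `g` with a non-generic variable `n` sets `n.instance = g` rather than the other way around, after which `is_generic(g, [n])` is false. Like the program described above, this only handles the variable-to-variable case and makes no claim to enforce the rule fully.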

For convenience, the `gettype` function "knows" about integer literals and
assigns them all the type `int`.
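
A minimal Python sketch of that convenience is shown below. It assumes the environment is a dictionary from names to types and that a literal is any identifier made up of decimal digits; it leaves out the step of making fresh copies of generic variables that a full lookup would need. The names `INT`, `InferenceError`, and the dictionary layout are hypothetical.

```python
import re


class InferenceError(Exception):
    pass


class TypeOp:
    def __init__(self, name, *args):
        self.name = name
        self.args = list(args)


INT = TypeOp("int")  # atomic type: an operator with no arguments


def gettype(name, env):
    """Look up an identifier; digit-only names are treated as int literals."""
    if name in env:
        return env[name]
    if re.fullmatch(r"[0-9]+", name):
        return INT
    raise InferenceError(f"undefined symbol: {name}")
```

With this, `gettype("42", {})` returns the `int` type without any environment entry, while an unknown name such as `foo` raises an error.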

The main body of code sets up the environment with some basic
functions. It then constructs several examples and calls
`tryexp` on them. The reader is encouraged to construct their own
examples to verify the behavior of the algorithm.

... is available here

A fun addition to this algorithm would be to implement polymorphic references using the construct of imperative and applicative type variables. Another important task is to ensure that the non-generic variable rule (above) is actually fully enforced by this program (my current belief is that it still breaks sometimes). Implementing a parser, especially one with some syntactic sugar added, would simplify testing quite a bit.

Other than the algorithm, an interesting piece of knowledge gained
from my investigation is the dual view of type inference as either
solving a system of type constraint equations **or** proving
theorems using derivations defined by the inference system. This
connection makes the way
the PCC system uses a type system to represent and check proofs seem more
natural.
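
As a small worked illustration of the two views, consider function application. The notation here is the standard textbook one and is an assumption of this note, not taken from the program. In the derivation view, typing $e_1\,e_2$ is a proof step licensed by the rule

$$\frac{\Gamma \vdash e_1 : \tau' \to \tau \qquad \Gamma \vdash e_2 : \tau'}{\Gamma \vdash e_1\,e_2 : \tau}$$

In the constraint view, the same step introduces a fresh type variable $\alpha$ for the result and records the equation

$$\tau_{e_1} = \tau_{e_2} \to \alpha$$

where $\tau_{e_1}$ and $\tau_{e_2}$ are the types found for the two subexpressions. Solving the accumulated equations by unification corresponds to finding an instantiation of the variables under which every rule application in the derivation is valid.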

Nikita Borisov