talideon.com

All that you have lived before, you will live again

March 18, 2008 at 1:27PM Messing with Standard ML and Moscow ML, part one: The core language

I was playing with Moscow ML because I’ve wanted to give Standard ML a bash for a while now, but I could never get SML/NJ to play nice for me back when I tried it first on Windows. That, and there was no documentation download.

So why Moscow ML and not SML/NJ, or Mlton, or one of the other proper Standard ML compilers? Well, even though Moscow ML is getting on a bit and uses bytecode compilation rather than native compilation, it’s simple to use, is well-documented, and, unlike Mlton, doesn’t require FreeBSD 7.0 and much as I’d like to, I haven’t upgraded yet.

It also helps that I’d already fetched it down ages ago, so the source was already in my distfiles folder. [smile]

It was dead easy to get something compiled and running. Here’s the “Hello, World” program:

(* helloworld.sml *)
val _ =
  print "Hello, world!\n";

(* and *) mark the beginning and end of comment blocks. They can be nested, so...

(* This (* is completely legal and *) makes commenting out code easy. *)

I made a really dumb mistake the first time I tried this. I’d been playing with the mosml console and forgot that in actual programs the results of expressions need to be assigned somewhere. So, my first program was like this:

(* brokenhello.sml *)
print "Hello, world!\n";

Which got me this:

% mosmlc -standalone -o hello brokenhello.sml
File "brokenhello.sml", line 2, characters 0-5:
! print "Hello, world!";
! ^^^^^
! Syntax error.

When I saw this, I stared at the screen trying to remember what I’d missed. I felt like a right dolt when I remembered.

I then rewrote the program to use the TextIO module:

(* helloworld-mod.sml *)
val _ =
  TextIO.output (TextIO.stdOut, "Hello, world!\n");

This showed that I could access modules just fine.

You’re probably wondering what val ??? = means. That’s declaration that the value of such-and-such a variable (given by ???) has the value of the expression that follows. In this case, the wildcard variable _ is being bound to the result of evaluating the “Hello, World” program.

Next I tried something a little more substantial: factorials.

(* fac1.sml *)
fun fac n = if n = 0 then 1 else n * fac (n - 1);

val _ = print (Int.toString (fac 5) ^ "\n");

Which wrote out 120, just as I’d expected. As you might guess, Standard ML’s if-expressions are just like a more readable version of C’s trinary operator.

A quick word on functions. Standard ML functions really only take one argument. To be able to take more than one, you need to either pass everything in an n-tuple or use curried function, though naturally because tuples are just another kind of value, you can mix and match both methods. The TextIO.output function above is an example of using a tuple to supply multiple arguments.

A curried functions is one that use individual one-argument functions to consume each argument. Curried functions are useful in that they allow one to partially apply functions and apply them in interesting ways. For instance, what if we wrapped TextIO.output as follows:

fun sayToStream str s = TextIO.output (str, s);

Here we’ve a two-argument curried function called sayToStream, that take a stream, str, and a string to output, s. Evaluating this function in the mosml REPL says the function has this type:

val sayToStream = fn : outstream -> string -> unit

The arrow (->) can be thought of as meaning ‘evaluates to’, so sayToStream is a function that takes an outstream and evaluates to a function that takes a string, which evaluates to unit, Standard ML’s rough equivalent of void in C-derived lanuages, but here, unit equates to a proper value rather than some notional one. You see, our above declaration of sayToStream is just a shorter, less noisy, way of saying:

val sayToStream = fn str => fn s => TextIO.output (str, s);

Where fn arg => expr defines a lambda function evaluating to expr. fn is pronounced ‘lambda’, I’m told, which would make sense if they’d used something that at least looked vaguely like a lambda such as a backslash like Haskell uses, but there you go.

We can partially evaluate sayToStream to create a function that writes its argument to standard output. Here’s what we’d get typing at the mosml REPL (- is the prompt, by the way, and > precedes the result of the preceding expression):

- val sayToStdOut = sayToStream TextIO.stdOut;
> val sayToStdOut = fn : string -> unit

So now we have sayToStdOut, a function that does just what we wanted. It’s pretty much equivalent to the toplevel print function.

Where this becomes really useful is when you want to pass the partially applied function to, say, a mapping function, or if you want to compose it with other functions. Here’s an example of the former, writing out the contents of a string array:

- List.map (sayToStream TextIO.stdOut) ["Each ", "word ", "is ", "an ", "element.\n"];
Each word is an element.
> val it = [(), (), (), (), ()] : unit list

An here’s an example of the latter, where we compose a function called shout and map it to the same array:

- val shout = (sayToStream TextIO.stdOut) o (String.map Char.toUpper);
> val shout = fn : string -> unit
- shout "hello\n";
HELLO
> val it = () : unit
- List.map shout ["Each ", "word ", "is ", "an ", "element.\n"];
EACH WORD IS AN ELEMENT.
> val it = [(), (), (), (), ()] : unit list

o is the functional composition operator and glues two functions together. The operand order is backwards when compared with how the same operator works in maths. In maths, h = g o f would mean h(x) = f(g(x)), but in Standard ML, h = g o f means h x = g (f x), which is completely arseways. Oh, well...

Another example of the nifty things curried functions allow you to do would be O’Caml’s typesafe Printf.printf function. It takes a formatting string and returns functions that accept arguments of the correct types for each placeholder in the formatting string. The consequence of this is that the kind of exploits printf() and company are used for in C aren’t possible in O’Caml. It’s quite possible to do the same thing in Standard ML.

Keep in mind that functions are value too. That’s the reason I’ve used brackets where I have. I’ve used them where there’s an expression I want to evaluate before passing its result as an argument. fac n - 1 means something quite different from fac (n - 1).

Back to factorials. Now to try the same function, but this time using pattern matching instead:

(* fac2.sml *)
fun fac 0 = 1
  | fac n = n * fac (n - 1);

val _ = print (Int.toString (fac 5) ^ "\n");

Again, this gave me the same answer as my original fac function: 120.

Pattern matching is pretty useful. It can simplify code quite a bit by separating out the various cases of a function. Rather than having lots of conditional logic, we just make statements about what the results of evaluating the function are under different circumstances.

The pattern matching syntax above is shorthand for the following:

(* fac3.sml *)
fun fac n =
  case n of
    0 => 1
  | n => n * fac (n - 1);

Next up, I tried writing a function to join the elements of an array into a string. The function takes a function to convert each element to a string, a string to use a an element separator, and finally the list to join.

To avoid having to write any special purpose code, I decided to write a helper function that would take an extra parameter that would be prefixed onto the stringified list element. When the helper calls itself to cope with the list tail, it would then pass the separator argument as both the prefix and separator argument. When we’re initially calling the helper function, an empty string is passed in the prefix argument.

(* join1.sml *)
fun helper _ _ _ [] = ""
  | helper toString pre sep (h::t) = pre ^ (toString h) ^ (helper toString sep sep t);

fun join toString sep lst = helper toString "" sep lst;

A quick note on this: notice (h::t) in the helper function’s argument list. The :: operator is the list construction operator and in patterns can be used to decompose a list into a head element and a trailing list. As I haven’t mentioned it yet, ^ is the string concatenation operator.

Let’s try executing it in the mosml REPL:

- load "Int";
> val it = () : unit
- join Int.toString ", " [1, 2, 3, 4, 5];
> val it = "1, 2, 3, 4, 5" : string

We don’t really want the outside world to know about our helper functions. One way to hide them is to use a local pvtdecls in decls end block. This is particularly useful if the helpers are used by a number of different functions.

(* join2.sml *)
local
  fun helper _ _ _ [] = ""
    | helper toString pre sep (h::t) = pre ^ (toString h) ^ (helper toString sep sep t)
in
  fun join toString sep lst = helper toString "" sep lst
end;

Now only join can see helper. However, as join is the only function that needs to about helper, we can declare it within join using a let decls in expr end block:

(* join3.sml *)
fun join toString sep lst =
let
  fun helper _ _ _ [] = ""
    | helper toString pre sep (h::t) = pre ^ (toString h) ^ (helper toString sep sep t)
in
  helper toString "" sep lst
end;

Because functions declared within other functions are contained within the scope of their parent function, we don’t need to pass these values in, meaning we can simplify our helper function down like so:

(* join4.sml *)
fun join toString sep lst =
let
  fun helper _ [] = ""
    | helper pre (h::t) = pre ^ (toString h) ^ (helper sep t)
in
  helper "" lst
end;

I could’ve approached the join function differently. Rather than using a helper function an prefixing the separator on, I could’ve treated the separator as a suffix and omitted appending the suffix in the case of an one-element list:

(* join5.sml *)
fun join _ _ [] = ""
  | join toString _ [h] = toString h
  | join toString sep (h::t) = (toString h) ^ sep ^ (join toString sep t);

But then we wouldn’t have learned as much. [smile]

[h], by the way, is a pattern that matches a single element list.

That’s enough for now. I’ll talk a bit about the type system later and about records, exceptions, references, the imperative side of the language, the module system, and all of that when I get the notion. However, if you’ve understood everything so far, you understand a fair bit of the core language.

In the meantime, you might want to read Mads Tofte’s Tips for Computer Scientists on Standard ML (PDF), which is quite readable and easy to understand, or Stephen Gilmore’s Programming in Standard ML 97: A Tutorial Introduction, which I found more difficult to follow in places, but covers everything in much more detail.

Technorati Search Technorati Search Irish Bloggers

Comments

1 On March 18, 2008 at 14:26, Revence wrote:

Woo! :-) Another MLer! Me, I drap my SML/NJ along with me, everywhere. Everywhere, I said. I have never needed to turn things into raw binaries, but at least, then, MLton would play nice with SML/Nj’s stuff, even he extensions. Me, I use SML/NJ as a playground. And it is a rich one.

Here is a tip: Get rlwrap, so that you can get the readline bindings to work in SML/NJ. (Stuff can get angering when you, like me, compose an ML poem only to realise there in a char missing, and you have to twist to get it back. With rlwrap, you just act like SML/NJ understands readline.) I use that for OCaml, too.

Then, bind your shell’s sml alias to rlwrap sml.

Also, I was supposed to write the thingy for SML/NJ to be coloured - yep, on the terminal - by ACOC. This stuff of OCaml using ncurses to point out mistakes on the command line, it spoilt me. :-)

Wow. So much written and I didn’t even write a single “but in Haskell ...” I’ve grown up so much! :-)

2 On March 18, 2008 at 14:29, Revence wrote:

Using it as an excuse to correct my Site URL, and to wonder, just wonder, why you are doing SML. Nostalgia, academic, heck-of-it-ic, or work? Me, it’s academic. :) Data structures and the like.

3 On March 18, 2008 at 15:23, Keith wrote:

I’m messing with Standard ML pretty much because I want to. There’s no particular reason for it. I learned O’Caml ages back, but I could never get into writing Standard ML, but I thought I’d give it a try over the weekend. Simple as that really.

4 On March 18, 2008 at 15:30, Keith wrote:

Oh, and unless something’s changed in French since I learned it, you’re not able to drop the “que” in “La plus belle langage de programmation que je connais’. I’m also pretty sure that sentence should start with “C’est”.

5 On March 19, 2008 at 5:20, Revence wrote:

/me panics to fix the grammatical error in his weak French ...

I’m a bit sucky like that with the French, yet I want to write it sometimes. Even though that was meant to be the colloquial kind, I’m not good enough at it to pull that register off well.

You know, I don’t know when I switched from OCaml to SML, but I don’t think I’m going back. I toyed with SML back then, and ten went to OCaml. Then I came back. :o)

(And, damn it, did I ever tell you your website is pretty much the nicest design I’ve seen among tech blogs? When will people learn the value of such minimalist content-centric design?)

Post a comment

All form information is optional, but it’s a good idea to fill in your name and email address if you want me to take your comment seriously.

Spammers, don’t bother posting crap down here. The site is set up so that legitimate search engines (Google, for instance) won’t index pages with comments on them. Posting crud here only means you’re wasting my time and patience. Shoo!

Real names, please. Please include!
Won’t be displayed. Please include!
Displayed, if present.