Chance Coble

Functional Programming in Austin, Texas

First Class Functions:2 Multiprogramming

Posted by Chance Coble on December 22, 2007

This is the final post of the series. Part zero was a simple illustration of the flexibility of first class functions. In part one, the iter, map and fold combinators were used to manipulate lists (and implicitly arrays and seq type objects). That was fun to go over, but not nearly as fun as this topic will be.

Over the past ten years, one may notice that the pace of processor clock speed improvements has diminished. The introduction of multi core technology is a promising innovations. I am not a computer engineer, so my thoughts don’t wander to the technical details that had to be surmounted in order to get multiple cores in a single processing unit. I am a programmer, so they wander to the complexity of concurrent computing. A complexity which, with this multicore technology, has now been placed at every programmers feet.

Simon Peyton-Jones puts it quite well in this way [I am paraphrasing]. Consider a queue, the simple structure undergraduates can create. Creating a queue with enqueue and dequeue operations is a basic undergraduate assignment. However, create a concurrent queue (in which the head and tail have separate locking mechanisms - because locking the whole thing would be inefficient) and you have a scientifically publishable result. The sequential implementation is as simple as problems come, and the concurrent version is as difficult as problems come. For the time being we will continue to use these methods, but there is no doubt that more will need to be invented. The reliability of software is sure to hit a wall (in fact I would argue it already has), if the efficiency is to be improved without simpler methods.

There are a lot of approaches being developed to solve these problems. Research today is on fire with new techniques to attack this complexity. But this is a basics post, and many of the solutions get into some advanced uses of functional languages. Also, each language tends to come with its own bag of tricks for this dilemma. Fortunately, there is a very easy way to get access to threading processes in F# using the .NET platform, and I thought a little example code would be good for people just cutting their teeth on functional programming.

Consequently, the examples below will not attack the whole world of concurrency today. In fact the scope will be quite small from the standpoint of concurrency. What I want to highlight here is the continued benefit of using higher order functions, and the simplicity they can bring to complex problems. The example below will split an array into chunks of work and then apply map and fold operators, specifically designed to be parallel, to do this work.

Today’s project starts with a couple of helper functions. One function, run, takes a function and executes it on a new thread. Another function unzip will take an array and split it up in to a given number of smaller arrays. The definitions are below.


let run f =
    let thread  = new System.Threading.Thread(new System.Threading.ThreadStart(f)) in
    thread.Start();
    thread

// Every `num` element of a list gets added to the `num` array
let unZip (arr:'a array) num =
    let vals = Array.create num [] in
    Array.iteri (fun i x -> vals.[i%num] <- vals.[i % num] @ [i]) arr;
    vals;

With these, the parallel map and parallel fold functions can be implemented. Each of these takes an array, unzips it to a given number and then executes the given function on a thread with our run function.


// Using n threads, execute the function f on every element of arr thus creating a new array
// where every element x is mapped to f x
let pMap n f (arr:'a array) =
    let vals = unZip arr n in
    let b = Array.create arr.Length arr.[0]  in
    let f' x () = List.iter (fun i -> b.[i] <- f arr.[i]) x in
    let threads = Array.map (fun x -> run (f' x)) vals in
    threads |> Array.iter (fun t -> t.Join()) ;
    b

// Using n threads, compose (f start) on to every element leading to n compositions
// thus pFold n f start arr = [ (..(((f start) arr.[0]) arr.[n-1]) ... arr[m-1]); ..;..;]
let pFold n f start arr =
    let vals = unZip arr n in
    let b = Array.zero_create n in
    let f' i x () = b.[i] <- List.fold_left (fun acc k -> f acc arr.[k]) start x in
    let threads = Array.mapi (fun i x -> run (f' i x)) vals in
    threads |> Array.iter (fun t -> t.Join());
    b

The nice thing about this is that these functions can be customized to more specific implementations. Here is something similar to what I would typically do in an API that might be used by others.


type MyComputer = class
       static member Fold f start arr = pFold System.Environment.ProcessorCount f start arr
       static member Map f arr = pMap System.Environment.ProcessorCount f arr
     end

The type MyComputer now encapsulates map and fold methods which are specific to the environment on which they are being run. The number of processors is provided as the number of threads. As a simple example I will create a program which takes an array of web addresses and counts each instance of each character on the web pages. One can intuitively grasp this as something very well suited to a combination of map and fold operations.First a dictionary is setup which can be used by multiple threads. We will use the lock function to ensure that addition of a character’s count, and the addition of a new character are processed atomically.


open System.IO
open System.Net
open System.Collections.Generic;  

let dictionary = new Dictionary<char,int>()

let addDictionaryEntry x n =
    lock dictionary
      (fun () ->
         if dictionary.ContainsKey(x) then
             dictionary.[x] <- dictionary.[x] + n
         else dictionary.Add(x,n))
    dictionary

Below are two very simple helper functions to retrieve a web page’s text and another adds a count to each character in a string using the dictionary above.


let getPage (x:string) =
                      use stream = WebRequest.Create(x).GetResponse().GetResponseStream()
                      use reader = new StreamReader(stream)
                      reader.ReadToEnd() 

let sumChars d (s:string) =
     let carr = s.ToCharArray()
     Array.iter (fun x -> addDictionaryEntry x 1 |> ignore) carr
     d

Creating the actual program is a snap now. I will first define the single threaded version.


let pageMap = Array.map getPage
let pageChars (xs:string array) = Array.fold_left sumChars dictionary xs
let runSingle pages = pageMap pages |> pageChars

This is for comparison, both in timing and in the similarity of the code. Now the parallel version.


let pPageMap = MyComputer.Map getPage
let pPageChars (xs:string array) = MyComputer.Fold sumChars dictionary xs
let runParallel pages = pPageMap pages |> pPageChar

That alone illustrates the power of these techniques quite sharply. The code is stunningly similar. I will create an array four web pages called webs


let webs = [|"http://www.cnn.com/index.html";
             "http://www.npr.org/index.html";
             "http://www.amazon.com/gp/homepage.html";
             "http://www.acm.org/index.html"|]

[Note: You can time operations in the interactive session by typing "#time" which toggles the setting to print the time an operation took :End Note] My workstation has a dual core setup, so the number of processors the environment returns is 2. Running the single threaded version with this (runSingle webs) yields a time of roughly 7.5 seconds. Running the parallel operation yields a time of roughly 5.5 seconds. Of course, the time was not cut down by as much as possible because this is not simply a CPU processing problem. First the web page request is I/O bound, and then the character counting is CPU bound (although the second is more difficult to see a benefit in concurrent processing because it is so fast). My machine attacks this with two threads, which is ideal for CPU bound operations. But more threads would be better for I/O bound operations. Setting this up with our map and fold methods running on more threads is very simple.


let customParallel (pages:string array) =
                     (pMap pages.Length getPage pages) |>
                     (pFold pages.Length sumChars dictionary)

Again we have very similar code, except this time the length of the array is used to specify the number of threads. Running this on my workstation (customParallel webs) yields the operation in 3.8 seconds.Of course, there are a number of things this simple post still does not address. This does not help us with Simon Peyton-Jones’s queue problem. It also does not free us from locking in general. What I really want to highlight here is that I was able to quite simply compose programs such that only part of the program had to deal with locking and concurrency. It was a low level detail that did not affect how I would organize larger solutions. I wrote a separate program to deal with concurrency, which I then pass programs to in order to compute them in parallel.For more on higher order functions and the functional style of programming in F# I would recommend both the Foundations and the Expert F# books by Apress. Jon Harrop’s F# Journal includes articles which are both thorough and clear, making them valuable to a wide audience. For more on concurrency in F# [the topic is rich] the Expert F# book has a very nice chapter dedicated to it. I believe that both Don Syme’s blog as well as Robert Pickering’s (foundations author) blog are great for examples [in fact Robert Pickering just finished a great series wholly related to concurrency].I also will recommend a couple of papers for the more motivated reader related to concurrency and functional programming during the holiday break. First, if you haven’t read it the MapReduce article from google is a lot of fun. Though they implemented it in C++ they designed it from functional principles.The unzip method used here is a distant reminder of a technique from a paper I read back in college. The paper was the first real groundbreaking moment for me that functional programming isn’t just a part of a new language, it is a new way to think about programming.Complete code listing:

#light

open System.IO
open System.Net
open System.Collections.Generic;  

let run f =
    let thread  = new System.Threading.Thread(new System.Threading.ThreadStart(f)) in
    thread.Start();
    thread

// Every `num` element of a list gets added to the `num` array
let unZip (arr:'a array) num =
    let vals = Array.create num [] in
    Array.iteri (fun i x -> vals.[i%num] <- vals.[i % num] @ [i]) arr;
    vals;

// Using n threads, execute the function f on every element of arr thus creating a new array
// where every element x is mapped to f x
let pMap n f (arr:'a array) =
    let vals = unZip arr n in
    let b = Array.create arr.Length arr.[0]  in
    let f' x () = List.iter (fun i -> b.[i] <- f arr.[i]) x in
    let threads = Array.map (fun x -> run (f' x)) vals in
    threads |> Array.iter (fun t -> t.Join()) ;
    b

// Using n threads, compose (f start) on to every element leading to n compositions
// thus pFold n f start arr = [ (..(((f start) arr.[0]) arr.[n-1]) ... arr[m-1]); ..;..;]
let pFold n f start arr =
    let vals = unZip arr n in
    let b = Array.zero_create n in
    let f' i x () = b.[i] <- List.fold_left (fun acc k -> f acc arr.[k]) start x in
    let threads = Array.mapi (fun i x -> run (f' i x)) vals in
    threads |> Array.iter (fun t -> t.Join());
    b

type MyComputer = class
       static member Fold f start arr = pFold System.Environment.ProcessorCount f start arr
       static member Map f arr = pMap System.Environment.ProcessorCount f arr
     end

let dictionary = new Dictionary&lt;char,int&gt;()

let addDictionaryEntry x n =
    lock dictionary
      (fun () -&gt;
         if dictionary.ContainsKey(x) then
             dictionary.[x] &lt;- dictionary.[x] + n
         else dictionary.Add(x,n))
    dictionary     

let webs = [|"http://www.cnn.com/index.html";
             "http://www.npr.org/index.html";
             "http://www.amazon.com/gp/homepage.html";
             "http://www.acm.org/index.html"|]

let getPage (x:string) =
                      use stream = WebRequest.Create(x).GetResponse().GetResponseStream()
                      use reader = new StreamReader(stream)
                      reader.ReadToEnd() 

let sumChars d (s:string) =
     let carr = s.ToCharArray()
     Array.iter (fun x -> addDictionaryEntry x 1 |> ignore) carr
     d

let pageMap = Array.map getPage
let pageChars (xs:string array) = Array.fold_left sumChars dictionary xs
let runSingle pages = pageMap pages |> pageChars

let pPageMap = MyComputer.Map getPage
let pPageChars (xs:string array) = MyComputer.Fold sumChars dictionary xs
let runParallel pages = pPageMap pages |> pPageChars

let reallyParallel (pages:string array) = (pMap pages.Length getPage pages) |&gt; pFold pages.Length sumChars dictionary

5 Responses to “First Class Functions:2 Multiprogramming”

  1. Mostafa Elhemali Says:

    I’m going through your code today (excellent material for an F# newbie) and I think unZip has a minor bug:
    Array.iteri (fun i x -> vals.[i%num] vals.[i%num] <- vals.[i % num] @ [x]) arr;

  2. Random Reading over Christmas Week « Tales from a Trading Desk Says:

    [...] Chance Coble on F# Multiprogramming [...]

  3. Formal Analysis of Unzip « Chance Coble Says:

    [...] by Chance Coble on January 6, 2008 On a previous post I wrote a function unzip which was applied to an array and an integer and returned a list of [...]

  4. New F# Developer Says:

    Your example works when the function performing the mapping returns the same type as the input parameter type (in this case a string). Can you please give an example when the mapping function returns a different type?

  5. medshiemn Says:

    Thanks for the post

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>