Concurrent and Parallel Haskell

Concurrent and Parallel Haskell are Glasgow extensions to Haskell which let you structure your program as a group of independent `threads'.

Concurrent and Parallel Haskell have very different purposes.

Concurrent Haskell is for applications which have an inherent structure of interacting, concurrent tasks (i.e. `threads'). Threads in such programs may be required. For example, if a concurrent thread has been spawned to handle a mouse click, it isn't optional—the user wants something done!

A Concurrent Haskell program implies multiple `threads' running within a single Unix process on a single processor.

You will find at least one paper about Concurrent Haskell hanging off of Simon Peyton Jones's Web page.

Parallel Haskell, or as we call it GpH (Glasgow parallel Haskell), is about speed—spawning threads onto multiple processors so that your program will run faster. The `threads' are always advisory—if the runtime system thinks it can get the job done more quickly by sequential execution, then fine.

A Parallel Haskell program implies multiple processes running on multiple processors, under a PVM (Parallel Virtual Machine) framework. So you have to have PVM installed on your parallel machine or on your cluster. Luckily most recent Linux distribution already include PVM. So in the common case of running GpH on a so-called “Beowulf cluster”, a bunch of Linux-based PCs wired together via some reasonably fast network dedicated to parallel processing, you don't have to have anything special. An MPI interface is under development but not fully functional, yet.

Over the last years Parallel Haskell has matured from a prototype to a system that can handle parallel applications such as Lolita, a 47000 line Haskell program for natural language processing. However, it is still mainly a research system and there are many interesting language and system issues to investigate (any many cool PhD theses to be written!). However, at the moment (December 2000) we are also starting to look into industrial applications and how we can use GpH to make the lives easier for the poor programming slaves outside the academic circles. So expect new exciting developments real soon, provided our projects are funded, of course.

Check the GpH Page for more information on “GPH” (Haskell98 with extensions for parallel execution), the latest version of “GUM” (the runtime system to enable parallel executions) and papers on research issues. A list of publications about GPH and about GUM is also available from Simon's Web Page.

Some details about Parallel Haskell follow. Some useful information on parallel programming in Parallel Haskell can also be found in the GranSim User's Guide and in our on-line paper on Evaluation Strategies. For more information about concurrent Haskell, see .

Features specific to Parallel Haskell

The Parallel interface (recommended)

GpH provides two functions for controlling parallel execution, through the Parallel interface:

interface Parallel where
infixr 0 `par`
infixr 1 `seq`

par :: a -> b -> b
seq :: a -> b -> b

The expression (x `par` y) sparks the evaluation of x (to weak head normal form) and returns y. Sparks are queued for execution in FIFO order, but are not executed immediately. At the next heap allocation, the currently executing thread will yield control to the scheduler, and the scheduler will start a new thread (until reaching the active thread limit) for each spark which has not already been evaluated to WHNF.

The expression (x `seq` y) evaluates x to weak head normal form and then returns y. The seq primitive can be used to force evaluation of an expression beyond WHNF, or to impose a desired execution sequence for the evaluation of an expression.

For example, consider the following parallel version of our old nemesis, nfib:

import Parallel

nfib :: Int -> Int
nfib n | n <= 1 = 1
       | otherwise = par n1 (seq n2 (n1 + n2 + 1))
                     where n1 = nfib (n-1)
                           n2 = nfib (n-2)

For values of n greater than 1, we use par to spark a thread to evaluate nfib (n-1), and then we use seq to force the parent thread to evaluate nfib (n-2) before going on to add together these two subexpressions. In this divide-and-conquer approach, we only spark a new thread for one branch of the computation (leaving the parent to evaluate the other branch). Also, we must use seq to ensure that the parent will evaluate n2 before n1 in the expression (n1 + n2 + 1). It is not sufficient to reorder the expression as (n2 + n1 + 1), because the compiler may not generate code to evaluate the addends from left to right.

Underlying functions and primitives

The functions par and seq are wired into GHC, and unfold into uses of the par# and seq# primitives, respectively. If you'd like to see this with your very own eyes, just run GHC with the -ddump-simpl option. (Anything for a good time…)

Scheduling policy for concurrent threads

Runnable threads are scheduled in round-robin fashion. Context switches are signalled by the generation of new sparks or by the expiry of a virtual timer (the timer interval is configurable with the -C[<num>] RTS option). However, a context switch doesn't really happen until the current heap block is full. You can't get any faster context switching than this.

When a context switch occurs, pending sparks which have not already been reduced to weak head normal form are turned into new threads. However, there is a limit to the number of active threads (runnable or blocked) which are allowed at any given time. This limit can be adjusted with the -t<num> RTS option (the default is 32). Once the thread limit is reached, any remaining sparks are deferred until some of the currently active threads are completed.

Scheduling policy for parallel threads

In GUM we use an unfair scheduler, which means that a thread continues to perform graph reduction until it blocks on a closure under evaluation, on a remote closure or until the thread finishes. See the section called RTS options for Concurrent/Parallel Haskell in Chapter 3 for a discussion of GHC's runtime-system options to control its parallel behaviour.