The groups at Glasgow and Durham will continue to use evaluation strategies to write large parallel programs, and we hope to encourage others to use them too. To date we have demonstrated only modest wall-clock speedups for real programs, although this is partly due to the limited machine resources available to us. Several of the parallel functional implementations outlined in Section * achieve rather larger speedups. We would like to port the GUM runtime system underlying GpH to a larger machine, with a view to obtaining larger speedups. Another plausible target for GpH programs in the near future is modestly parallel workstations, with, say, 8 processors. Interestingly, it has required remarkably little effort to gain acceptable parallelism even for large, irregular programs like Lolita.
Initial performance measurements show that strategic code is as efficient as code with ad hoc parallelism and forcing functions, but more measurements are needed to confirm that this is true in general.
A framework for reasoning about strategic functions is under development. Proving that two strategic functions are equivalent entails not only proving that they compute the same value, but also that they have the same evaluation degree and the same parallelism/sequencing behaviour. The evaluation degree of a strategic function can be determined by adding laws for par and seq to existing strictness-analysis machinery, e.g. Hughes and Wadler's projection-based analysis [Wadler and Hughes, 1987]. Being operational aspects, parallelism and sequencing are harder to reason about. At present we have a set of laws (e.g. both par and seq are idempotent), but we are uncertain of the best framework for proving them. One possible starting point is to use partially ordered multisets to provide a theoretical basis for defining evaluation order [Hudak and Anderson, 1987].

Some support for evaluation strategies could be incorporated into the language. If the compiler were able to derive rnf automatically from a type definition, the work involved in parallelising a large application would be dramatically reduced, and the replication of libraries could be avoided. Some form of tagging of closures in the runtime system could reduce the execution overhead of strategies: a data structure need not be traversed by a strategy if its evaluation degree is already at least as great as the strategy's.
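To illustrate what such compiler support might produce, the following sketch shows the rnf instance a compiler could derive mechanically for a user-defined tree type. The Strategy/Done encoding and rwhnf follow the formulation of strategies used in this paper; the Tree type and its hand-written instance are illustrative stand-ins for generated code:

```haskell
-- The strategy encoding used in the paper:
type Done = ()
type Strategy a = a -> Done

-- Reduce a value to weak head normal form.
rwhnf :: Strategy a
rwhnf x = x `seq` ()

-- Overloaded reduction to full normal form.
class NFData a where
  rnf :: Strategy a
  rnf = rwhnf          -- default: WHNF suffices for flat types

instance NFData Int
instance NFData Char

-- A user-defined type ...
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- ... and the instance a compiler could derive mechanically:
-- apply rnf to every component, sequencing left to right.
instance NFData a => NFData (Tree a) where
  rnf Leaf         = ()
  rnf (Node l x r) = rnf l `seq` rnf x `seq` rnf r
```

The derivation rule is uniform: one equation per constructor, forcing each field with the rnf of its type, so generating it from the type definition is entirely routine.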
We would like to investigate strategies for strict parallel languages. Many strict functional languages provide a mechanism for postponing evaluation, e.g. delay and force functions. The question is whether the cost of introducing explicit laziness outweighs the benefits gained by using strategies.
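The delay/force interface found in strict languages can be sketched (here in Haskell syntax, purely for illustration) by representing a suspended computation as a function from unit. Real implementations also memoise the result so a suspension is run at most once; that bookkeeping, omitted here, is part of the cost in question:

```haskell
-- A suspended computation: not evaluated until forced.
-- (Illustrative model of a strict language's primitives, not GpH code.)
newtype Delayed a = Delayed (() -> a)

-- Suspend a computation without running it.
delay :: (() -> a) -> Delayed a
delay = Delayed

-- Run a suspended computation to obtain its value.
force :: Delayed a -> a
force (Delayed thunk) = thunk ()
```

A strategy over such a language would then manipulate Delayed values, deciding when and where each suspension is forced, at the price of an extra closure per postponed expression.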
Our long-term goal is to support more implicit parallelism, and strategies provide a useful step towards it. We are learning a great deal by explicitly controlling dynamic behaviour, and hope to learn enough to generate automatically strategies with good dynamic behaviour for a large class of programs. One promising approach is to use strictness analysis to indicate when it is safe to evaluate an expression in parallel, and granularity analysis to indicate when it is worthwhile. It may be possible to use a combined implicit/explicit approach, i.e. most of a program may be adequately parallelised by a compiler, but the programmer may have to parallelise a small number of crucial components.