Title: Colocation of Potential Parallelism in a Distributed Adaptive Run-time System for Parallel Haskell
14:15 – 15:15, Wednesday 31 October
This talk presents a novel variant of work stealing for load balancing in a distributed graph reducer
that executes a semi-explicit parallel dialect of Haskell. The key concept of this load balancer is to colocate
related sparks (potential parallelism): spark selection decisions use maximum prefix matching on an
encoding of each spark’s ancestry within the computation tree, which is recorded at run time.
We evaluate the performance and scalability of spark colocation on five parallel divide-and-conquer
benchmarks on a Beowulf-class cluster of multi-core machines using up to 256 cores. We achieve
a speedup increase of up to 45.81% for three out of five applications due to improved
load balance throughout the execution, as demonstrated by profiling data. Overall, spark colocation
results in a reduced mean time to fetch the required data and in a higher degree of parallelism of finer
granularity, which is most beneficial at higher numbers of processing elements (PEs).
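
To make the idea of maximum prefix matching concrete, the following is a minimal sketch in Python and not the run-time system's actual implementation. It assumes that a spark's ancestry is encoded as the sequence of branch indices on the path from the root of the computation tree to the spark; this encoding, and the names Ancestry, common_prefix_len and select_spark, are hypothetical and chosen for illustration only.

```python
from typing import Optional, Sequence, Tuple

# Assumed encoding (hypothetical): the path from the root of the computation
# tree to the spark, one branch index per level.
Ancestry = Tuple[int, ...]


def common_prefix_len(a: Ancestry, b: Ancestry) -> int:
    """Return the length of the longest common prefix of two ancestry paths."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def select_spark(reference: Ancestry, pool: Sequence[Ancestry]) -> Optional[Ancestry]:
    """Pick the spark in the pool whose ancestry shares the longest prefix
    with the reference ancestry; return None if the pool is empty.
    On ties, the first such spark in the pool is returned."""
    if not pool:
        return None
    return max(pool, key=lambda s: common_prefix_len(reference, s))


# Example: the reference work sits at path (0, 1, 1) in the tree.
pool = [(1, 0), (0, 1, 0), (0, 0, 1)]
chosen = select_spark((0, 1, 1), pool)
print(chosen)
```

In this example the candidate (0, 1, 0) shares a prefix of length 2 with the reference, more than any other candidate, so it is the one selected. Sparks with a long common prefix are close relatives in the computation tree, which is why they are likely to need the same data.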