Taming parallelism in Haskell/GHC

125 Views Asked by At

A question about making parallelism work effectively, asked by a Haskell newbie.

The Advent of Code Day 14 challenge involves creating MD5 hashes of a sequence of integers, looking for the first n integers that give hashes fulfilling certain properties. I do this essentially by creating the hashes then filtering them.

I thought that this would be a good thing to try with parallelism, using several cores to generate the hashes.

The non-parallel version of hash creation looks like this:

md5sequenceS :: [String]
md5sequenceS = [makeMd5 i | i <- [0..]]
    where makeMd5 i = stretch $ getHash (salt ++ show i)
          stretch h0 = foldr (\_ h -> getHash h) h0 [1..2016]

...and it works fine, albeit slowly, giving an answer in about four minutes.

The parallel version looks like this:

md5sequenceS :: [String]
md5sequenceS = parMap rdeepseq (makeMd5) [0..]
    where makeMd5 i = stretch $ getHash (salt ++ show i)
          stretch h0 = foldr (\_ h -> getHash h) h0 [1..2016]

...the same as before apart from the parMap rdeepseq bit. This doesn't work fine: it consumes all the available memory on my machine, and still fails to produce an answer after 30 minutes of wall time. It does, however, use all the processors fully.

What should I do to tame this out-of-control parallelism?

(The problem spec doesn't give any clue for how many hashes I need to generate, but it turns out I need about 30,000 integers hashed.)

Edit to include accepted answer

The parBuffer strategy can be used as

md5sequenceS = withStrategy (parBuffer 100 rdeepseq) $ map (makeMd5) [0..]
    where makeMd5 i = stretch $ getHash (salt ++ show i)
          stretch h0 = foldr (\_ h -> getHash h) h0 [1..2016]

Performance isn't great compared to the single-threaded version, but that's a different question...

1

There are 1 best solutions below

1
On BEST ANSWER

parMap will force the evaluation of all the list, which in your case is infinite.

Instead of using parMap, you may consider using other strategies such as parBuffer which is able to deal with infinite lists.