Project euler 10 - [haskell] Why so inefficient?

768 Views Asked by At

Alright, so i've picked up project euler where i left off when using java, and i'm at problem 10. I use Haskell now and i figured it'd be good to learn some haskell since i'm still very much a beginner.

http://projecteuler.net/problem=10

My friend who still codes in java came up with a very straight forward way to implement the sieve of eratosthenes:

http://puu.sh/5zQoU.png

I tried implementing a better looking (and what i thought was gonna be a slightly more efficient) Haskell function to find all primes up to 2,000,000. I came to this very elegant, yet apparently enormously inefficient function:

primeSieveV2 :: [Integer] -> [Integer]
primeSieveV2 [] = []
primeSieveV2 (x:xs) = x:primeSieveV2( (filter (\n -> ( mod n x ) /= 0) xs) )

Now i'm not sure why my function is so much slower than his (he claim his works in 5ms), if anything mine should be faster, since i only check composites once (they are removed from the list when they are found) whereas his checks them as many times as they can be formed.

Any help?

3

There are 3 best solutions below

0
On

The time complexity of your code is n2 (in n primes produced). It is impractical to run for producing more than first 10...20 thousand primes.

The main problem with that code is not that it uses rem but that it starts its filters prematurely, so creates too many of them. Here's how you fix it, with a small tweak:

{-# LANGUAGE PatternGuards #-}
primes = 2 : sieve primes [3..] 

sieve (p:ps) xs | (h,t) <- span (< p*p) xs = h ++ sieve ps [x | x <- t, rem x p /= 0]
                                               -- sieve ps (filter (\x->rem x p/=0) t)
main = print $ sum $ takeWhile (< 100000) primes

This improves the time complexity by about n1/2 (in n primes produced) and gives it a drastic speedup: it gets to 100,000 75x faster. Your 28 seconds should become ~ 0.4 sec. But, you probably tested it in GHCi as interpreted code, not compiled. Marking it1) as :: [Int] and compiling with -O2 flag gives it another ~ 40x speedup, so it'll be ~ 0.01 sec. To reach 2,000,000 with this code takes ~ 90x longer, for a whopping ~ 1 sec of projected run time.

1) be sure to use sum $ map (fromIntegral :: Int -> Integer) $ takeWhile ... in main.

see also: http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth

7
On

To actually implement the sieve efficiently in Haskell you probably need to do it the Java way (i.e., allocate a mutable array an modify it).

For just generating primes I like this:

primes = 2 : filter (isPrime primes) [3,5 ..]
  where isPrime (p:ps) x = p*p > x || x `rem` p /= 0 && isPrime ps x

And then you can print the sum of all primes primes < 2,000,000

main = print $ sum $ takeWhile (< 2000000) primes

You can speed it up by adding a type signature primes :: [Int]. But it works well with Integer as well and that also gives you the correct sum (which 32 bit Int will not).

See The Genuine Sieve of Eratosthenes for more information.

5
On

You don't actually have a sieve here. In Haskell you could write a sieve as

import Data.Vector.Unboxed hiding (forM_)
import Data.Vector.Unboxed.Mutable
import Control.Monad.ST (runST)
import Control.Monad (forM_, when)
import Prelude hiding (read)

sieve :: Int -> Vector Bool
sieve n = runST $ do
  vec <- new (n + 1) -- Create the mutable vector
  set vec True       -- Set all the elements to True
  forM_ [2..n] $ \ i -> do -- Loop for i from 2 to n
    val <- read vec i -- read the value at i
    when val $ -- if the value is true, set all it's multiples to false
      forM_ [2*i, 3*i .. n] $ \j -> write vec j False
  freeze vec -- return the immutable vector

main = print . ifoldl' summer 0 $ sieve 2000000
  where summer s i b = if b then i + s else s

This "cheats" by using a mutable unboxed vector, but it's pretty darn fast

$ ghc -O2 primes.hs
$ time ./primes
  142913828923
  real: 0.238 s

This is about 5x faster than my benchmarking of augustss's solution.