After submitting my solution to Project Euler's problem 50 earlier today, I was scrolling through the problem's forums, taking a look at other folks' solutions/execution times.
After a while, I started to feel quite proud of my code which solved it in ~3 seconds (my code used the Primes
library and was compiled with O2
)...
...and then I came across the code below which solved it in ~0.05 seconds...in interpreted mode (i.e., ghci
).
Can someone explain how/why the code below solves this particular problem?
The mind-twisting part is the application of the tails
function to the infinite list of primes (primes
) within a list comprehension. I'm having a hard time understanding how we guarantee that we look at all possible sublists of consecutive primes and not just those generated by tails
.
(My usual strategy of trying bits and pieces of code in ghci
doesn't work in this situation because primes
is infinite...)
The problem: We're asked to find the largest prime number below 1,000,000 which is the result of summing consecutive prime numbers. For example, the largest prime number below 100 which is the sum of consecutive primes is 41 (2 + 3 + 5 + 7 + 11 + 13).
import Data.List (tails)
import Data.Numbers.Primes
under n xs = takeWhile (< n) xs
takeUntil p xs = foldr (\x r-> if p x then [x] else x:r) [] xs
res :: [((Int, Int), (Int, Int))]
-- ((top_length, sums_to), (total_length, starting_prime))
res = [(r,(length s,x)) | (x:xs) <- tails primes
, let s = zip [1..]
$ under 100
$ scanl (+) x xs
, let r = ...]
main = mapM_ print $ takeUntil ...
In general, there's no problem with taking
tails
ofprimes
, because Haskell is lazy (the evaluation is on-demand).Here specifically there's even less problem because each sublist is trimmed with
under 100 = takeWhile (< 100)
- only a finite prefix is taken.(x:xs) <- tails primes
just goes through all suffixes ofprimes
- i.e. primes starting fromx=2
; then primes starting fromx=3
, then5,7,11, ...
. The pattern only demands the head elementx
, and the tailxs
of the primes list, and even their values aren't immediately requested, only the so-called "spine" of primes list is forced, 1 notch at a time (of course making sureprimes
starts with at least one elementx
will automatically compute the actual value ofx
, but that's another matter).So
(x:xs) <- tails primes
operates on consecutive suffixes of the primes list. Trytails [1..10]
to see what's going on there.When you type
at the GHCi prompt, you're actually requesting all the elements of
primes
list to be printed as the output. Butwill only request those below 100, and not more than one element after that. It uses the built-in
takeWhile
that examines elements inprimes
, one after another, until one is found that fails the predicate (in this case, is bigger than 100). The terminating element is not included in the result.The user-defined
takeUntil
differs from that only in that it also includes the terminating element in its result (and the meaning of predicate is flipped - it signals when to stop).The
scanl (+)
is the usual way to calculate the sequence of partial sums of a sequence:The code
means:
So we get the list of entries
((top_length, sums_to), (total_length, starting_prime))
for each consecutive prime number,starting_prime = 2,3,5,7,11, ...
.The
takeUntil
expression inmain
determines when it's okay to stop, as there's no possibility of improving the result anymore.