I am confused by the behaviour of the following snipped:
import Data.Int
import Data.Array.ST
import Control.Monad.ST
{-# INLINE fib #-}
fib _ 0 = return 0
fib _ 1 = return 1
fib c n = do
f1 <- memo c (fib c) (n-1)
f2 <- memo c (fib c) (n-2)
return (f1+f2)
newtype C a = C a
{-# INLINE memo #-}
memo (C a) f k = do
e <- readArray a k
if e == (-1)
then do
v <- f k
writeArray a k v
return v
else return e
evalFib :: Int -> Int
evalFib n = runST $ do
a <- newArray (0,n) (-1) :: ST s (STUArray s Int Int)
fib (C a) n
main = print $ evalFib 120000
When compiled with -O2
it stack-overflows (showing 20M of memory in use). The confusing part is that it actually works as expected (no stack overflow and 9M memory in use) if any of the following changes is made:
Int64
is used instead ofInt
: (givingevalFib :: Int64 -> Int
andSTUArray s Int64 Int
). In fact anyInt*
(Int32
,Int16
, etc) will do the trick as well asWord
orWord*
;newtype C a
is removed from the picture;data C a = C !a
is used instead ofnewtype
I am trying to get my head around this behaviour: is it a bug in GHC/array module (it shows identical behaviour in 7.4.2
and 7.6.2
) or is it supposed to work that way?
PS The funny thing is that when I try to compile it with ghc array.hs -O2 -fext-core
to see the differences in core produced both GHC versions fail with "ghc: panic! (the 'impossible' happened)". No luck here either..
Looking at the core from 7.6.1, with
-O2
and-dsuppress-uniques
, the function that does the work,Main.main_$spoly_$wa
is structurally (almost) identical whether I useint
orInt64
as the index type. Since the core is long and complicated, here's thediff
output:Different index types, these are of course different.
For index type
Int
, GHC produces somewhat more informative errors for out-of-bounds indices, for that it needs the lower and upper bound of the valid indices. (The default implementation ofindex
is not overridden in theinstance Ix Int64
.)Different error,
indexError
vs.hopelessIndexError
. The following differences also only concern index errors.Now once more the different index type:
And finally,
0
and1
have gotten different top-level names.So the entire code that does the actual work is identical. Yet the one causes a stack overflow (though only just,
-K9M
suffices [-K8731K
is enough here,-K8730K
not]) and the other doesn't.The difference is indeed caused by the index errors. The code with
Int
indices allocates two boxedInt
s in every recursive call that theInt64
code doesn't allocate, becausecarries around two references to the array.
That uses more stack, and these boxed
Int
s have to be garbage collected, which causes the much larger GC figures. Additionally, the thunk for the index error is a bit larger than thehopelessIndexError
thunk.Now, if you help the compiler by
data C a = C !a
)or some other ways, it produces better code that manages without a stack overflow for the given argument, since there is only one reference to the array in the worker, and thus the allocation of the boxed
Int
s for the bounds isn't needed.Note that this algorithm causes a stack overflow for slightly larger arguments even with the help for the compiler.