I am wondering how to improve the time performance of the white noise addition to the logistic map? The noise is only allowed to be added after calculating the values (as it is an iterative map.)
module Generating
import System.Random (Random,randomR,random,mkStdGen,StdGen)
import Data.Random.Normal (mkNormals,normal)
import qualified Data.Vector.Unboxed as U
import Control.Monad.State
genR :: (Random a, Fractional a) => Int -> (a, StdGen)
genR x = randomR (0,1.0) (mkStdGen x)
new ::Double-> Double ->Int -> (Double,Int) -> U.Vector (Double,Int)
new skal r n = U.unfoldrN n go
where
go (x0,g0) = let !eins= (1.0-x0)
!x=x0 `seq` eins `seq` r*x0*eins
!g=g0+1
!noise= skal*(fst $ genR g)
in Just ((x0+noise,g0),(x,g))
fs :: (t, t1) -> t
fs (x,y)=x
first :: U.Vector (Double,Int)->U.Vector Double
first =U.map (\(x,y)->x)
As you can see, I actually only want the first value of the tuple, but the generator needs to be updated.
Any suggestions? Maybe State Monads?
tl;dr: Don't try to optimize a Haskell program without using profiling and benchmarking. Adding random exclamation marks and
seq
s is almost never going to work. The big problem here, in fact, is thatStdGen
is an incredibly slow random number generator and it's completely dominating the execution time of your program. You need to replace it to make any significant progress.Here's the longer answer: A good first step is to install a benchmarking library, like
criterion
, and write a test case:In my case, the results look like:
so we have about 8 milliseconds per run to generate a 10000-element vector.
Now, let's get rid of all the bangs,
seq
s and intermediate calculations you added to try to speed things up:Rerunning, here are the results:
Ah, so they had no effect at all. Glad we got rid of them.
Now, it's worth noting that
unfoldrN
is carrying along the accumulator for you which holds on to yourg
. You don't also need to includeg
in the result if you're going to throw it away anyway, so we can simplifynew
to:and drop the
first
call from the definition ofvect1
:This gives:
so it didn't really make a difference. Undoubtedly, the compiler was able to optimize away the useless extra
Double
s anyway, so changing the code had no effect.A more serious problem with the algorithm is that it uses generators in a really weird way. A
StdGen
is meant to be seeded and then reused to generate multiple random numbers, not to be generated fresh from a seed based on a counter. We really ought to rewritenew
to use the generator properly:though again, this makes almost no difference to our benchmarking times. I'll admit that this one surprised me. I thought it would have a significant effect. Good thing I was benchmarking these changes so I had actual objective evidence of the effect (or lack of effect) of this change!
Now, it seems like it's probably time to profile our program and see what it's spending its time doing.
If you look at the
Generating.prof
file that's output, you'll see that the majority of time is spent inSystem.Random
, like so:It turns out that Haskell's standard random number generator is appallingly slow, and we'll need to replace it with something faster to make any more progress.
The
mersenne-random-pure64
package provides a fast Mersenne Twister implementation that produces high quality random numbers, and we can rewritenew
to use it. Note thatrandomDouble
returns a uniform random number in the interval[0,1)
:Re-benchmarking (recompiled without profiling) gives:
Note that that's 107 microseconds, so it's about 75 times faster.
That's where I'd stop, but if you decide to continue optimizing, make sure you refer frequently to the profiling and benchmarking results to make sure your changes are having an effect.
I highly recommend Googling for "profiling haskell programs" and for the "criterion" library and taking some time to learn how to use these tools.
For reference, the final program is: