Why does folding Events and Behaviors use so much memory?


I am currently exploring the possibility of using basic containers to give FRP networks more structure, and thereby make it easier to build more sophisticated event networks.

Note: I use ordrea, but I had the same problem with reactive-banana too, so I guess the problem is not specific to the chosen FRP implementation.


In this particular case I am using a simple Matrix to store Events:

newtype Matrix (w :: Nat) (h :: Nat) a where
   Matrix :: Vector a -> Matrix w h a

-- deriving instances: Functor, Foldable, Traversable, Applicative

Matrix is basically just a thin wrapper around Data.Vector, and most of the functions I'll use are basically the same as the corresponding Vector ones. The notable exception is indexing, but that should be self-explanatory.
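For reference, the clamped lookup is just a bounds clamp before the indexing step. A minimal sketch of that logic, using plain lists instead of Data.Vector to stay self-contained (the argument order and the row-major layout are assumptions, not the actual API):

```haskell
-- Sketch of clamped indexing, assuming a row-major layout. Plain lists
-- stand in for Data.Vector here; w and h are the dimensions that the
-- type-level Nats carry in the real Matrix.
clampedIndex :: Int -> Int -> [a] -> Int -> Int -> a
clampedIndex w h xs x y = xs !! (cy * w + cx)
  where
    cx = max 0 (min (w - 1) x)  -- clamp column into [0, w-1]
    cy = max 0 (min (h - 1) y)  -- clamp row into [0, h-1]
```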


With this I can define matrices of events like Matrix 10 10 (Event Double) and define basic convolution algorithms on them:

applyStencil :: (KnownNat w, KnownNat h, KnownNat w', KnownNat h')
             => M.Matrix w' h' (a -> c)
             -> M.Matrix w h (Event a)
             -> M.Matrix w h (Event c)
applyStencil s m = M.generate stencil
  where stencil x y = fold $ M.imap (sub x y) s
        sub x0 y0 x y g = g <$> M.clampedIndex m (x0 - halfW + x) (y0 - halfH + y)
        halfW = M.width s `div` 2
        halfH = M.height s `div` 2

Notes:

  • M.generate :: (Int -> Int -> a) -> M.Matrix w h a and

    M.imap :: (Int -> Int -> a -> b) -> M.Matrix w h a -> M.Matrix w h b

    are just wrappers around Vector.generate and Vector.imap respectively.

  • M.clampedIndex clamps indices into the bounds of the matrix.
  • Event is an instance of Monoid, which is why it is possible to simply fold the Matrix w' h' (Event c) returned by M.imap (sub x y) s.
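To illustrate the last point: folding a container of monoidal values just combines all cells with (<>). A sketch with the Sum monoid standing in for Event (whose mappend merges occurrence streams), applied to the stencil weights used below:

```haskell
import Data.Foldable (fold)
import Data.Monoid (Sum (..))

-- Sum stands in for Event: fold combines all nine cells with (<>),
-- just as applyStencil merges nine shifted-and-scaled event streams
-- into a single event per output cell.
stencilWeights :: [Sum Double]
stencilWeights = map Sum [1, 2, 1, 2, 4, 2, 1, 2, 1]

total :: Double
total = getSum (fold stencilWeights)  -- 16, matching the (/16) normalisation
```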

I have a setup approximately like this:

let network = do
  -- inputs triggered from external events 
  let inputs :: M.Matrix 128 128 (Event Double)

  -- stencil used:
  let stencil :: M.Matrix 3 3 (Double -> Double)
      stencil = fmap ((*) . (/16)) $ M.fromList [1,2,1,2,4,2,1,2,1]

  -- convolute matrix by applying stencil
  let convoluted = applyStencil stencil inputs

  -- collect events in order to display them later
  -- type: M.Matrix 128 128 (Behavior [Double])
  let behaviors = fmap eventToBehavior convoluted

  -- now there is a neat trick you can play because Matrix
  -- is Traversable and Behaviors are Applicative:
  -- type: Behavior (Matrix 128 128 [Double])
  return $ Data.Traversable.sequenceA behaviors
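The sequenceA step is the generic Traversable/Applicative inversion; with Maybe standing in for Behavior and a plain list for the Matrix, the same trick looks like this:

```haskell
import Data.Traversable (sequenceA)

-- Maybe stands in for Behavior, a list for Matrix: sequenceA turns a
-- container of applicative values inside-out, just as it turns
-- Matrix 128 128 (Behavior [Double]) into Behavior (Matrix 128 128 [Double]).
inverted :: Maybe [Int]
inverted = sequenceA [Just 1, Just 2, Just 3]  -- Just [1,2,3]
```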

Using something like this I am triggering ~15k events/s with no problems, with lots of headroom in that regard.

The problem is that as soon as I sample the network, I can only get about two samples per second from it:

main :: IO ()
main = do

  -- initialize the network
  sample <- start network

  forever $ do

    -- not all of the 128*128 inputs are triggered each "frame"
    triggerInputs

    -- sample the network
    mat <- sample

    -- display the matrix somehow (actually with gloss)
    displayMatrix mat

So far I have made the following observations:

  • Profiling tells me that productivity is very low (4%-8%)
  • Most of the time is spent by the garbage collector in Gen 1 (~95%)
  • Data.Matrix.foldMap (i.e. fold) allocates the most memory (~45%, as per -p)

  • When I was still working with reactive-banana, Heinrich Apfelmus recommended tree-based traversals as a better fit for behaviors¹. I tried that for sequenceA, fold and traverse, with no success.

  • I suspected that the newtype wrapper was preventing Vector's fusion rules from firing². This is most likely not the culprit.
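For completeness, the tree-based traversal mentioned above replaces the linear fold with a balanced divide-and-conquer, roughly like this (a sketch over lists; the real version would split the backing Vector):

```haskell
-- A balanced fold: split the container in half and combine the two
-- results, so the tree of (<>) applications has depth O(log n)
-- instead of the O(n) spine a linear fold builds. Lists stand in
-- for the Vector backing Matrix here.
treeFold :: Monoid m => [m] -> m
treeFold []  = mempty
treeFold [m] = m
treeFold ms  = treeFold l <> treeFold r
  where (l, r) = splitAt (length ms `div` 2) ms
```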

At this point I have spent the better part of a week searching for a solution to this problem. Intuitively I'd say that sampling should be much faster, and that foldMap should not create so much garbage. Any ideas?
