Interpreting a cliff edge in GHC heap profiles

133 Views Asked by At

I need some help understanding some GHC heap profiles. I'm working on a new output format for an existing compiler for the Grammatical Framework language.

Before

Profiling a normal run of the current version of the compiler looks like this:

heap profile for PGF

  13,347,660,488 bytes allocated in the heap
 213,062,924,208 bytes copied during GC
     740,585,528 bytes maximum residency (567 sample(s))
       4,844,112 bytes maximum slop
            1438 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     12312 colls,     0 par    1.321s   1.382s     0.0001s    0.0025s
  Gen  1       567 colls,     0 par   206.442s  208.534s     0.3678s    0.8388s

  INIT    time    0.001s  (  0.005s elapsed)
  MUT     time   68.757s  ( 68.986s elapsed)
  GC      time  109.557s  (111.086s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time   98.206s  ( 98.830s elapsed)
  EXIT    time    0.000s  (  0.001s elapsed)
  Total   time  276.521s  (278.908s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    194,128,534 bytes per MUT second

  Productivity  60.4% of total user, 60.2% of total elapsed

After

With the changes I've made, it now looks like this:

heap profile for LPGF

  15,996,529,552 bytes allocated in the heap
 216,727,207,600 bytes copied during GC
     889,791,520 bytes maximum residency (578 sample(s))
       5,417,952 bytes maximum slop
            1741 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     14915 colls,     0 par    1.641s   1.721s     0.0001s    0.0014s
  Gen  1       578 colls,     0 par   226.744s  230.323s     0.3985s    1.1982s

  INIT    time    0.001s  (  0.005s elapsed)
  MUT     time   70.046s  ( 70.242s elapsed)
  GC      time  120.183s  (122.695s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time  108.201s  (109.349s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time  298.431s  (302.291s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    228,370,682 bytes per MUT second

  Productivity  59.7% of total user, 59.4% of total elapsed

So this new version uses more time and memory than the previous one, and my goal is to reduce those as much as possible.

  • The first part of both graphs looks the same, this stage of the compiler (namely reading binary files) has not been touched and thus I would expect it to look identical.
  • After that, the graphs are hugely different. I feel like the huge cliff edge in graph #2 is telling me something, but I don't quite know what.
  • The blue area dominating the second half of graph #2 (24956) is my newly added code. As I haven't aggressively optimised that code, I would expect there to be scope for improvement there. But do these different graphs give any hints about laziness/strictness or where to investigate further?
1

There are 1 best solutions below

2
On

That profile shape indicates a large graph of values became unreachable suddenly. That could be a bunch of thunks produced accidentally. Or it could be some sort of intermediate structure that was evaluated and used properly, but eventually was no longer needed and released.

But the one thing you can say for sure is that it isn't the result of a mass of unevaluated thunks being built up and then traversed during evaluation. That has a shape like your first chart - an upward trend while allocating and then a downward trend while evaluating.

A cliff says that something big became suddenly unreachable. As for what? Who knows. We'd need to see code.