Equivalent of a heap profiler but for the stack?

87 Views Asked by Vitali At 10 October 2023 at 01:39

I'm hitting a stack overflow: the default Rust stack size of 2 MiB is insufficient and a very basic piece of code crashes. With RUST_MIN_STACK=4159616 (about 3.97 MiB) it works as expected; with RUST_MIN_STACK=4159615 it fails with a stack overflow.

The call chain is not very deep, which makes me think something is allocating a lot of stack space before the crashing code runs, but I'm not sure what. (This is in my unit test startup code, which sets up the Glommio async runtime.) I printed the size of the variable that the code copies onto the stack and it is 350 KiB. That is surprisingly large (it's a Rust closure), but it shouldn't be an issue unless there is another major stack allocation.

Is there any tool or technique to help narrow down what is causing such large stack usage? I've tried Valgrind/Massif and ASan, but they haven't given me any extra information, as they seem to be implemented only for heap allocations.

To add to the mystery, the LTO release build also crashes, but it needs less additional space (RUST_MIN_STACK=2774720). This implies that whatever I'm doing causes the debug build to use about 1.32 MiB (1384896 bytes) more stack than the release build.

Update:

I'm still not sure where the stack allocations are coming from, but it looks like the problematic pattern is:

fn run_async<T>(future: impl std::future::Future<Output = T>) -> T {
    let ex = glommio::LocalExecutorBuilder::new(...).make().unwrap();
    ex.run(future)
}

#[test]
fn mytest() {
    // There's nothing before run_async.
    run_async(async move {
        // test code
    });
}

Even though the size of the closure that I printed in run_async was only 350 KiB, there seems to be hidden stack usage, possibly generated by the async machinery, because while:

#[test]
fn mytest() {
    // There's nothing before run_async.
    run_async(async move {
        eprintln!(
            "closure enter with stack remaining {:?}",
            stacker::remaining_stack()
                .map(|s| MemorySize::new(s))
                .unwrap_or_else(|| MemorySize::new(1))
        );
        let _file = File::open_or_create("/dev/null").await.unwrap();
    });
}

uses very little stack (1916112 bytes remaining of the 2 MiB limit, so about 177 KiB used),

#[test]
fn mytest() {
    // There's nothing before run_async.
    run_async(async move {
        eprintln!(
            "closure enter with stack remaining {:?}",
            stacker::remaining_stack()
                .map(|s| MemorySize::new(s))
                .unwrap_or_else(|| MemorySize::new(1))
        );
        let file = File::open_or_create("/dev/null").await.unwrap();
        file.write(...).await.unwrap();
    });
}

uses about 880 KiB of stack according to stacker (1196432 bytes remaining, so 900720 bytes used).
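The measurements above come from stacker::remaining_stack(). Where pulling in that crate is not an option, a rough std-only approximation is to compare the address of a local variable at two points in the call chain. The sketch below is only an illustration under assumptions: the helper names (stack_marker, with_big_local, stack_used_between) are hypothetical, the 64 KiB buffer stands in for whatever is eating the stack, and the numbers it prints depend on the compiler, target, and optimization level.

```rust
use std::hint::black_box;

// Returns the address of a local in this frame, as a rough marker for the
// current stack position. #[inline(never)] keeps the frame from being merged
// into the caller.
#[inline(never)]
fn stack_marker() -> usize {
    let local = 0u8;
    black_box(&local) as *const u8 as usize
}

// Hypothetical function with a large local, standing in for the code under
// suspicion. The buffer is used after the nested call so it stays alive.
#[inline(never)]
fn with_big_local(seed: u8) -> usize {
    let buf = [seed; 64 * 1024];
    let here = stack_marker();
    black_box(&buf);
    here
}

// Distance between two markers. abs_diff avoids assuming the direction in
// which the stack grows.
fn stack_used_between(base: usize, here: usize) -> usize {
    base.abs_diff(here)
}

fn main() {
    let base = stack_marker();
    let deep = with_big_local(1);
    println!(
        "approximate stack used by with_big_local: {} bytes",
        stack_used_between(base, deep)
    );
}
```

Taking a marker at the start of the test and another inside the async block gives a figure comparable to the difference between the two remaining_stack() readings, without telling you which frame is responsible; bisecting with markers at intermediate call sites narrows that down.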

As I add more async operations within the async method under test, the stack size implied by the future grows for some reason until it's too large to fit on the stack.
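The growth described above can be reproduced without Glommio. The following is a minimal sketch, assuming the cause is the usual one for async Rust: every local that is still alive across an .await is stored inside the future, and a future that awaits another future embeds the callee's state. All names here (io_step, one_buffer, two_buffers, outer, future_size) are hypothetical, and the 64 KiB buffers are placeholders for whatever the real test code holds across its awaits.

```rust
use std::future::Future;
use std::mem::size_of_val;

// Hypothetical stand-in for an async I/O call; it only provides an await point.
async fn io_step() {}

// `a` is read after the await, so it has to be stored inside the future.
async fn one_buffer(seed: u8) -> u8 {
    let a = [seed; 64 * 1024];
    io_step().await;
    a[0]
}

// Two buffers alive across the same await: both are stored in the future.
async fn two_buffers(seed: u8) -> u8 {
    let a = [seed; 64 * 1024];
    let b = [seed.wrapping_add(1); 64 * 1024];
    io_step().await;
    a[0].wrapping_add(b[0])
}

// A caller that awaits other futures embeds their state machines in its own.
async fn outer(seed: u8) -> u8 {
    let x = one_buffer(seed).await;
    let y = two_buffers(seed).await;
    x.wrapping_add(y)
}

// Size in bytes of the value that gets moved around when the future is
// passed by value, for example into an executor's run method.
fn future_size<F: Future>(fut: &F) -> usize {
    size_of_val(fut)
}

fn main() {
    println!("one_buffer:  {} bytes", future_size(&one_buffer(1)));
    println!("two_buffers: {} bytes", future_size(&two_buffers(1)));
    println!("outer:       {} bytes", future_size(&outer(1)));
}
```

Printing size_of_val for the future at each layer of the real test (the async block, then each async fn it awaits) is one way to see which addition makes the size jump. In a debug build each by-value move of such a future may also leave a separate copy on the stack, which would be consistent with the debug build needing more stack than the release build.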

I've fixed it by heap-allocating the future via run_async(Box::pin(async move { ... code ... })), but it's still not clear to me how to discover ahead of time when I have huge futures, or when they are causing a lot of spillage onto the stack.
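To illustrate why the Box::pin workaround helps, here is a small std-only sketch. It assumes nothing about Glommio: big_test_body and io_step are hypothetical stand-ins for the test's async block, and the 64 KiB buffer is arbitrary. It compares the size of the future when held by value with the size of the Pin<Box<...>> handle that is passed around after boxing.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical stand-in for an async I/O call.
async fn io_step() {}

// Hypothetical test body that keeps a large buffer alive across an await.
async fn big_test_body(seed: u8) -> u8 {
    let buf = [seed; 64 * 1024];
    io_step().await;
    buf[0]
}

fn main() {
    // Held by value: the whole state machine travels with every move.
    let inline = big_test_body(7);
    let inline_size = size_of_val(&inline);

    // Boxed: the state machine lives on the heap and only a pointer moves.
    let boxed = Box::pin(inline);
    let boxed_size = size_of_val(&boxed);

    println!("future by value:    {} bytes", inline_size);
    println!("Pin<Box<_>> handle: {} bytes", boxed_size);
    assert_eq!(boxed_size, size_of::<usize>());
}
```

Because Pin<Box<F>> itself implements Future, the boxed value still satisfies a signature like run_async's impl Future parameter. One caveat: in an unoptimized build the future may still be constructed on the stack once before it is moved into the box, so boxing bounds the number of large copies rather than guaranteeing there are none.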
