How do I find the line of C++ which locks a Linux futex?


I've got a performance problem with a large application written in C++. The program uses only 150% CPU, while the server is a 24-core hyperthreaded EPYC and other, similar applications can reliably hit the expected 4800% CPU load. iotop shows virtually no I/O, which is expected.

As the program is apparently neither I/O-bound nor CPU-bound, I checked strace and found that the vast majority of traced calls are waits on a single futex. That is to say: 48 of the 50 threads in the program appear to be waiting on the same futex, which explains quite well why the CPU load only barely exceeds 100%.

Example:

[pid 11581] futex(0x55acec47a900, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 11580] futex(0x55acec47a900, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 11579] futex(0x55acec47a900, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 11578] futex(0x55acec47a900, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 11577] futex(0x55acec47a900, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 11576] futex(0x55acec47a900, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>

Now the problem for me is: how do I find the offending code? The program is not deadlocked, just slow, so the usual techniques for finding deadlocks do not work.


There are 2 best solutions below

On BEST ANSWER

The best way I found myself was to run the program in GDB. Since most threads are blocked, info threads will show most of the threads in the same state. For me, that happened to be blocked in __lll_lock_wait. Switching to any of these threads (thread <id>, then bt) gave me a stack trace showing how I ended up in __lll_lock_wait. Three levels up the stack I found my offending code.


How do I find the line of C++ which locks a Linux futex?

If you are willing to change your C++ source code, you might compile it with g++ -O -g (so with DWARF debugging information) using a recent GCC compiler and use Ian Taylor's libbacktrace. That library gives nice backtrace information at runtime using the DWARF debug info in your ELF executable and shared libraries.
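To illustrate what capturing a backtrace at runtime looks like, here is a minimal sketch. Note that it does not use libbacktrace: it swaps in glibc's backtrace() and backtrace_symbols() from <execinfo.h>, which need no extra library but give less detail (no file and line numbers). The helper name capture_stack is hypothetical.

```cpp
#include <execinfo.h>   // glibc-specific: backtrace(), backtrace_symbols()
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not part of libbacktrace): returns the current call
// stack as one string per frame, innermost frame first.
std::vector<std::string> capture_stack(int max_frames = 32) {
    std::vector<void*> frames(max_frames);
    // Fill the buffer with return addresses; n is the number actually stored.
    int n = backtrace(frames.data(), max_frames);
    // Translate addresses into "module(function+offset) [address]" strings.
    char** symbols = backtrace_symbols(frames.data(), n);
    std::vector<std::string> out;
    if (symbols == nullptr) return out;   // allocation failed
    for (int i = 0; i < n; ++i) out.emplace_back(symbols[i]);
    std::free(symbols);                   // only the array itself is freed
    return out;
}
```

Function names only show up for exported symbols, so linking with -rdynamic helps; otherwise the raw addresses can be resolved afterwards with addr2line. libbacktrace does that resolution in-process from the DWARF info.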

Then you could either subclass or wrap the locking C++ classes (e.g. std::mutex or std::lock_guard) or add extra C++ code (perhaps with preprocessor X-macros) to call that backtrace library.

Also consider a profiling approach with GNU gprof.

Another possible approach (worthwhile only for a large code base of more than a hundred thousand lines of C++) might be to use dynamic linker tricks (e.g. LD_PRELOAD, see ld.so(8)) to redefine parts of your C++ standard library, or to write your own GCC plugin to modify the emitted code related to futex(7) or to the locking C++ classes.

For a small code base, consider also writing your own specialized metaprogram (in the spirit of Qt moc) to transform your C++ code (e.g. to automatically add calls to libbacktrace functions), then update your build automation (e.g. your Makefile) to use it.

For a concrete example, look into the source code of GCC.

Be aware, however, that deadlocks and other synchronization-related bugs are typical heisenbugs.

So budget several weeks of debugging efforts.