When I use gdb to debug futex locks with output, the program gets stuck in a weird loop.
#include <mutex>
#include <iostream>
#include <thread>
#include <unistd.h>

volatile int counter(0); // non-atomic counter
std::mutex mtx;

void increases10k(){
    for(int i=0;i<100000000;i++){
        sleep(1);
        std::cout << "The ID of this thread is: " << std::this_thread::get_id() << std::endl;
        std::cout << counter << std::endl; // read outside the lock
        mtx.lock();
        ++counter;
        std::cout << counter << std::endl;
        mtx.unlock();
    }
}

int main(int argc,char **argv){
    std::thread threads[10];
    for(int i=0;i<10;i++)
        threads[i]=std::thread(increases10k);
    for(auto& th:threads)
        th.join();
    std::cout << " successful increases of the counter " << counter << std::endl;
    return 0;
}
I used the gdb command catch syscall futex to catch the futex system call.
Then I used c to let the program continue, but it stopped producing output and got stuck in a weird loop of futex call and return stops.
Thread 8 "a.out" hit Catchpoint 1 (call to syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0db5ffb700 (LWP 8923)]
Thread 9 "a.out" hit Catchpoint 1 (call to syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
[Switching to Thread 0x7f0db7fff700 (LWP 8919)]
Thread 3 "a.out" hit Catchpoint 1 (returned from syscall futex), __lll_lock_wait_private ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
.........
It's endless. I also noticed that a thread was stuck in write(), but pressing c couldn't get it to continue.
Id Target Id Frame
1 Thread 0x7f0dbe8ac5c0 (LWP 8915) "a.out" 0x00007f0dbe48a6dd in pthread_join (threadid=139696990869248, thread_return=0x0)
at pthread_join.c:90
2 Thread 0x7f0dbd845700 (LWP 8916) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
3 Thread 0x7f0dbd044700 (LWP 8917) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
4 Thread 0x7f0dbc843700 (LWP 8918) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
5 Thread 0x7f0db7fff700 (LWP 8919) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
6 Thread 0x7f0db77fe700 (LWP 8920) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
7 Thread 0x7f0db6ffd700 (LWP 8921) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
8 Thread 0x7f0db67fc700 (LWP 8922) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
9 Thread 0x7f0db5ffb700 (LWP 8923) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
* 10 Thread 0x7f0db57fa700 (LWP 8924) "a.out" 0x00007f0dbd92198d in write () at ../sysdeps/unix/syscall-template.S:84
11 Thread 0x7f0db4ff9700 (LWP 8925) "a.out" __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
When I comment out the output, the program works fine using c in gdb.
I don't understand why "breakpoint at futex" and write() affect each other.
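For reference, the output-free variant mentioned above could be sketched roughly as follows. This is only a minimal sketch: the helper name run_counter, its parameters, the use of std::lock_guard, and dropping sleep(1) are assumptions made for illustration and are not part of the original program.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not in the original post): starts `num_threads` threads,
// each of which increments a shared counter `iterations` times under a mutex.
// There is no output and no sleep inside the loop.
int run_counter(int num_threads, int iterations) {
    int counter = 0;
    std::mutex mtx;
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&counter, &mtx, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> guard(mtx); // released at end of scope
                ++counter;
            }
        });
    }
    for (auto& th : threads)
        th.join();
    return counter; // all threads have been joined, so this read is safe
}
```

Calling run_counter(10, 1000) from main would give a bounded run that is easier to step through under gdb than the full loop.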
Note that debugging futex locks (or any other multithreading issues) with GDB is ~impossible -- you need to have the program correct by construction.
About the only debugging you could do is understand where the program is after it deadlocks.
What did you expect? Every call to mtx.lock() and mtx.unlock() may execute a futex call, and you are doing 2 * 10 * 100'000'000 of them (there is an additional factor of 2 because GDB stops on entry to and exit from the system call). Of course you are going to be stuck there forever.
http://xyproblem.info seems appropriate here.
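To put a number on the estimate above, here is a small sketch of the arithmetic. The function name catchpoint_stops is hypothetical, and the result is only an upper bound, since it assumes every lock() and unlock() actually enters futex (an uncontended lock may not).

```cpp
#include <cstdint>

// Hypothetical helper: upper bound on how many times GDB would stop at the
// futex catchpoint, assuming every lock()/unlock() makes a futex syscall.
constexpr std::uint64_t catchpoint_stops(std::uint64_t threads,
                                         std::uint64_t iterations) {
    // lock() and unlock() once per loop iteration
    constexpr std::uint64_t mutex_calls_per_iteration = 2;
    // GDB stops once on syscall entry and once on return
    constexpr std::uint64_t stops_per_syscall = 2;
    return stops_per_syscall * mutex_calls_per_iteration * threads * iterations;
}
```

With the values from the question, catchpoint_stops(10, 100000000) evaluates to 4,000,000,000, which is far more than anyone can step through with c.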