Weird multithreading behavior - CompilerExplorer link

103 Views Asked by At

TL;DR: Why does this https://godbolt.org/z/ohK31hW34 multithreaded program segfault?

Explanation: I am encountering weird behavior of my multithreaded C++ application. The application has multiple threads that loop in a while loop guarded by an std::atomic<bool> variable. I use this construct in multiple places, so I extracted it to a simple ThreadLoop class with Start(function) & Stop() methods.

class ThreadLoop
{
public:
  ThreadLoop(const std::string& name) : mName(name) {}
  ~ThreadLoop() { Stop(); }

  template <typename F>
  void Start(F&& function)
  {
    if (mRunning)
      return;

    std::scoped_lock lock(mMutex);
    if (mThread.joinable())
      mThread.join();

    mRunning = true;
    mThread = std::thread([&]() {
      while (mRunning)
      {
        function();
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
      }
    });
  }

  void Stop()
  {
    if (not mRunning)
      return;

    mRunning = false;
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }

private:
  std::atomic<bool> mRunning = false;
  std::mutex mMutex;
  std::thread mThread;
  std::string mName;
};

I then use an object of this custom class as a member in a different 'worker' class that assigns a particular function to be periodically executed, like so

class Worker1
{
public:
  void StartWorking()
  {
    mThread.Start([this]() { Work(); });
  }

  void StopWorking() { mThread.Stop(); }

private:
  ThreadLoop mThread{"worker1 loop"};

  void Work()
  {
    fmt::print("Working...\n");
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
};

I have all of these 'workers' in another class and call StartWorking() / StopWorking() on them at random time points (also in a ThreadLoop)

class Main
{
public:
  void Start()
  {
    mThread.Start([this]() { MainLoop(); });
  }

  void Stop() { mThread.Stop(); }

private:
  ThreadLoop mThread{"main loop"};
  Worker1 mWorker1;

  void MainLoop()
  {
    if (/*something*/)
      mWorker1.StartWorking();
    else
      mWorker1.StopWorking();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
};

The first ThreadLoop ("main loop") in the Main class starts fine and starts calling StartWorking() / StopWorking(), as expected. The StartWorking() then triggers the worker to start its own ThreadLoop ("worker1 loop"), which fails undeterministically inside the ThreadLoop::Start() function, for example via

    `../nptl/pthread_mutex_lock.c:81: __pthread_mutex_lock: Assertion mutex->__data.__owner == 0 failed`

Also, according to the debugger, the entire ThreadLoop object seems to be uninitialized / destroyed (e.g. the std::string mName variable is empty, although I always provide a non-empty string) - this probably causes the std::scoped_lock to fail - locking an uninitialized/destroyed mutex. My question is, how/why is the ThreadLoop object uninitialized? I think I clearly construct it as a member inside each Worker1 object?

1

There are 1 best solutions below

0
On

The issue was that I was capturing the T&& function by reference in the lambda inside ThreadLoop::Start(), and the thread then invoked the lambda, which was a dangling reference. The fix was to capture T&& function by value.

Original version:

mThread = std::thread([&](){ ... })

Fixed version:

mThread = std::thread([this, function](){ ... })

Even better version:

mThread = std::jthread([this, function](){ ... })