Reasoning about a program containing a data race


This question follows on from my previous question, where my understanding became that 'A data race is a property of an execution, not of the program in the abstract', which means that as long as 2 threads don't access the shared variable (with at least one being a write access), in reality and not just theoretically, then the behaviour of the program will be well defined.

In light of the above understanding, I want to discuss the following program:

#include <cstdint>   // uintptr_t
#include <iostream>
#include <thread>
#include <unistd.h>  // sleep

constexpr int sleepTime = 10;

void func(int* ptr) {
    sleep(sleepTime);
    std::cout<<"going to delete ptr: "<<(uintptr_t)ptr<<"\n";
    delete ptr;
    std::cout<<"ptr has been deleted\n";
}

int main() {

    int* l_ptr = new int(5);

    std::thread t(func, l_ptr);

    t.detach();


    std::cout<<"We have passed ptr: "<<(uintptr_t)l_ptr<<" to thread for deletion. Val at ptr: "<<*l_ptr<<"\n";

    std::cin.get();

}

The above program contains a data race iff the 'main thread' and the 'child thread' happen to access the shared variable at the same time.

However, isn't it reasonable to say at least that this is highly unlikely to happen in reality when working with multi-core CPUs?

Caleth On BEST ANSWER

The above program contains a data race, full stop.

There are accesses to the same object on two threads that are not ordered with respect to one another: neither happens before the other. It isn't a question of "at the same time".

The standard defines a relation Happens-Before, which only barely relates to the wall-clock time when things happen.

Undefined behaviour doesn't mean "bad things are observed to happen". It means you can't reason about what you observe to happen from the rules of C++.

Nicol Bolas On

"my understanding became that 'A data race is a property of an execution, not of the program in the abstract', which means that as long as 2 threads don't access the shared variable (with at least one being a write access), in reality and not just theoretically, then the behaviour of the program will be well defined."

Your understanding is correct. Your application of that understanding is not:

"The above program contains a data race iff the 'main thread' and the 'child thread' happen to access the shared variable at the same time."

Data races are not about accessing "the shared variable at the same time". They are about accessing memory without proper synchronization, such that there is no "happens-before" relationship between the two accesses.

There is no such relationship between the deletion of the pointer in the other thread and its use in the main thread. Therefore, this code has a data race. That data race is a property of its execution, and the execution has a data race because that execution does not contain anything that would prevent the data race from happening. It has two accesses, one of which is a write, which have no happens-before ordering between them.

Therefore, there is a data race.

A conditional data race would look something like this:

#include <atomic>
#include <cstdint>   // uintptr_t
#include <iostream>
#include <thread>
#include <unistd.h>  // sleep

std::atomic<bool> flag = false;

constexpr int sleepTime = 10;

void func(int* ptr) {
    sleep(sleepTime);
    std::cout<<"going to delete ptr: "<<(uintptr_t)ptr<<"\n";
    delete ptr;
    flag = true;
    flag.notify_all();
    std::cout<<"ptr has been deleted\n";
}

int main() {

    int* l_ptr = new int(5);

    std::thread t(func, l_ptr);

    t.detach();

    int i = 0;
    std::cin >> i;
    if(i == 5)
    {
        flag.wait(false);
    }

    std::cout<<"We have passed ptr: "<<(uintptr_t)l_ptr<<" to thread for deletion. Val at ptr: "<<*l_ptr<<"\n";

    std::cin.get();

}

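This will perform proper synchronization, creating a happens-before relationship, but it will only perform this if the user enters the value "5". If they enter anything else, then there is a data race. (Even with the wait, reading *l_ptr after the delete is still undefined behaviour, as a use-after-free; it is just no longer a data race.)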