memcpy [or not?] and multithreading [std::thread from c++11]


I am writing software in C/C++ that makes heavy use of BIAS/Profil, an interval algebra library. In my algorithm, a master divides a domain and feeds parts of it to slave process(es). Those return an int status for each domain part. There is common data that is only read, and that's it.

I need to parallelize my code. However, as soon as two (or, I guess, more) slave threads are running and both call functions of this library, it segfaults. What is peculiar about these segfaults is that gdb rarely indicates the same error line from one build to the next: it depends on the speed of the threads, whether one started earlier, etc. I've tried having the threads yield until they get a go-ahead from the master, which 'stabilizes' the error. I'm fairly sure that it comes from the library's calls to memcpy (following the gdb backtrace, I always end up in a BIAS/Profil function calling memcpy. To be fair, almost all functions call memcpy on a temporary object before returning the result...). From what I read on the web, it would appear that memcpy() might not be thread-safe, depending on the implementation (especially here). (That seems weird for a function that is supposed to only read the shared data... or maybe, when writing the per-thread data, both threads go for the same memory space?)

To try to address this, I'd like to 'replace' the call to memcpy with a mutex-framed call, at least as a test to see whether the behavior changes (something like mtx.lock(); memcpy(...); mtx.unlock();).
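A minimal sketch of such a mutex-framed call, assuming C++11. The names locked_memcpy, g_memcpy_mutex and run_locked_copies are hypothetical and not part of BIAS/Profil or the standard library. Note that a wrapper like this only affects calls made from code compiled against it; memcpy calls inside an already built library still go to the original function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical global lock shared by every caller of locked_memcpy.
static std::mutex g_memcpy_mutex;

// Same contract as std::memcpy, but only one thread copies at a time.
void* locked_memcpy(void* dst, const void* src, std::size_t len) {
    assert(dst != nullptr && src != nullptr);
    std::lock_guard<std::mutex> guard(g_memcpy_mutex);  // unlocked on return
    return std::memcpy(dst, src, len);
}

// Hypothetical driver: each thread copies the same read-only source into
// its own private buffer. Returns true if every private copy matches.
bool run_locked_copies(unsigned n_threads) {
    const std::vector<int> shared(1024, 42);  // common data, only read
    std::vector<std::vector<int>> local(
        n_threads, std::vector<int>(shared.size(), 0));
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&shared, &local, t] {
            locked_memcpy(local[t].data(), shared.data(),
                          shared.size() * sizeof(int));
        });
    }
    for (auto& w : workers) w.join();
    for (const auto& buf : local) {
        if (buf != shared) return false;
    }
    return true;
}
```

Using std::lock_guard instead of explicit lock()/unlock() calls means the mutex is released even if the function exits early.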

1st question: I'm not a dev/code engineer at all, and I lack a lot of basic knowledge. I think that, since I use a pre-built BIAS/Profil library, the memcpy that gets called is the one from the system the library was built on, correct? If so, would it change anything if I tried building the library from source on my system? (I'm not sure I can build this library, hence the question.)

2nd question: in my string.h, memcpy is declared by:

    #ifndef __HAVE_ARCH_MEMCPY
    extern void * memcpy(void *,const void *,__kernel_size_t);
    #endif

and in some other string headers (string_64.h, string_32.h) there is a definition of the form:

    #define memcpy(dst, src, len) __inline_memcpy((dst), (src), (len))

or some more explicit definition, or just a declaration like the one quoted. It's starting to get ugly but, ideally, I'd like to create a pre-processor variable #define __HAVE_ARCH_MEMCPY 1, and a void * memcpy(void *,const void *,__kernel_size_t) which would do the mutex-framed memcpy using the dismissed memcpy. The idea here is to avoid messing with the library and make it work with 3 lines of code ;)

Any better idea? (it would make my day...)


There are 3 best solutions below

4
On BEST ANSWER

Given your observations, and given that the Profil lib is from the last millennium and that the documentation (homepage and Profil2.ps) does not even contain the word "thread", I would assume that the lib is not thread-safe.

1st: No, usually memcpy is part of libc, which is dynamically linked (at least nowadays). On Linux, check with ldd NAMEOFBINARY, which should give a line with something like libc.so.6 => /lib/i386-linux-gnu/libc.so.6 or similar. If not: rebuild. If yes: rebuilding could help anyway, as there are many other factors.

Besides this, I think memcpy is thread-safe as long as you never write back data (even writing back unmodified data will hurt: https://blogs.oracle.com/dave/entry/memcpy_concurrency_curiosities).

2nd: If it turns out that you have to use a modified memcpy, also think about LD_PRELOAD.

3
On

In general, you must use a critical section, mutex, or some other protection technique to keep multiple threads from calling non-thread-safe (non-reentrant) functions simultaneously. Some ANSI C implementations of memcpy() are thread-safe, some are not. (safe, not safe)
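As a rough illustration of that protection technique, here is a minimal C++11 sketch. legacy_sum is a hypothetical stand-in for a library routine that is not thread-safe because it stages its result in a static variable; safe_sum, g_library_mutex and run_workers are likewise made-up names. Nothing here is taken from BIAS/Profil.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical non-thread-safe routine: all callers share g_scratch.
static long g_scratch = 0;
long legacy_sum(const std::vector<int>& data) {
    g_scratch = 0;
    for (int v : data) g_scratch += v;  // racy if two threads run this at once
    return g_scratch;
}

// One lock for the whole library: calls are serialized, not just memcpy.
static std::mutex g_library_mutex;
long safe_sum(const std::vector<int>& data) {
    std::lock_guard<std::mutex> guard(g_library_mutex);
    return legacy_sum(data);
}

// Hypothetical driver: every thread sums the same read-only data through
// the locked wrapper and stores the result in its own slot.
bool run_workers(unsigned n_threads) {
    assert(n_threads > 0);
    const std::vector<int> shared(10000, 1);
    std::vector<long> results(n_threads, -1);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&shared, &results, t] {
            results[t] = safe_sum(shared);
        });
    }
    for (auto& w : workers) w.join();
    for (long r : results) {
        if (r != 10000) return false;
    }
    return true;
}
```

The lock is placed around the whole library call rather than around an individual memcpy, because the shared state (here g_scratch) is touched outside any copy as well. The cost is that the protected calls no longer run in parallel.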

Writing functions that are thread-safe, and writing threaded programs that can safely accommodate non-thread-safe functions, is a substantial topic. It is very doable, but it requires reading up on the subject, and much has been written about it. This will at least help you start asking the right questions.

1
On

IMHO you shouldn't concentrate on the memcpy() calls, but on the higher-level functionality.

And memcpy() is thread-safe if the memory ranges handled by the parallel threads don't overlap. In practice, memcpy() contains only a for(;;) loop (with a lot of optimizations), at least in glibc, which is the reason it is declared the way it is.

If you want to know what your parallel memcpy()-ing threads will do, imagine the for(;;) loops copying memory through long int pointers.
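The non-overlapping case can be sketched as follows, assuming C++11. parallel_copy is a hypothetical helper, not a library function: every thread reads the shared source and writes only to its own slice of the destination, so no two memcpy() calls touch the same written bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

// Copies src into dst with up to n_threads threads. Each thread gets a
// contiguous slice [begin, begin + len) that no other thread writes to.
void parallel_copy(std::vector<char>& dst, const std::vector<char>& src,
                   unsigned n_threads) {
    assert(dst.size() == src.size() && n_threads > 0);
    // Slice size, rounded up so that the slices cover the whole buffer.
    const std::size_t chunk = (src.size() + n_threads - 1) / n_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = t * chunk;
        if (begin >= src.size()) break;  // nothing left for this thread
        const std::size_t len = std::min(chunk, src.size() - begin);
        workers.emplace_back([&dst, &src, begin, len] {
            std::memcpy(dst.data() + begin, src.data() + begin, len);
        });
    }
    for (auto& w : workers) w.join();
}
```

No mutex is needed here because the destination ranges are disjoint and the source is only read. If two threads were handed the same destination slice, the result would be a data race.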