Can dlmopen be used as a "drop-in" replacement for dlopen?


I have a non-thread-safe shared library (C/Fortran), i.e. it keeps its state in global variables. When I open this library multiple times from the same process using dlopen, the global variables are shared and the states interfere with each other, because dlopen only increases the reference count instead of loading a second copy. (My application works without problems when I dlopen two physical copies of the library.)
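
The reference-counting behaviour can be observed directly. The following is a minimal sketch, assuming a glibc/Linux system; the helper name reopen_returns_same_handle is made up for illustration and is not part of my application.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper: opens the same library twice with dlopen and reports
// whether both calls returned the same handle, i.e. whether the second call
// only bumped the reference count instead of creating a second copy of the
// library's global variables. Returns false if the library cannot be opened.
inline bool reopen_returns_same_handle(std::string const& filename) {
    void* first = dlopen(filename.c_str(), RTLD_NOW);
    if (first == nullptr) {
        return false;
    }
    void* second = dlopen(filename.c_str(), RTLD_NOW);
    bool same = (second != nullptr) && (first == second);
    if (second != nullptr) {
        dlclose(second);  // drop the second reference
    }
    dlclose(first);       // drop the first reference
    return same;
}
```

Called with any library that is present on the system, this is expected to report true on glibc, which matches the shared-state problem described above.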

I found the issue linked above and another issue mentioning dlmopen, which suggests that I can load separate instances of the library (each with its own global variables) and thereby eliminate the problem. dlmopen's documentation is hard for me to understand, but I gave it a shot and tried to open my library with something like this:

inline handle_type open(std::string const& filename) {
    handle_type h = dlmopen(LM_ID_NEWLM, filename.c_str(), RTLD_LAZY);
    return h;
}

But alas, I got a crash for which valgrind reported an invalid read of size 32 (at 0x2833DE35: ??? (in /usr/lib64/libc-2.28.so)) from a strange (seemingly innocent) location in my code (a = &static_function;), i.e. assigning the address of a static function to some function pointer.

So my question is: should dlmopen on one copy of the library give the same result as dlopen on two copies? Am I calling dlmopen in a wrong way? Or is there a fundamental problem with assigning addresses of static functions when I use dlmopen?

Answer by Employed Russian:

> So my question is: should dlmopen on one copy of the library give the same result as dlopen on two copies?

Not necessarily (see note 1 below).

> Am I calling dlmopen in a wrong way?

I would suggest using RTLD_NOW instead of RTLD_LAZY.

If you do that, and the dlmopen call does not fail (does not return NULL), then you can conclude that this library has been properly linked with all of its dependencies.

If dlmopen(..., RTLD_NOW) fails, you can conclude that using dlmopen on this library will not work.
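
A minimal sketch of that check, adapted from the snippet in the question. It assumes that handle_type is a plain void* handle (the question does not show its definition), and the function name open_in_new_namespace is made up for illustration.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // dlmopen and LM_ID_NEWLM are GNU extensions
#endif
#include <dlfcn.h>
#include <cstdio>
#include <string>

using handle_type = void*;  // assumption: the question's handle_type is a raw dl handle

// Opens the library in a fresh linker namespace with RTLD_NOW, so that every
// symbol must resolve at load time. On failure, prints the loader's error
// message and returns nullptr.
inline handle_type open_in_new_namespace(std::string const& filename) {
    dlerror();  // clear any stale error state
    handle_type h = dlmopen(LM_ID_NEWLM, filename.c_str(), RTLD_NOW);
    if (h == nullptr) {
        const char* msg = dlerror();
        std::fprintf(stderr, "dlmopen(%s) failed: %s\n",
                     filename.c_str(), msg ? msg : "unknown error");
    }
    return h;
}
```

The message from dlerror typically names the missing file or the unresolved symbol, which is usually the quickest way to see which dependency the library was not linked against.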


If dlmopen should work, your next task is to figure out the cause of the crash. Unfortunately, tool support for debugging this is pretty much non-existent. For example, Valgrind is likely to be completely confused.

You might be able to make sense of the crash using GDB, but even that may require manually telling GDB where the newly-added libraries reside (see this bug). Note: the bug has been fixed, so a recent version of GDB may just work.
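
If you do end up pointing GDB at the library by hand, it helps to know which namespace the library landed in and where it was mapped. Here is a minimal sketch using glibc's dlinfo; describe_handle is a hypothetical helper name, and the output format is arbitrary.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // Lmid_t and RTLD_DI_LMID are GNU extensions
#endif
#include <dlfcn.h>
#include <link.h>
#include <cstdio>
#include <string>

// Hypothetical helper: prints the namespace ID, recorded path and load base
// of the object behind `handle`, and returns the recorded path.
// Returns an empty string if the loader cannot provide the information.
inline std::string describe_handle(void* handle) {
    link_map* map = nullptr;
    Lmid_t ns = 0;
    if (dlinfo(handle, RTLD_DI_LINKMAP, &map) != 0 || map == nullptr) {
        return {};
    }
    if (dlinfo(handle, RTLD_DI_LMID, &ns) != 0) {
        return {};
    }
    const char* name = map->l_name ? map->l_name : "";
    std::fprintf(stderr, "namespace %ld: %s, load base 0x%lx\n",
                 static_cast<long>(ns), name,
                 static_cast<unsigned long>(map->l_addr));
    return name;
}
```

For a handle returned by dlmopen(LM_ID_NEWLM, ...), the namespace ID should differ from the one reported for a handle obtained with plain dlopen.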

> Or is there a fundamental problem with assigning addresses of static functions when I use dlmopen?

No.


Note 1: There are many differences. As one example, consider what happens when you call a function in this library (say foo()) that returns memory allocated with malloc, and you are responsible for freeing it. When using a single linker namespace, there is only one malloc in the picture, and everything just works.

When you dlmopen the library twice, there are three (!) instances of malloc: one in the default namespace, and one in each of the two new namespaces. If you now pass a pointer allocated by one instance of malloc to the free located in another namespace, a crash or heap corruption is likely.
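
One way to keep allocation and deallocation in the same namespace is to look up malloc and free through the handle of the library that owns the memory. This is only a sketch under assumptions: the names heap_functions and heap_for_handle are made up, and it assumes the library allocates with its namespace's libc malloc, so that dlsym on the handle resolves to that copy through the library's dependencies.

```cpp
#include <dlfcn.h>
#include <cstddef>

using malloc_fn = void* (*)(std::size_t);
using free_fn = void (*)(void*);

// Hypothetical pair of allocator entry points as seen from one handle.
struct heap_functions {
    malloc_fn alloc = nullptr;
    free_fn release = nullptr;
};

// Resolves malloc and free via `handle`, so that both come from the same
// namespace as the library behind that handle. Either member is nullptr
// if the corresponding lookup fails.
inline heap_functions heap_for_handle(void* handle) {
    heap_functions heap;
    heap.alloc = reinterpret_cast<malloc_fn>(dlsym(handle, "malloc"));
    heap.release = reinterpret_cast<free_fn>(dlsym(handle, "free"));
    return heap;
}
```

With this, memory returned by foo() from a library opened as h would be released with heap_for_handle(h).release(p) rather than with the free of the default namespace.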


I was also just reminded of this answer.