The following program crashes:
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main()
{
    fs::path p1 = "/usr/lib/sendmail.cf";
    std::cout << "p1 = " << p1 << '\n';
}
Compilation:
$ g++ -std=c++17 pathExistsTest.cpp
$ ./a.out
p1 = "/usr/lib/sendmail.cf"
[1] 35688 segmentation fault (core dumped) ./a.out
Tested on Ubuntu 20.04, compiler is GCC 8.4.0.
Here is the trimmed Valgrind output:
==30078== by 0x4AE5034: QAbstractButton::mouseReleaseEvent(QMouseEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==30078== by 0x4A312B5: QWidget::event(QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==30078== Address 0x2b is not stack'd, malloc'd or (recently) free'd
==30078==
==30078==
==30078== Process terminating with default action of signal 11 (SIGSEGV)
==30078== Access not within mapped region at address 0x2B
==30078== at 0x13AD9B: std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::~vector() (in /home/(me)/src/tomato/build-src-Desktop-Release/TomatoLauncher)
I don't even know why the vector dtor is called. I only create a path variable, not a vector<path>.
TL;DR
You're compiling with GCC 8.4.0, therefore you need to link explicitly against -lstdc++fs.
Since you're using GCC 8.4.0, you're using the GNU C++ Standard Library (libstdc++) headers of GCC 8.4.0. But your system (Ubuntu 20.04) only contains libstdc++.so.6.0.28 from GCC 9. If you don't explicitly link against -lstdc++fs, then you accidentally consume a std::filesystem symbol from GCC 9 (via libstdc++.so) instead of from GCC 8 (via libstdc++fs.a).
GCC 8 and GCC 9 have incompatible std::filesystem types. More specifically, their binary layout is different. This is basically a very hidden ODR violation. Your object is allocated for the GCC 8 layout but constructed using the GCC 9 layout. When you then attempt to destroy it, the destructor uses the GCC 8 layout and crashes because the data is not what it expects.
There are two pieces of code which use different, incompatible layouts of the path type.
The first piece of code is from libstdc++.so.6.0.28: It contains a definition of path::_M_split_cmpts(), called via the inline constructor path::path(string_type&&, format). Since the constructor is inline, code for the constructor itself is generated into your executable. Your executable therefore contains a call to path::_M_split_cmpts.
The second piece of code is in your own executable: It generates instructions for the inline (defaulted) destructor path::~path(), and the inline functions it calls; all the way up to std::filesystem::__cxx11::path::path<char [21], std::filesystem::__cxx11::path>(char const (&) [21], std::filesystem::__cxx11::path::format).
How can we find this?
using a debugger: Stepping through suspicious functions in the ctor reveals:
That's a call through the PLT (so, potentially from a shared object, and definitely not inlined). We step into it and:
So we can see that it does indeed come from /lib/x86_64-linux-gnu/libstdc++.so.6, which is a symlink to /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28.
The dtor can be seen, for example, in the Valgrind output in the OP:
It's inline and therefore in the executable.
Now, the actually interesting part is that both the header which contains the inlined functions for path and the path::_M_split_cmpts function are from the GNU C++ Standard Library (libstdc++).
How can they be incompatible?
To answer this, let's take a look at the exact versions. We're compiling with GCC 8.4.0. It has baked-in include paths, and they refer to the standard library headers shipped in the gcc-8 package of Ubuntu 20.04. Those match perfectly, and you have to change default settings to make GCC consume different, mismatched standard library headers. The headers are therefore those of GCC 8.4.0.
What about the shared object libstdc++.so? We're running with libstdc++.so.6.0.28 according to ldd and the debugger. According to the libstdc++ ABI Policy and Guidelines, that's GCC >= 9.3.
libstdc++.so.6.0.28 does contain a definition of _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv:
According to the ABI doc, this is
So that's a symbol which was NOT available in GCC 8.4.0.
Why doesn't the compiler/linker complain?
When we compile with gcc-8, why doesn't the compiler or linker complain about us consuming a symbol from GCC 9?
If we compile with -v, we see the linker invocation:
In there, we have -L/usr/lib/gcc/x86_64-linux-gnu/8 and other paths to find the standard library. There, we find libstdc++.so -> ../../../x86_64-linux-gnu/libstdc++.so.6, which finally points to /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28 (!!!).
So the linker is given GCC 9's libstdc++.so, and it does NOT receive any version information on the symbol from the compiler (*). The compiler only knows the source code, and the source code does not contain a symbol version in this case (filesystem headers of GCC 8.4.0). The symbol version is however present in the ELF binary libstdc++.so. The linker sees GLIBCXX_3.4.26 for the symbol requested by the compiler, _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv, and is satisfied with that. Makes you wonder if there's a linker switch to tell the linker "don't consume a versioned symbol if I requested an unversioned symbol".
(*) The linker does not receive any symbol version information on that unresolved symbol from the compiler because the compiler has no such information from the source code. You can add info to your source code. I don't know how libstdc++ usually does it - or its policy on symbol versions in header files. It looks like it is not done at all for filesystem.
The ELF symbol versioning mechanism should usually prevent such incompatibilities: If there is a layout-incompatible change, you create a new symbol with the same name but a different version, and add it to libstdc++.so, which then contains both the old and the new version.
A binary compiled against libstdc++.so specifies which version of a symbol it wants, and the dynamic loader properly resolves the undefined symbols against symbols of matching name and version. Note that the dynamic linker does not know which shared library to search (on Windows/PE, this is different). Any "symbol request" is merely an undefined symbol, and there's a completely separate list of required libraries which shall provide those undefined symbols. But there's no mapping in the binary which symbol should come from which library.
Because the ELF symbol versioning mechanism allows backwards-compatible additions of symbols, we can maintain a single libstdc++.so for multiple versions of the compiler. That's why you see symlinks all over the place, all leading to the same file. The suffix .6.0.28 is another, orthogonal versioning scheme which allows backwards-incompatible changes: Your binary can specify that it needs libstdc++.so.6 and you can add an incompatible libstdc++.so.7 for other binaries.
Fun fact: If you linked your library against a pure GCC 8 version of libstdc++.so, you would have seen a linker error. Linking against a shared library doesn't do much to the binary; it does however fix the symbol versions of unresolved symbols and can check that no unresolved symbols are left after looking through all libraries. We can see that your binary actually requests _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@GLIBCXX_3.4.26 when you link it against libstdc++.so.6.0.28.
Fun fact 2: If you ran your library against a pure GCC 8 version of libstdc++.so, you would have received a dynamic linker error, because it cannot find _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv@GLIBCXX_3.4.26.
What should actually happen?
You should actually link to libstdc++fs.a. It also provides a definition of _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv, and it's not a symlink but specific to this GCC version: /usr/lib/gcc/x86_64-linux-gnu/8/libstdc++fs.a.
When you link against -lstdc++fs, you get its symbols included directly into the executable (since it's a static library). Symbols in the executable take priority over the symbols in shared objects. Therefore, the _ZNSt10filesystem7__cxx114path14_M_split_cmptsEv from libstdc++fs.a is used.
What's actually the incompatibility in layout in path?
GCC 9 introduced a different type to hold the components of the path. Using clang++ -cc1 -fdump-record-layouts, we can see the offset at the left side, and the member and type names at the right side:
GCC 8.4.0:
GCC 9.3.0:
The difference is in path::_M_cmpts:
You can also see the structure of path::_List in the record dump above. It's very much not compatible with a GCC 8 vector.
Remember that we're calling path::_M_split_cmpts via libstdc++.so from GCC 9, and we're crashing in the vector destructor for this _M_cmpts data member.
Here's the commit that changed from vector to _List: