The original issue is spread across hundreds of thousands of LoC from different projects. It contains a lot of ingredients: inline assembly, virtual inheritance, levels of indirection, different compilers and compiler options. (It's like a thriller.) I had a hard time simplifying it to this SSCCE:
// a.hpp
struct A {
    int i;
    ~A() { asm("" : "=r"(i)); }
};
struct B : public virtual A { };
struct C : public B { };
struct D {
    D(C);
};

// a.cpp
#include "a.hpp"
void f(C) {
}
D::D(C c) {
    f(c);
}

// main.cpp
#include "a.hpp"
int main() {
    C c;
    D d(c);
}
Build with these command lines:
g++ -O3 -fPIC -c a.cpp
clang++ -O3 -fPIC -c main.cpp
clang++ -fuse-ld=gold main.o a.o -o main
And the linker output is:
a.o:a.cpp:function D::D(C) [clone .cold]: error: relocation refers to global symbol "construction vtable for B-in-C", which is defined in a discarded section
section group signature: "_ZTV1C"
prevailing definition is from main.o
clang-10: error: linker command failed with exit code 1 (use -v to see invocation)
I believe there's a bug in gcc, clang, or gold. My question is: where is it? (I guess it's gold, but I want to be sure before reporting the bug.)
FWIW: As I said, all the ingredients are important, and the issue goes away if, for instance, the asm is removed. More notable changes that make the issue go away are:
- Use the same compiler for all TUs. (It doesn't matter whether g++ or clang++.)
- Link with ld (i.e., remove -fuse-ld=gold).
- Compile main.cpp without -O3.
- Compile main.cpp without -fPIC.
- Swap a.o and main.o in the linker command line.
This appears to be a bug in GCC, but the inline asm, being outside the ABI, could simply render this an unfortunate incompatibility between GCC and Clang.
The problem is that the inline asm is making GCC think that A::~A() can raise an exception, so it creates an exception handling path in D::D(), which requires a construction vtable for B-in-C, which it places in the COMDAT group that also contains the vtable for C (_ZTV1C).
Because Clang does not generate a construction vtable in the _ZTV1C COMDAT group, but GCC does, you end up in a situation where the linker might keep the Clang-generated COMDAT group and discard the GCC-generated version that has the construction vtable. If you link with the GCC-generated code that expects the extra symbol definition, you get this error.
Reversing main.o and a.o in your link also works around the problem, since all three linkers (gold, bfd ld, and lld) will then keep the COMDAT group from a.o, it being the first one seen.
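For readers who have not met construction vtables before: while the B part of a C is being constructed or destroyed, the object has to behave as a B, and because B has a virtual base, the Itanium C++ ABI uses a separate "construction vtable for B-in-C" for that phase. The sketch below only shows the observable language behavior that such a table supports. The types Base, Mid, and Leaf and the helper ask() are hypothetical names for illustration, not the types from the question, and virtual functions are added here purely to make the effect visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only; they are not the A/B/C from the question.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

// Written as an ordinary virtual call through a reference to the virtual base.
inline std::string ask(const Base& b) { return b.name(); }

struct Mid : public virtual Base {
    std::string seen_in_ctor;  // what name() resolved to while Mid() was running
    Mid() { seen_in_ctor = ask(*this); }
    std::string name() const override { return "Mid"; }
};

struct Leaf : public Mid {
    std::string name() const override { return "Leaf"; }
};

// Builds a Leaf and checks how it behaved during and after construction.
inline void check_construction_dispatch() {
    Leaf leaf;
    assert(leaf.seen_in_ctor == "Mid");  // the Mid-in-Leaf subobject acted as a Mid
    assert(leaf.name() == "Leaf");       // the finished object acts as a Leaf
}
```

Calling check_construction_dispatch() from main() should pass both assertions on a conforming compiler. Whether a given compiler emits a construction vtable for a particular hierarchy, and into which COMDAT group, is an implementation detail this sketch does not show.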
Here's the code GCC generates for D::D(), from a.o:
The code from offset 0x2d through the callq at 0x3f is the exception handling path, generated for when an exception happens during the call of f(c). The lea instruction at 0x30 is referencing an entry in the construction vtable for B-in-C (_ZTC1C0_1B).
Without that inline asm, GCC would have generated the same code as Clang, with no exception handling path and no construction vtable necessary.
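As background on why a call like f(c) can need an exception handling path at all: the by-value argument is a copy with a non-trivial destructor, and that copy has to be destroyed if the callee throws. The following is a minimal sketch of that language-level behavior, with hypothetical names (Tracked, may_throw, Holder) standing in for C, f, and D. It counts destructor calls and says nothing about the exact code GCC or Clang emit.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for C: a type whose destructor has an observable effect.
struct Tracked {
    static int destroyed;
    Tracked() = default;
    Tracked(const Tracked&) = default;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Hypothetical stand-in for f(C), except that this one really does throw.
inline void may_throw(Tracked) { throw std::runtime_error("boom"); }

// Hypothetical stand-in for D: the constructor passes its parameter on by value.
struct Holder {
    explicit Holder(Tracked t) { may_throw(t); }
};

// Returns how many Tracked objects were destroyed while the exception unwound.
inline int destroyed_after_failed_construction() {
    Tracked::destroyed = 0;
    Tracked original;  // still alive when the count is read below
    try {
        Holder h(original);
    } catch (const std::runtime_error&) {
        // Swallow the exception; only the cleanup count matters here.
    }
    return Tracked::destroyed;
}

inline void check_cleanup_runs() {
    // Two copies are cleaned up: Holder's parameter and may_throw's argument.
    assert(destroyed_after_failed_construction() == 2);
}
```

In the question's code f() never throws, so the equivalent cleanup path in D::D() is never taken, but the compiler of a.cpp still has to decide whether to emit it.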
Compile with --no-exceptions, and the problem also goes away.
I see the same problem whether compiling a.cpp at -O0, -O1, -O2, or -O3.
At least GCC is consistent when compiling a.cpp and main.cpp, so it could be argued that this case simply isn't covered by the C++ ABI, and GCC and Clang are free to treat it differently. I made some trivial attempts to reproduce with something other than inline asm, but could not.
As for why you're getting an error from gold but not from bfd ld or lld: gold is reporting what could have been a real error, though in this particular case, since an exception could never have been thrown, the exception handling code would never have executed. When you link with bfd ld or lld, the lea instruction at 0x30 is left unrelocated, with no warning, and the program could conceivably crash in the case of an exception being thrown during the call to f().