Linking error: construction vtable defined in a discarded section

The original issue is spread across hundreds of thousands of LoC from different projects. It involves a lot of ingredients: inline assembly, virtual inheritance, levels of indirection, different compilers and compiler options. (It's like a thriller.) I had a hard time simplifying it to this SSCCE:

// a.hpp
struct A {
    int i;
    ~A() { asm("" : "=r"(i)); }
};

struct B : public virtual A { };

struct C : public B { };

struct D {
    D(C);
};

// a.cpp
#include "a.hpp"

void f(C) {
}

D::D(C c) {
    f(c);
}

// main.cpp
#include "a.hpp"

int main() {
    C c;
    D d(c);
}

Build with these command lines:

g++ -O3 -fPIC -c a.cpp
clang++ -O3 -fPIC -c main.cpp
clang++ -fuse-ld=gold main.o a.o -o main

And the linker output is:

a.o:a.cpp:function D::D(C) [clone .cold]: error: relocation refers to global symbol "construction vtable for B-in-C", which is defined in a discarded section
  section group signature: "_ZTV1C"
  prevailing definition is from main.o
clang-10: error: linker command failed with exit code 1 (use -v to see invocation)

I believe there's a bug in either gcc, clang or gold. My question is where is it? (I guess it's gold but I want to be sure before reporting the bug.)

FWIW: As I said, all the ingredients are important and the issue goes away if, for instance, the asm is removed. More notable changes that make the issue go away are:

  1. Use the same compiler for all TUs. (It doesn't matter whether g++ or clang++.)
  2. Link with ld (i.e., remove -fuse-ld=gold)
  3. Compile main.cpp without -O3.
  4. Compile main.cpp without -fPIC.
  5. Swap a.o and main.o in the linker command line.

There is 1 best solution below

This appears to be a bug in GCC, but the inline asm, being outside the ABI, could simply render this an unfortunate incompatibility between GCC and Clang.

The problem is that the inline asm makes GCC think that A::~A() can raise an exception, so it creates an exception handling path in D::D(), which requires a construction vtable for B-in-C. GCC places that construction vtable in the COMDAT group that also contains the vtable for C (_ZTV1C).

Because GCC generates a construction vtable in the _ZTV1C COMDAT group and Clang does not, you end up in a situation where the linker might keep the Clang-generated COMDAT group and discard the GCC-generated version that has the construction vtable. The GCC-generated code in a.o still expects that extra symbol definition, so you get this error.

Reversing main.o and a.o in your link also works around the problem, since all three linkers will then keep the COMDAT group from a.o, it being the first one seen.
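The COMDAT selection described above can be sketched as a toy model. This is only an illustration of the "first group with a given signature prevails" rule under simplifying assumptions, not how gold, bfd ld, or lld are actually implemented. The names `ObjectFile`, `toy_link`, `clang_main_o`, and `gcc_a_o` are hypothetical, and the exact set of references made by main.o is assumed.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of an object file (hypothetical, heavily simplified).
struct ObjectFile {
    std::string name;
    // COMDAT group signature -> symbols defined inside that group.
    std::map<std::string, std::set<std::string>> groups;
    // Symbols that this object's relocations refer to.
    std::vector<std::string> references;
};

// Keeps the first group seen for each signature, drops later copies whole,
// and returns every reference to a symbol that no kept group defines.
std::vector<std::string> toy_link(const std::vector<ObjectFile>& inputs) {
    std::set<std::string> seen_signatures;
    std::set<std::string> defined;
    for (const ObjectFile& obj : inputs) {
        for (const auto& group : obj.groups) {
            // insert().second is true only for the first copy of a signature.
            if (seen_signatures.insert(group.first).second) {
                defined.insert(group.second.begin(), group.second.end());
            }
        }
    }
    std::vector<std::string> errors;
    for (const ObjectFile& obj : inputs) {
        for (const std::string& ref : obj.references) {
            if (defined.count(ref) == 0) {
                errors.push_back(obj.name + ": " + ref);
            }
        }
    }
    return errors;
}

// Assumed shape of Clang's main.o: the _ZTV1C group holds only the vtable.
ObjectFile clang_main_o() {
    ObjectFile o;
    o.name = "main.o";
    o.groups["_ZTV1C"].insert("_ZTV1C");
    o.references.push_back("_ZTV1C");
    return o;
}

// Assumed shape of GCC's a.o: the same group also holds the construction vtable.
ObjectFile gcc_a_o() {
    ObjectFile o;
    o.name = "a.o";
    o.groups["_ZTV1C"].insert("_ZTV1C");
    o.groups["_ZTV1C"].insert("_ZTC1C0_1B");
    o.references.push_back("_ZTV1C");
    o.references.push_back("_ZTC1C0_1B");
    return o;
}
```

In this model, `toy_link({clang_main_o(), gcc_a_o()})` reports the dangling reference `a.o: _ZTC1C0_1B`, while `toy_link({gcc_a_o(), clang_main_o()})` reports nothing, which matches the effect of swapping the objects on the command line.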

Here's the code GCC generates for D::D(), from a.o:

0000000000000002 <_ZN1DC1E1C>:
   2:   48 83 ec 18             sub    $0x18,%rsp
   6:   48 8b 06                mov    (%rsi),%rax
   9:   48 8b 40 e8             mov    -0x18(%rax),%rax
   d:   8b 04 06                mov    (%rsi,%rax,1),%eax
  10:   89 44 24 08             mov    %eax,0x8(%rsp)
  14:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax
            17: R_X86_64_REX_GOTPCRELX  _ZTV1C-0x4
  1b:   48 8d 40 18             lea    0x18(%rax),%rax
  1f:   48 89 04 24             mov    %rax,(%rsp)
  23:   48 89 e7                mov    %rsp,%rdi
  26:   e8 00 00 00 00          callq  2b <_ZN1DC1E1C+0x29>
            27: R_X86_64_PLT32  _Z1f1C-0x4
  2b:   eb 17                   jmp    44 <_ZN1DC1E1C+0x42>
  2d:   48 89 c7                mov    %rax,%rdi
  30:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax
            33: R_X86_64_PC32   _ZTC1C0_1B+0x14
  37:   48 89 04 24             mov    %rax,(%rsp)
  3b:   89 44 24 08             mov    %eax,0x8(%rsp)
  3f:   e8 00 00 00 00          callq  44
            40: R_X86_64_PLT32  _Unwind_Resume-0x4
  44:   48 83 c4 18             add    $0x18,%rsp
  48:   c3                      retq   

The code from offset 0x2d through the callq at 0x3f is the exception handling path, generated for when an exception happens during the call of f(c). The lea instruction at 0x30 is referencing an entry in the construction vtable for B-in-C (_ZTC1C0_1B).
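For readers unfamiliar with construction vtables, here is a minimal sketch of what they are for. While the B subobject of a C is being constructed or destroyed, the object must behave as a B, so its vptr temporarily points at a B-in-C table rather than at C's own vtable. The types in the question have no virtual functions (there the table carries the virtual base offset), so this sketch adds a hypothetical virtual `name()` purely to make the effect observable. `record` and `observe_dispatch` are illustrative names as well.

```cpp
#include <string>
#include <vector>

struct A;
std::vector<std::string> g_log;                 // collects what each phase observes
void record(const char* where, const A& obj);   // defined after A

struct A {
    virtual ~A() = default;
    virtual const char* name() const { return "A"; }
};

// Dispatches through the vptr that is installed at the time of the call.
void record(const char* where, const A& obj) {
    g_log.push_back(std::string(where) + " sees " + obj.name());
}

struct B : public virtual A {
    B() { record("B ctor", *this); }            // runs while the object is still "a B"
    ~B() override { record("B dtor", *this); }  // runs after the C part is gone
    const char* name() const override { return "B"; }
};

struct C : public B {
    const char* name() const override { return "C"; }
};

// Builds and destroys one C, returning the observations in order.
std::vector<std::string> observe_dispatch() {
    g_log.clear();
    {
        C c;
        record("complete C", c);
    }
    return g_log;
}
```

Calling `observe_dispatch()` yields "B ctor sees B", "complete C sees C", "B dtor sees B". The destructor case is the one relevant here: the cleanup path in D::D() destroys the by-value C copy, so it needs the B-in-C table once the C part has been torn down.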

Without that inline asm, GCC would have generated the same code as clang, with no exception handling path and no construction vtable necessary.

Compile with -fno-exceptions, and the problem also goes away.

I see the same problem whether compiling a.cpp at -O0, -O1, -O2, or -O3.

At least GCC is consistent when compiling a.cpp and main.cpp, so it could be argued that this case simply isn't covered by the C++ ABI, and GCC and Clang are free to treat it differently. I made some trivial attempts to reproduce with something other than inline asm, but could not.

As for why you're getting an error from gold but not from bfd ld or lld: gold is reporting what could have been a real error. In this particular case an exception could never have been thrown, so the exception handling code would never have executed. But when you link with bfd ld or lld, the lea instruction at 0x30 is left unrelocated, with no warning, and the program could conceivably crash if an exception were thrown during the call to f().