Would someone please explain to me the bold parts?

I don't understand how *__vptr, which lives in the Base portion of the class and is reachable through dPtr, can all of a sudden point to the D1 virtual table instead of the Base virtual table. I have read some articles and gone through several other sources, and I am still confused.

class Base
{
public:
    virtual void function1() {};
    virtual void function2() {};
};
 
class D1: public Base
{
public:
    virtual void function1() {};
};
 
class D2: public Base
{
public:
    virtual void function2() {};
};
int main()
{
    D1 d1;
    Base *dPtr = &d1;
 
    return 0;
}

Note that because dPtr is a base pointer, it only points to the Base portion of d1. However, also note that *__vptr is in the Base portion of the class, so dPtr has access to this pointer. Finally, note that dPtr->__vptr points to the D1 virtual table! Consequently, even though dPtr is of type Base, it still has access to D1’s virtual table (through __vptr).

Source: https://www.learncpp.com/cpp-tutorial/125-the-virtual-table/comment-page-6/#comment-484189


To understand this, you first need to accept that an implementation of C++ is not the definition of C++.

C++ is defined by the behavior of an abstract machine. This abstract machine's behavior is defined by the standard, and a compliant C++ compiler must compile programs to run as-if they ran on that abstract machine.

The rules of what that abstract machine does are inspired by, and based on, real computers and real implementations of C++ and C programs.

So when you talk about "virtual function tables", you are talking about one common implementation of what C++ does with virtual methods. That common implementation doesn't define how C++ acts, and mixing the two up can cause problems.
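What the language does pin down is the observable behavior of a virtual call, whatever mechanism sits underneath. Here is a minimal sketch reusing the Base and D1 names from the question; the std::string return values and the call_function1/call_function2 helpers are my additions, there only to make the result easy to inspect.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string function1() const { return "Base::function1"; }
    virtual std::string function2() const { return "Base::function2"; }
};

struct D1 : Base {
    // D1 overrides function1 only; function2 is inherited from Base.
    std::string function1() const override { return "D1::function1"; }
};

// Hypothetical helpers: the static type of p is Base const*,
// but the call is resolved against the dynamic type of *p.
std::string call_function1(Base const* p) { return p->function1(); }
std::string call_function2(Base const* p) { return p->function2(); }
```

Passing the address of a D1 object to call_function1 runs D1's override, while call_function2 falls back to Base's version. How the compiler arranges that (a vtable pointer stored in the object, in the common case) is left to the implementation.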


That being said, C++'s virtual methods were based on doing basically the exact same thing in C. It can help to sketch out how virtual methods and inheritance would look in C++ if you re-implemented them by hand. (This has practical use: doing so lets you build a custom object model, and custom object models let you do certain things more efficiently than the C++ object model would.)

#include <iostream>

struct Bob_vtable {
  void(*print)(Bob const*) = 0;
};

struct Bob {
  Bob_vtable const* vtable = 0;
  int x = 0;

  // glue code to dispatch to vtable:
  void print() const {
    return vtable->print(this);
  }
  
  // implementation of Bob::print:
  static void print_impl( Bob const* self ) {
    std::cout << self->x;
  }

  // vtable helpers:
  static Bob_vtable make_vtable() {
    return { &Bob::print_impl };
  }
  static Bob_vtable const* get_vtable() {
    static const Bob_vtable retval = make_vtable();
    return &retval;
  }
  Bob():vtable(get_vtable()) {}
};

Here is a really simple, no-inheritance implementation of a class Bob with a single virtual method print. It roughly corresponds to:

class Bob {
public:
  int x = 0;
  virtual void print() const { std::cout << x; }
};

You can see why it is nice to have all of this glue code written for you.

When you do this:

class Alice : public Bob {
public:
  int y = 0;
  void print() const override { std::cout << x << "," << y; }
};

the "manual implementation" would look something like:

struct Alice : Bob {
  int y = 0;

  // no print glue code needed(!)

  // implementation of Alice::print:
  static void print_impl( Bob const* bobself ) {
    Alice const* self = static_cast<Alice const*>(bobself);
    std::cout << self->x << "," << self->y;
  }

  static Bob_vtable make_vtable() {
    Bob_vtable bob_version = Bob::make_vtable();
    bob_version.print = &Alice::print_impl;
    return bob_version;
  }
  static Bob_vtable const* get_vtable() {
    static const Bob_vtable retval = make_vtable();
    return &retval;
  }

  Alice():Bob() {
    // after constructing Bob, replace the vtable with ours:
    vtable = get_vtable();
  }
};

And there you have it.

Take a look at what happens here:

Alice a;
a.print();

Now, a.print actually calls Bob::print, because the manual Alice has no print method of its own.

Bob::print does this:

  void print() const {
    return vtable->print(this);
  }

It grabs the vtable pointer of this object instance, and calls the print function stored in it.

What is the vtable pointer of an object of type Alice? Look at the Alice constructor.

First it default-constructs Bob (which sets vtable to point at Bob's vtable), but then it does this:

    vtable = get_vtable();

This call to get_vtable is made inside Alice, so it resolves to Alice::get_vtable:

    static const Bob_vtable retval = make_vtable();
    return &retval;

which in turn calls Alice::make_vtable:

    Bob_vtable bob_version = Bob::make_vtable();
    bob_version.print = &Alice::print_impl;
    return bob_version;

which first calls Bob's make_vtable, then replaces .print with the Alice::print_impl.

So Bob::print calls vtable->print(this), which is Alice::print_impl(this), which does:

    Alice const* self = static_cast<Alice const*>(bobself);
    std::cout << self->x << "," << self->y;

While this is a Bob const* at this point, it is pointing at an Alice object, so that static_cast is valid.

So we print x and y from Alice.
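To tie this back to the question, the whole walk-through can be condensed into one compilable sketch. It follows the Bob/Alice code above, with two changes of mine that are not in the original: print takes a std::ostream& so the output can be captured, and print_via_bob is a hypothetical helper that calls through a Bob pointer.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Bob;

struct Bob_vtable {
    void (*print)(Bob const*, std::ostream&) = nullptr;
};

struct Bob {
    Bob_vtable const* vtable = nullptr;
    int x = 0;

    // Glue code: dispatch through whichever table this object points at.
    void print(std::ostream& os) const { vtable->print(this, os); }

    static void print_impl(Bob const* self, std::ostream& os) { os << self->x; }

    static Bob_vtable make_vtable() {
        Bob_vtable t;
        t.print = &Bob::print_impl;
        return t;
    }
    static Bob_vtable const* get_vtable() {
        static const Bob_vtable retval = make_vtable();
        return &retval;
    }
    Bob() : vtable(get_vtable()) {}
};

struct Alice : Bob {
    int y = 0;

    static void print_impl(Bob const* bobself, std::ostream& os) {
        // Valid only because this table is installed in Alice objects alone.
        Alice const* self = static_cast<Alice const*>(bobself);
        os << self->x << "," << self->y;
    }
    static Bob_vtable make_vtable() {
        Bob_vtable t = Bob::make_vtable();  // start from Bob's slots
        t.print = &Alice::print_impl;       // then override print
        return t;
    }
    static Bob_vtable const* get_vtable() {
        static const Bob_vtable retval = make_vtable();
        return &retval;
    }
    // After Bob's constructor has run, repoint the inherited member.
    Alice() : Bob() { vtable = get_vtable(); }
};

// Hypothetical helper: the caller only ever sees a Bob const*.
std::string print_via_bob(Bob const* b) {
    std::ostringstream os;
    b->print(os);
    return os.str();
}
```

The vtable member is declared in Bob, so a Bob* can always read it, but its value was written by the Alice constructor. That is the same situation as dPtr->__vptr in the question: the pointer lives in the base portion, and the most-derived constructor decides where it points.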

Now, here Alice's vtable type is Bob_vtable because she didn't add any new methods. If she added new methods, she would have an Alice_vtable that inherited from Bob_vtable, and would have to static_cast<Alice_vtable const*>(vtable) to access them.
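A rough sketch of that extended-table case, under my own assumptions: the new method is a hypothetical sum() that returns x + y, print takes a std::ostream& so it can be checked, and to_text is a helper I made up for inspecting the output. None of these names come from the original code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Bob;
struct Alice;

struct Bob_vtable {
    void (*print)(Bob const*, std::ostream&) = nullptr;
};

// Alice's table begins with Bob's slots and appends her own.
struct Alice_vtable : Bob_vtable {
    int (*sum)(Alice const*) = nullptr;
};

struct Bob {
    Bob_vtable const* vtable = nullptr;
    int x = 0;
    void print(std::ostream& os) const { vtable->print(this, os); }
    static void print_impl(Bob const* self, std::ostream& os) { os << self->x; }
    static Bob_vtable make_vtable() {
        Bob_vtable t;
        t.print = &Bob::print_impl;
        return t;
    }
    static Bob_vtable const* get_vtable() {
        static const Bob_vtable retval = make_vtable();
        return &retval;
    }
    Bob() : vtable(get_vtable()) {}
};

struct Alice : Bob {
    int y = 0;

    // Glue for the new method: the stored pointer is typed as Bob_vtable,
    // so it must be cast down to reach the extra slot.
    int sum() const {
        return static_cast<Alice_vtable const*>(vtable)->sum(this);
    }

    static void print_impl(Bob const* bobself, std::ostream& os) {
        Alice const* self = static_cast<Alice const*>(bobself);
        os << self->x << "," << self->y;
    }
    static int sum_impl(Alice const* self) { return self->x + self->y; }

    static Alice_vtable make_vtable() {
        Alice_vtable t;
        static_cast<Bob_vtable&>(t) = Bob::make_vtable();  // copy Bob's slots
        t.print = &Alice::print_impl;                      // override
        t.sum = &Alice::sum_impl;                          // new slot
        return t;
    }
    static Alice_vtable const* get_vtable() {
        static const Alice_vtable retval = make_vtable();
        return &retval;
    }
    Alice() : Bob() { vtable = get_vtable(); }  // converts to Bob_vtable const*
};

// Hypothetical helper for inspecting what print writes.
std::string to_text(Bob const& b) {
    std::ostringstream os;
    b.print(os);
    return os.str();
}
```

The downcast in Alice::sum is only safe because every Alice object installs an Alice_vtable. A class derived further from Alice would have to extend Alice_vtable in the same way.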

This isn't exactly what C++ does "under the hood", but it is about as logically identical as I can write "off the cuff". A myriad of details differ: the calling convention of the functions in the vtable is different, the format of the vtable in memory doesn't match this, and so on.


Now, in the 'manual implementation' I did use inheritance. So that isn't C; but the inheritance in the 'manual implementation' is not doing anything object oriented.

struct A {int x;}; 
struct B:A{int y;};

is just doing

struct A {
  int x;
}; 
struct B {
  A base;
  int y;
};

with a bit of syntactic glitter on top.
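A small sketch of what that glitter amounts to at the call site. B2 is my name for the composition spelling (so both versions can coexist in one file), and get_x and the two wrappers are hypothetical helpers.

```cpp
#include <cassert>

struct A { int x; };

// Inheritance spelling: the A part is an unnamed base subobject.
struct B : A { int y; };

// Composition spelling: the A part is a named member.
struct B2 {
    A base;
    int y;
};

// Any function written against A works with both.
int get_x(A const* a) { return a->x; }

// With inheritance the compiler converts B const* to A const* for you...
int x_of_inherited(B const& b) { return get_x(&b); }

// ...with composition you name the A part yourself.
int x_of_composed(B2 const& b) { return get_x(&b.base); }
```

The implicit pointer conversion and the ability to write b.x instead of b.base.x are what the inheritance syntax buys in this example.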

The "manual implementation" is nearly 1:1 with how you would implement this (and people do) in C. You would move the methods out of the class and call them void Bob_print(Bob const*) instead of void Bob::print() const. And you'd use struct Alice { Bob base; int y; } instead of struct Alice:Bob{ int y; };. But the difference is almost entirely syntax, not anything else.
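For illustration, here is roughly what that C-flavoured version might look like. It is written to compile as C++ (hence the std:: prefixes and reinterpret_cast), and several things are my assumptions rather than part of the description above: the functions write into a caller-supplied buffer instead of printing, and the names Bob_init, Alice_init, Bob_table, Alice_table and the *_print_impl functions are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

struct Bob;

struct Bob_vtable {
    void (*print)(Bob const*, char* out, std::size_t n);
};

struct Bob {
    Bob_vtable const* vtable;
    int x;
};

struct Alice {
    Bob base;  // first member, so an Alice and its Bob share an address
    int y;
};

void Bob_print_impl(Bob const* self, char* out, std::size_t n) {
    std::snprintf(out, n, "%d", self->x);
}

void Alice_print_impl(Bob const* bobself, char* out, std::size_t n) {
    // base is the first member of a standard-layout struct,
    // so a pointer to it can be converted back to the enclosing Alice.
    Alice const* self = reinterpret_cast<Alice const*>(bobself);
    std::snprintf(out, n, "%d,%d", self->base.x, self->y);
}

// One table per "class", shared by all of its objects.
const Bob_vtable Bob_table = { &Bob_print_impl };
const Bob_vtable Alice_table = { &Alice_print_impl };

// Hand-written "constructors".
void Bob_init(Bob* self, int x) {
    self->vtable = &Bob_table;
    self->x = x;
}
void Alice_init(Alice* self, int x, int y) {
    Bob_init(&self->base, x);
    self->base.vtable = &Alice_table;  // replace the table after the base is set up
    self->y = y;
}

// The "virtual call": free function instead of a member.
void Bob_print(Bob const* self, char* out, std::size_t n) {
    self->vtable->print(self, out, n);
}
```

Calling Bob_print(&alice.base, buf, sizeof buf) goes through Alice_table and ends up in Alice_print_impl, just as in the struct Alice : Bob version.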

When C++ was originally developed, OO-based C existed, and one of C++'s first goals was to be able to write C-with-classes without having to write all of the above boilerplate.

Now, C++'s object model does not require the above implementation. In fact, relying on the above implementation can result in ill-formed programs or undefined behavior. But understanding one possible way to implement C++'s object model has some use; plus, once you know how to implement C++'s object model, you can use different object models in C++.

Note that in modern C++, I'd use a lot more templates above to remove some of the boilerplate. As a practical use, I've used similar techniques to implement augmented versions of std::any with duck-typed virtual methods.

The result is you can get this syntax:

auto print = poly_method<void(Self const*, std::ostream&)>{
  [](auto const*self, std::ostream& os){ os << *self; }
};
poly_any<&print> x = 7;
x->*print(std::cout);

(don't try this at home).