Post

C++ vtable [TBC..]

Nimrod’s post is the best blog I found online that explains virtual table. This post tries my best to answer a few questions.

  1. What is inside vtable?
  2. How does dynamic dispatch work?
  3. How does dynamic_cast works?
1
2
3
4
5
6
7
8
struct A { int x; virtual A* get_instance(){return this;}  };
struct B { char y; };
struct C : public B, virtual public A { double z; virtual C* get_instance(){return this;} };

int main() {
  A a = C();
  return 0;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
$ clang++ -Xclang -fdump-record-layouts -emit-llvm -c ~/tmp/test.cpp -std=c++17

*** Dumping AST Record Layout
         0 | struct B
         0 |   char y
           | [sizeof=1, dsize=1, align=1,
           |  nvsize=1, nvalign=1]

*** Dumping AST Record Layout
         0 | struct A
         0 |   (A vtable pointer)
         8 |   int x
           | [sizeof=16, dsize=12, align=8,
           |  nvsize=12, nvalign=8]

*** Dumping AST Record Layout
         0 | struct C
         0 |   (C vtable pointer)
         8 |   struct B (base)
         8 |     char y
        16 |   double z
        24 |   struct A (virtual base)
        24 |     (A vtable pointer)
        32 |     int x
           | [sizeof=40, dsize=36, align=8,
           |  nvsize=24, nvalign=8]

$ clang++ -Xclang -fdump-vtable-layouts -emit-llvm -c ~/tmp/test.cpp -std=c++17

Original map
Vtable for 'C' (8 entries).
   0 | vbase_offset (24)
   1 | offset_to_top (0)
   2 | C RTTI
       -- (C, 0) vtable address --
   3 | C *C::get_instance()
   4 | vcall_offset (-24)
   5 | offset_to_top (-24)
   6 | C RTTI
       -- (A, 24) vtable address --
   7 | C *C::get_instance()
       [return adjustment: 0 non-virtual, -24 vbase offset offset] method: A *A::get_instance()
       [this adjustment: 0 non-virtual, -24 vcall offset offset] method: A *A::get_instance()

Virtual base offset offsets for 'C' (1 entry).
   A | -24

Thunks for 'C *C::get_instance()' (1 entry).
   0 | return adjustment: 0 non-virtual, -24 vbase offset offset
       this adjustment: 0 non-virtual, -24 vcall offset offset

VTable indices for 'C' (1 entries).
   0 | C *C::get_instance()

Original map
Vtable for 'A' (3 entries).
   0 | offset_to_top (0)
   1 | A RTTI
       -- (A, 0) vtable address --
   2 | A *A::get_instance()

VTable indices for 'A' (1 entries).
   0 | A *A::get_instance()

Thunk

The word thunk sounds so strange when I first read it in LLVM source code. Then Chatgpt told me

The term “thunk” was first used in the 1960s in the BLISS programming language (developed at Carnegie Mellon University) and early compiler research. The name was chosen as a joke—an irregular past participle of “think,” meaning “something that has been thought of.” The original idea was that a thunk is something a compiler generates automatically to think for the programmer in handling certain complexities.

This story is not mentioned in wikipedia, but the idea is similar. A thunk is a subroutine used to inject a calculation into another subroutine. Anyway, this word is popular in compiler world!

Why we need thunk in vtable?

  • Virtual inheritance
  • Adjusting the this Pointer in Multiple Inheritance
  • Covariant return types
1
2
3
struct A { virtual void  foo() { cout << "A" << endl; } };
struct B{ virtual void  foo() { cout << "B" << endl; } };
struct C : public B, public A { virtual void  foo() { cout << "C" << endl; } };

It compiles

Insert here

1
2
std::cout << Method->getParent()->getName().str() << ":" << Method->getName().str()
  << "|" << Overridden->getParent()->getName().str() << ":"  << Overridden->getName().str() << std::endl;

output

1
2
C:foo|B:foo
C:foo|A:foo

It is a map from the derived method to the list of base/overridden methods.

alignment

new size.

Record Layout

The main file is RecordLayoutBuilder.cpp

This post is licensed under CC BY 4.0 by the author.