C++ vtable [TBC..]
Nimrod’s post is the best explanation of the virtual table that I have found online. This post tries to answer a few questions:
- What is inside vtable?
- How does dynamic dispatch work?
- How does dynamic_cast work?
struct A { int x; virtual A* get_instance(){return this;} };
struct B { char y; };
struct C : public B, virtual public A { double z; virtual C* get_instance(){return this;} };
int main() {
A a = C();
return 0;
}
$ clang++ -Xclang -fdump-record-layouts -emit-llvm -c ~/tmp/test.cpp -std=c++17
*** Dumping AST Record Layout
0 | struct B
0 | char y
| [sizeof=1, dsize=1, align=1,
| nvsize=1, nvalign=1]
*** Dumping AST Record Layout
0 | struct A
0 | (A vtable pointer)
8 | int x
| [sizeof=16, dsize=12, align=8,
| nvsize=12, nvalign=8]
*** Dumping AST Record Layout
0 | struct C
0 | (C vtable pointer)
8 | struct B (base)
8 | char y
16 | double z
24 | struct A (virtual base)
24 | (A vtable pointer)
32 | int x
| [sizeof=40, dsize=36, align=8,
| nvsize=24, nvalign=8]
$ clang++ -Xclang -fdump-vtable-layouts -emit-llvm -c ~/tmp/test.cpp -std=c++17
Original map
Vtable for 'C' (8 entries).
0 | vbase_offset (24)
1 | offset_to_top (0)
2 | C RTTI
-- (C, 0) vtable address --
3 | C *C::get_instance()
4 | vcall_offset (-24)
5 | offset_to_top (-24)
6 | C RTTI
-- (A, 24) vtable address --
7 | C *C::get_instance()
[return adjustment: 0 non-virtual, -24 vbase offset offset] method: A *A::get_instance()
[this adjustment: 0 non-virtual, -24 vcall offset offset] method: A *A::get_instance()
Virtual base offset offsets for 'C' (1 entry).
A | -24
Thunks for 'C *C::get_instance()' (1 entry).
0 | return adjustment: 0 non-virtual, -24 vbase offset offset
this adjustment: 0 non-virtual, -24 vcall offset offset
VTable indices for 'C' (1 entries).
0 | C *C::get_instance()
Original map
Vtable for 'A' (3 entries).
0 | offset_to_top (0)
1 | A RTTI
-- (A, 0) vtable address --
2 | A *A::get_instance()
VTable indices for 'A' (1 entries).
0 | A *A::get_instance()
Thunk
The word thunk sounded strange when I first read it in the LLVM source code. Then ChatGPT told me:
The term “thunk” was first used in the 1960s in the BLISS programming language (developed at Carnegie Mellon University) and early compiler research. The name was chosen as a joke—an irregular past participle of “think,” meaning “something that has been thought of.” The original idea was that a thunk is something a compiler generates automatically to think for the programmer in handling certain complexities.
This story is not mentioned on Wikipedia, which instead traces the term to early ALGOL 60 compilers, but the idea is similar: a thunk is a subroutine used to inject a calculation into another subroutine. Either way, the word is popular in the compiler world!
Why do we need thunks in a vtable?
- Virtual inheritance
- Adjusting the this pointer in multiple inheritance
- Covariant return types
#include <iostream>
using namespace std;
struct A { virtual void foo() { cout << "A" << endl; } };
struct B { virtual void foo() { cout << "B" << endl; } };
struct C : public B, public A { virtual void foo() { cout << "C" << endl; } };
It compiles: C::foo overrides both B::foo and A::foo.
To see this, insert the following print statement here:
std::cout << Method->getParent()->getName().str() << ":" << Method->getName().str()
<< "|" << Overridden->getParent()->getName().str() << ":" << Overridden->getName().str() << std::endl;
Output:
C:foo|B:foo
C:foo|A:foo
The result is a map from each derived method to the list of base methods it overrides.
Record Layout
The main file is RecordLayoutBuilder.cpp
static_cast
Four types of casts. Most of the logic resides in SemaCast.cpp.
- downcast: cast base to derived
- upcast: cast derived to base (implicit, so no explicit cast is needed)
Rule 1: Cannot cast ‘Base *’ to ‘Derived *’ via virtual base
See the example below.
struct Base { virtual ~Base() {} };
// Virtual inheritance
struct Middle : virtual public Base {
int m;
};
struct Derived : public Middle {
int d;
};
int main() {
Derived d;
Base* basePtr = &d; // Upcasting works fine
Derived* derivedPtr = static_cast<Derived*>(basePtr); // COMPILATION ERROR
}
expr.static.cast#11 covers this rule. The corresponding clang implementation is here. With virtual inheritance, the offset of the virtual base subobject within the complete object is not known at compile time. It depends on the dynamic type of the object, so static_cast has no fixed offset to apply when going from the virtual base back to the derived class.