
12.5 — The virtual table

To implement virtual functions, C++ implementations typically use a special form of late binding known as the virtual table. The virtual table is a lookup table of functions used to resolve function calls in a dynamic/late binding manner. The virtual table sometimes goes by other names, such as “vtable”, “virtual function table”, “virtual method table”, or “dispatch table”.

Because knowing how the virtual table works is not necessary to use virtual functions, this section can be considered optional reading.

The virtual table is actually quite simple, though it’s a little complex to describe in words. First, every class that uses virtual functions (or is derived from a class that uses virtual functions) is given its own virtual table. This table is simply a static array that the compiler sets up at compile time. A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry in this table is simply a function pointer that points to the most-derived function accessible by that class.

Second, the compiler also adds a hidden pointer as a member of the base class, which we will call *__vptr. *__vptr is set (automatically) when a class instance is created so that it points to the virtual table for that class. Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer stored in the object. Consequently, it makes each object of the class bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.
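The size cost of the hidden pointer can be observed with sizeof. This is a minimal sketch; Plain, Polymorphic, and overheadInBytes() are hypothetical names that do not appear in the lesson, and the exact numbers depend on the compiler and platform.

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct Plain
{
    int value{};
    void function() {}          // non-virtual: adds nothing to the object
};

struct Polymorphic
{
    int value{};
    virtual void function() {}  // virtual: each object now carries a hidden vtable pointer
};

// Extra bytes per object caused by the virtual function.
// Typical, not guaranteed: 4 on a 32-bit build, 12 on a 64-bit build
// (an 8-byte pointer plus 4 bytes of alignment padding).
constexpr std::size_t overheadInBytes()
{
    return sizeof(Polymorphic) - sizeof(Plain);
}
```

On mainstream compilers overheadInBytes() is at least the size of one pointer; the language standard itself does not promise any particular value.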

By now, you’re probably confused as to how these things all fit together, so let’s take a look at a simple example:
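A minimal sketch of such a hierarchy, reconstructed from the description that follows: Base declares function1() and function2(), D1 overrides only function1(), and D2 overrides only function2(). The std::string return values are an assumption added here so that the function actually called can be observed.

```cpp
#include <string>

class Base
{
public:
    virtual std::string function1() const { return "Base::function1"; }
    virtual std::string function2() const { return "Base::function2"; }
};

class D1 : public Base
{
public:
    // Overrides function1() only; function2() is inherited from Base.
    std::string function1() const override { return "D1::function1"; }
};

class D2 : public Base
{
public:
    // Overrides function2() only; function1() is inherited from Base.
    std::string function2() const override { return "D2::function2"; }
};
```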

Because there are 3 classes here, the compiler will set up 3 virtual tables: one for Base, one for D1, and one for D2.

The compiler also adds a hidden pointer to the most base class that uses virtual functions. Although the compiler does this automatically, we’ll put it in the next example just to show where it’s added:
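A sketch of where that pointer conceptually lives. It is written as a comment because the real pointer has no name and cannot be declared from C++ code; VirtualTable is a made-up type name used only for illustration, and the empty function bodies are placeholders.

```cpp
class Base
{
public:
    // VirtualTable* __vptr;  // conceptually inserted here by the compiler (not real code)
    virtual void function1() {}
    virtual void function2() {}
};

class D1 : public Base  // inherits the hidden pointer along with the rest of Base
{
public:
    void function1() override {}
};

class D2 : public Base  // likewise inherits the hidden pointer
{
public:
    void function2() override {}
};
```

Because D1 and D2 add no data members of their own, on typical implementations all three classes have the same size: that of the single hidden pointer.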

When a class object is created, *__vptr is set to point to the virtual table for that class. For example, when an object of type Base is created, *__vptr is set to point to the virtual table for Base. When objects of type D1 or D2 are constructed, *__vptr is set to point to the virtual table for D1 or D2 respectively.

Now, let’s talk about how these virtual tables are filled out. Because there are only two virtual functions here, each virtual table will have two entries (one for function1(), and one for function2()). Remember that when these virtual tables are filled out, each entry is filled out with the most-derived function an object of that class type can call.

The virtual table for Base objects is simple. An object of type Base can only access the members of Base. Base has no access to D1 or D2 functions. Consequently, the entry for function1 points to Base::function1(), and the entry for function2 points to Base::function2().

The virtual table for D1 is slightly more complex. An object of type D1 can access members of both D1 and Base. However, D1 has overridden function1(), making D1::function1() more derived than Base::function1(). Consequently, the entry for function1 points to D1::function1(). D1 hasn’t overridden function2(), so the entry for function2 will point to Base::function2().

The virtual table for D2 is similar to D1, except the entry for function1 points to Base::function1(), and the entry for function2 points to D2::function2().

Here’s a picture of this graphically:

Although this diagram is kind of crazy looking, it’s really quite simple: the *__vptr in each class points to the virtual table for that class. The entries in the virtual table point to the most-derived version of each function that objects of that class are allowed to call.
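The same relationships can also be written out by hand. The sketch below is an emulation in ordinary C++, not what any particular compiler literally generates; VTable, Object, the three table objects, and the callFunction helpers are hypothetical names introduced for illustration.

```cpp
#include <string>

struct Object;  // forward declaration: table entries take the object as a parameter

// One slot per virtual function, in a fixed order.
struct VTable
{
    std::string (*function1)(const Object&);
    std::string (*function2)(const Object&);
};

// Stand-in for any object in the hierarchy: just the pointer the compiler would hide.
struct Object
{
    const VTable* vptr;
};

// Free functions standing in for the member function bodies.
std::string baseFunction1(const Object&) { return "Base::function1"; }
std::string baseFunction2(const Object&) { return "Base::function2"; }
std::string d1Function1(const Object&)   { return "D1::function1"; }
std::string d2Function2(const Object&)   { return "D2::function2"; }

// One static table per class; each slot holds the most-derived function for that class.
const VTable baseTable{ baseFunction1, baseFunction2 };
const VTable d1Table  { d1Function1,   baseFunction2 };
const VTable d2Table  { baseFunction1, d2Function2   };

// What a virtual call boils down to: follow vptr, pick the slot, call through it.
std::string callFunction1(const Object& object) { return object.vptr->function1(object); }
std::string callFunction2(const Object& object) { return object.vptr->function2(object); }
```

An Object whose vptr is set to &d1Table behaves like a D1: callFunction1() reaches the D1 version, while callFunction2() falls back to the Base version.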

So consider what happens when we create an object named d1 of type D1.

Because d1 is a D1 object, d1 has its *__vptr set to the D1 virtual table.

Now, let’s set a Base pointer named dPtr to point to d1 (Base* dPtr = &d1;).

Note that because dPtr is a base pointer, it only points to the Base portion of d1. However, also note that *__vptr is in the Base portion of the class, so dPtr has access to this pointer. Finally, note that dPtr->__vptr points to the D1 virtual table! Consequently, even though dPtr is of type Base*, it still has access to D1’s virtual table (through __vptr).

So what happens when we try to call dPtr->function1()?

First, the program recognizes that function1() is a virtual function. Second, the program uses dPtr->__vptr to get to D1’s virtual table. Third, it looks up which version of function1() to call in D1’s virtual table. This has been set to D1::function1(). Therefore, dPtr->function1() resolves to D1::function1()!

Now, you might be saying, “But what if the Base pointer really pointed to a Base object instead of a D1 object? Would it still call D1::function1()?” The answer is no.

Suppose we create a Base object named b and set a Base pointer named bPtr to point to it. In this case, when b is created, __vptr points to Base’s virtual table, not D1’s virtual table. Consequently, bPtr->__vptr will also be pointing to Base’s virtual table. Base’s virtual table entry for function1() points to Base::function1(). Thus, bPtr->function1() resolves to Base::function1(), which is the most-derived version of function1() that a Base object should be able to call.
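The two cases can be put side by side in a small sketch. The class definitions follow the description in this lesson; the std::string return values and the helper callFunction1() are assumptions added so the result of each call is visible.

```cpp
#include <string>

class Base
{
public:
    virtual std::string function1() const { return "Base::function1"; }
    virtual std::string function2() const { return "Base::function2"; }
};

class D1 : public Base
{
public:
    std::string function1() const override { return "D1::function1"; }
};

// Calls function1() through a Base pointer. Which version runs depends on the
// virtual table of the object pointed to, not on the type of the pointer.
std::string callFunction1(const Base* ptr)
{
    return ptr->function1();
}
```

Given D1 d1; Base* dPtr = &d1; and Base b; Base* bPtr = &b;, callFunction1(dPtr) returns "D1::function1" and callFunction1(bPtr) returns "Base::function1".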

By using these tables, the compiler and program are able to ensure function calls resolve to the appropriate virtual function, even if you’re only using a pointer or reference to a base class!
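The same holds for references. In the sketch below, describe() is a hypothetical helper that knows only about Base, yet it reports the correct pair of functions for each kind of object; the std::string return values are again an assumption made for illustration.

```cpp
#include <string>

class Base
{
public:
    virtual std::string function1() const { return "Base::function1"; }
    virtual std::string function2() const { return "Base::function2"; }
};

class D1 : public Base
{
public:
    std::string function1() const override { return "D1::function1"; }
};

class D2 : public Base
{
public:
    std::string function2() const override { return "D2::function2"; }
};

// Takes any object in the hierarchy by reference to Base and reports
// which version of each virtual function was reached.
std::string describe(const Base& object)
{
    return object.function1() + " / " + object.function2();
}
```

Passing a D2 object yields "Base::function1 / D2::function2", which matches the D2 virtual table described above.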

Calling a virtual function is slower than calling a non-virtual function for a couple of reasons: First, we have to use the *__vptr to get to the appropriate virtual table. Second, we have to index the virtual table to find the correct function to call. Only then can we call the function. As a result, we have to do 3 operations to find the function to call, as opposed to 2 operations for a normal indirect function call, or one operation for a direct function call. However, with modern computers, this added time is usually fairly insignificant.

Also as a reminder, any class that uses virtual functions has a __vptr, and thus each object of that class will be bigger by one pointer. Virtual functions are powerful, but they do have a performance cost.


185 comments to 12.5 — The virtual table

  • "each object of that class will be bigger by one pointer." I do not quite understand the reason for this. Would you be kind enough to explain again?

    • nascardriver

      Hi Aron!

      When a class uses virtual functions, a virtual function table (vtable) is generated somewhere in memory. In order for each object of the class to know which functions to use, it needs to store a pointer to its vtable.

  • At some point you mentioned that this OOP style of dPtr->function1() gets transformed behind the scenes by the compiler into a declarative call to a method like function1(dPtr) where dPtr now becomes the *this pointer accessible in the class's methods.

    But here dPtr is of type Base*, while the most derived function called here through the vtable, function1, is from D1. So naturally the this pointer passed as an argument to D1::function1() that will become the *this pointer has to be of type D1*, not Base* (as dPtr currently is in code above). This cast from Base -> D1 is from less specific to a more specific, so an explicit cast has to be performed by the compiler for this to work, right? Does compiler just cast it since it knows it will work, since it wouldn't have been able to discover the D1::function1() in the vtable if the instance wouldn't have been of "real" type D1 in the first place?

    • Alex

      Good question.

      One of two things is likely happening. It's possible that the compiler is tweaking the function call to ((D1*)dPtr)->function1() and then resolving that as normal.

      It's also possible that the compiler is resolving dPtr->function1() using the D1 virtual table and then casting dPtr to a D1* at the point where D1::function1() is called.

      Either way, dPtr is being cast to a D1* so that this will be a D1*, not a Base*.

      • Trevor29

        Hi Alex
        With no disrespect intended, I believe there may be another explanation.
        The code calling function1() may not be aware of the D1 derived class (i.e. was compiled with the base class header file and not with the D1 class header file), although it would need to be aware that function1() was a virtual function. Base::function1() may also be unaware of the D1 derived class. Therefore the *this pointer passed to any function1() will be a *Base pointer, not a *D1 pointer. D1::function1() when called will perform any necessary type casting, offset adjustment or other tweaking to convert the *Base pointer into a *D1 pointer, as it is aware of both the Base class and the D1 derived class, since it is part of the D1 class. This tweaking is performed by the code in D1::function1() and therefore is performed after the call to function1(), (or when D1::function1() is compiled).
        Does this make sense?
        What do you think?
        Cheers
        Trevor

        • Alex

          That doesn't make sense to me, as I would expect any tweaking to happen before or at the point of the function call resolution, not after.

          This area is enough outside of my areas of expertise that it's hard for me to speculate on how it's all implemented. There's some additional interesting reading that you can do here that also implies that "this" is converted from a B to a D1 prior to the virtual function call actually being invoked.

          • Trevor29

            Hi Alex.

            Thanks for the link, but it didn't really clarify how it all works. hamstergene's reply mentioned multiple entry points applying different offsets, and this does suggest that the offsets which may be required to convert pointers to the base class to pointers to a derived class are applied after the function call.

            I am thinking of a more complex example. If we have a base class (Base) and two derived classes (D1 and D2) and the main code has an array of pointers to the base class (dPtr[]), which we populate with a pointer to an instance of each class, then we can have a loop which iterates over this array. If we have a virtual function in the Base class (Base::function1()) and override this in the D1 class with D1::function1() but don't override it in the D2 class, then we can call function1(dPtr[index]) from within the loop - using dPtr[index]->function1() - where "index" is the loop variable.

            If you try implementing this with a conversion before the function call, the compiler will need to include code to test each array element to see whether the call will require a conversion from *Base to *D1 or *D2. But worse, the compiler will also need to "know" that class D2 has no D2::function1() and therefore NO conversion must be performed from *Base to *D2, as Base::function1() can't take a *D2 input. And all this is required for each place function1() is called.

            If you try implementing it with the conversion performed after the function call, the calling code within the loop becomes trivial - function1() always is passed the value of dPtr[index], which is a *Base pointer.  The called code is also simple - Base::function1() is unaffected, and D1::function1() always performs a conversion from *Base to *D1, although in most cases this conversion may be trivial. When function1() is called with the element of class D2, Base::function1() is called with the value of dPtr[] which is a *Base pointer so this will be handled normally too.

            I will try to type up a suitable code example to make this clearer.

            Thanks, Trevor

            • Trevor29

              Example for the above comment:

            • Hi Trevor!

              I read the conversation twice now but I still don't quite understand what the question is. Can you sum it up in one or two questions, maybe using the example code you just provided?

              • Trevor29

                Hi Nascar
                Essentially Andrei's question - when virtual function1() is called, how does the code under the hood resolve the problem that Base::function1() needs a "this" pointer of type *Base while D1::function1() needs a "this" pointer of type *D1? Alex believes that the calling code performs the conversion if required (if I understand him correctly), but I believe it actually makes sense that D1::function1() performs the conversion after the call.

                Just to confuse the issue further, I have realised that D1::function1() could be called from within the D1 class, in which case it would be called with a *D1 "this" pointer, and so could need two entry points. (While C++ does not as far as I know support functions with two entry points, the code generated by a C++ compiler could certainly do so, particularly if one entry point falls through into the other entry point having set up or modified a parameter.) This would be similar to function overloading in some respects, in that there would be two versions of D1::function1() with different "this" parameter types. In pure C++ code, a similar result can be achieved by having a simple D1::function1() routine that takes a *Base "this" pointer simply convert the pointer to a *D1 pointer and call the non-virtual D1::function() that other parts of the D1 class can call, with a *D1 "this" pointer.

                Or does a call to D1::function1() from within the D1 class pass a *Base "this" pointer to the function?

                Trevor

                • The pointer always points to the same object, rightfully so, because there is only one object. The type aspect of that pointer is only relevant as long as you're writing C++ code. Conversions are only required inside C++.
                  The compiler knows @D1 is of type @Base, so it allows using a @D1 pointer wherever a @Base pointer is required.
                  Under the hood, it doesn't matter what type a pointer is, as your code will be compiled to assembly, which is a typeless language. A pointer is a pointer no matter what it points to; it doesn't have a type. This means that no conversion has to be performed at any time during the call process.

                  Each member function has exactly one entry point. The first parameter is a pointer, the type doesn't matter, because there are no types.

                  Applying this to your example code from the previous comment,
                  @Base has the vTable

                  @D1 has the vTable

                  @D2 has the vTable

                  When your loop runs it will call the function at index 0 in the vTable of the object you're calling the function on. The function index is the same for every @function1, because they're overriding a parent function (I don't know what happens with multiple inheritance). The called function receives a pointer to the object it's called on, again, no types.
                  Let's say you're calling @Base::function1 from @D1::function1

                  @Base::function1 will receive the very same pointer that @D1::function1 received when it was called (This will be a pointer to a @D1 object, unless you're intentionally passing wrong pointers). No conversions are performed, because pointers don't have types.
                  Now, imagine @Base::function1 called

                  it would call @D1::function1, because @this points to a @D1 object.
                  Run time conversions checks are only performed when you use @dynamic_cast afaik.

                  I hope this answer is at least kind of what you asked for, if it's not, let me know.

                  TL;DR: Pointers don't have types after compilation, so there aren't any conversions.

                • Trevor29

                  Hi nascardriver - thanks for the reply.
                  I was thinking of multiple inheritance as well.
                  With a single inheritance, I expect that the inherited part of the object will be placed first in memory, so the pointer to the D1 object (d1) will also point to the Base part of the object, so as you say no conversion is needed. However this would be implementation dependent and therefore not guaranteed.
                  However with multiple inheritance, the second base part can't be put in the same position as the first base part, so a pointer to that second base part will have a different value to a pointer to the whole object. The difference will be an offset and the likely value of that offset is the size of the first base part. The conversion between a pointer to the base part and a pointer to the whole object is the addition or subtraction of this offset.
                  I agree with your comment about not having any type conversions after compilation.
                  So essentially what I am saying about the multiple inheritance case is that although the "this" pointer always points to the object, it might not point to the start of the object.
                  Cheers
                  Trevor

        • I can't reply to your latest comment, because there were too many replies.
          After some testing and reading, your comment seems to be correct.
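The offset Trevor describes for multiple inheritance can be observed directly. This is a sketch under stated assumptions: First, Second, Both, and secondOffset() are hypothetical names, and the object layout is implementation-defined, so the exact offset varies between compilers.

```cpp
#include <cstddef>
#include <string>

// Hypothetical classes for illustration; they are not part of the lesson's example.
struct First
{
    int firstData{ 1 };
    virtual std::string first() const { return "First::first"; }
};

struct Second
{
    int secondData{ 2 };
    virtual std::string second() const { return "Second::second"; }
};

struct Both : First, Second
{
    int bothData{ 3 };
    // Reached through a Second*, yet it needs the whole Both object to read bothData.
    std::string second() const override
    {
        return "Both::second, bothData=" + std::to_string(bothData);
    }
};

// Byte distance from the start of the whole object to its Second subobject.
std::ptrdiff_t secondOffset(const Both& object)
{
    const Second* asSecond = &object;  // derived-to-base conversion adjusts the address
    return reinterpret_cast<const char*>(asSecond)
         - reinterpret_cast<const char*>(&object);
}
```

On mainstream compilers secondOffset() is non-zero, so a Second* into a Both object does not point at the start of the object. A virtual call through that pointer still reaches Both::second() with a correctly adjusted this; how the adjustment is made is up to the implementation, and a common approach is a small adjusting stub (a thunk) stored in the virtual table.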

  • Daniel

    Thank you for the wonderful explanation. When I learned it in school, I had no idea what it was all about, but you cleared it all up.

  • skv

    Good article.
    I have a doubt regarding the vtable in derived class. Since object model of Derived class contains Base class part also, do we have a way out to get the vptr for the Base class from the Derived class? And if possible could you please cover the C-style to get the vptr using pointer casting from the object address?

    Thanks

    • Alex

      I'm not sure what you mean by "get the vptr" -- the vptr isn't directly accessible. Are you asking how you'd use a Base class's virtual table with a Derived object? If so, I suppose you could always slice your object via a static_cast.

      I'm not sure why you'd want to do this, though.

      • skv

        I meant to dig it down at object model, just for understanding how things work. I know that vptr is not directly accessible, but there must be some way out. And yes, I meant to access Base class methods from some pointer (not exactly from the C++ styled slicing but C styled pointer and casting). For eg. something like

        Derived d;
        int* vp = *(int**)&d;

        And using this vp to call upon the functions.

  • Omri

    "Because d1 is a D1 object, d has its *__vptr set to the D1 virtual table."
    perhaps:
    "Because d1 is a D1 object, d1 has its *__vptr set to the D1 virtual table."

  • Shikhar Chaudhary

    The most beautiful explanation u can find on the internet!

  • Bobix

    Hi Alex,
       Thanks for your wonderful explanation. It helped me a lot to get a clear picture of how the Vtable is organised. But could you please help me out with the structure of the Vtable if the methods in the base class are pure virtual?

    • Alex

      This is discussed in the next lesson.

      • Bobix

        Thanks Alex. One more doubt: I have seen the code below on a website under the section ‘cross delegation/Delegating to a sister class’. In this case, when I create an object of type class D, where will the function pointers for ‘function1()’ and ‘function2()’ in the ‘A’ part of the object ‘D’ (vptrA in the diagram below) point to? Will they point to the corresponding functions in the child classes B and C? I am not able to get a clear understanding of the memory layout in this case. I am including a diagram of my understanding, though I am not sure whether it is correct. Could you please help me a bit?

        Below diagram is just my understanding, I am not sure whether that is correct or not 🙁

        • Alex

          As you've noticed, virtual tables for virtual classes are _very_ complicated, involving Thunks and pointers to typeinfo and other stuff. To be honest, I've never bothered to learn how these work at the virtual table level, because it didn't seem worth the time investment -- virtual classes are rare, and it's even rarer that you need to dig down to this level of detail. So you'll have to find this information elsewhere. My apologies.

  • typo_man

    "Because d1 is a D1 object, d has it’s *__vptr set to the D1 virtual table."

    it should have been "d1 has it's"

  • manikanth

    How many vptrs will be created if 10 objects of a single class are created?

    As per my understanding it creates 10 vptrs, but when I check through this program it gives the size as 4 bytes. Is this the right way to check? Please clarify.

    • Alex

      Each object of type A will have one virtual pointer. Therefore, a1 has a virtual pointer, and a2 has a virtual pointer, etc...

      sizeof(A) will return the size of class A. sizeof(a1) will return the size of object a1 (of type A). They should both return 4 on a build where pointers are 4 bytes, as the only data in the class is the virtual pointer.
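A sketch of the kind of check discussed here, since the original program is not shown. A is assumed to be a class with one virtual function and no data members; the array and the three constants are hypothetical names added for illustration.

```cpp
#include <cstddef>

// Assumed shape of A: its only content is the hidden vtable pointer.
class A
{
public:
    virtual void function() {}
};

// Ten objects, each with its own vtable pointer; all ten point to the one table for A.
A objects[10];

constexpr std::size_t sizeOfClass  = sizeof(A);           // size of the type
constexpr std::size_t sizeOfObject = sizeof(objects[0]);  // size of one object: the same
constexpr std::size_t sizeOfArray  = sizeof(objects);     // ten objects, ten pointers
```

sizeof reports the size of a single object, so it never grows with the number of objects created. The array makes the per-object cost visible: sizeOfArray is ten times sizeOfClass.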

  • HWANG SEHYUN

    One Question!

    How can virtual table store different types of functions pointers?

  • Vineet

    thanks for such a nice and clear explanation.
    was very interesting to read it.

  • DHD

    In this block of code:

    Is that meant to be "&d1" and not "&d;"?

  • Ashish Mandal

    Thanks a lot for clear and by far the best explanation i have encountered with...

  • Jingguo Yao

    Clear explanation.

  • Sudhakar

    Good Explanation. great work Alex

  • Dan

    See all those good comments, but nobody asked how this was true? At least was not explained.

    "Note that because pClass is a base pointer, it only points to the Base portion of cClass. However, also note that *__vptr is in the Base portion of the class, so pClass has access to this pointer."
    "Finally, note that pClass->__vptr points to the D1 virtual table! Consequently, even though pClass is of type Base, it still has access to D1’s virtual table"

    I thought you just mentioned that pClass can only access the *__vptr in the Base portion of the class.

    • Alex

      Because pClass is a Base, it can only directly access Base members (this includes __vptr). However, because pClass->__vptr points to D1, pClass can access the D1 virtual table through __vptr.

      • Vineet

        Hi Alex,

        I have added one new virtual function

        in D1. So D1 virtual table will have this new function. As per your explanation I should be able to access to function3() with pClass->__vptr. But that fails. I get compile error. What is explanation for that?
        Thanks.

        • Alex

          You shouldn't try to access the virtual table directly (nor should you ever need to). pClass->function3() should work fine. If you're getting an error, I'll need more code and the specific error you're getting to debug further.

          • Vineet

            code is:

            Thanks.

            • Alex

              Ah! I see what you're getting at. Given the above example, pClass->__vptr will point at D1's virtual table, which has Function3() in it (pointing to D1::Function3()). So in this particular case, this could resolve to a valid function call at runtime.

              But as you've noted, the compiler won't even let you compile it (since Function3() isn't accessible through Base). Just because something is in the virtual table doesn't mean the compiler will let you use it.

              Remember, the virtual table is a structure used at runtime to resolve function calls. But it doesn't control access -- the compiler handles whether you should or should not be able to call a function.
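A sketch of the situation in this thread, since Vineet's code is not shown. The class bodies and the helper callFunction3() are assumptions; only the names pClass and function3() come from the discussion.

```cpp
#include <string>

class Base
{
public:
    virtual std::string function1() const { return "Base::function1"; }
};

class D1 : public Base
{
public:
    std::string function1() const override { return "D1::function1"; }
    // New virtual function introduced in D1; Base knows nothing about it.
    virtual std::string function3() const { return "D1::function3"; }
};

std::string callFunction3(const Base* pClass)
{
    // pClass->function3();  // does not compile: Base has no member named function3

    // Converting to D1* first makes the name visible to the compiler.
    // dynamic_cast yields a null pointer if the object is not actually a D1.
    if (const D1* asD1 = dynamic_cast<const D1*>(pClass))
        return asD1->function3();

    return "not a D1";
}
```

The entry for function3() exists in D1's virtual table either way; the cast only changes what the compiler is willing to look up at compile time.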

  • Awesome article..only one time to read is sufficient….no more to read other article..thanks

  • sagar

    good job..!!! You are doing some really great work.Keep it up bro...!!!!God will help you out..!!!

  • Ganesh Salvi

    This is BEST explanation I found. Bookmarked this link!!! Super job and thank you!

  • kot

    Great explanation! Thank you.

  • AbdullahEsam

    can you please explain what this sentence means: "most-derived function"

    • Alex

      It means the override function that exists in the most-derived class between the base class and the class being instantiated.

      E.g. if you have class Base, and D1 inherited from Base, and D2 inherited from D1, then D2 is more derived than D1, which is more derived than Base.
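A minimal sketch of that chain. The function name() and the helper nameOf() are hypothetical; here D2 does not override name(), so the most-derived version available to a D2 object is the one from D1.

```cpp
#include <string>

class Base
{
public:
    virtual std::string name() const { return "Base"; }
};

class D1 : public Base
{
public:
    std::string name() const override { return "D1"; }
};

class D2 : public D1
{
    // No override here: D2's virtual table entry for name() points to D1::name().
};

std::string nameOf(const Base& object)
{
    return object.name();
}
```

Calling nameOf() with a D2 object returns "D1", not "Base".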

  • puppi

    how can i get these virtual function's point addresses? is there any way to get it?

  • patrick

    Very clear and concise explanation

  • Basim Alamuddin

    A clear and easy explanation, thanks.

  • Sind

    Great article. Very very clear.

  • Mukesh Modi

    Only one time to read is sufficient.... no more to read other article....awesome article..thanks

  • atuldabral

    Thanks . First time i got to know the how vp works.

  • Gane

    Well, Nice info about the VTABLE appreciated. Thanks, Please keep this going...

    Well documented with examples and illustrations.
