Search

7.10 — std::vector capacity and stack behavior

In lesson 6.16 -- An introduction to std::vector, we introduced std::vector and talked about how std::vector can be used as a dynamic array that both remembers its length and can be dynamically resized as required.

Although this is the most useful and commonly used part of std::vector, std::vector has some additional attributes and capabilities that make it useful in some other capacities as well.

Length vs capacity

Consider the following example:

We would say that this array has a length of 10, even though we’re only using 5 of the elements that we allocated.

However, what if we only wanted to iterate over the elements we’ve initialized, reserving the unused ones for future expansion? In that case, we’d need to separately track how many elements were “used” from how many elements were allocated. Unlike a built-in array or a std::array, which only remembers its length, std::vector contains two separate attributes: length and capacity. In the context of a std::vector, length is how many elements are being used in the array, whereas capacity is how many elements were allocated in memory.

Taking a look at an example from the previous lesson on std::vector:

The length is: 5
0 1 2 0 0

In the above example, we’ve used the resize() function to set the vector’s length to 5. This tells variable array that we’re intending to use the first 5 elements of the array, so it should consider those in active use. However, that leaves an interesting question: what is the capacity of this array?

We can ask the std::vector what it’s capacity is via the capacity() function:

On the authors machine, this printed:

The length is: 5
The capacity is: 5

In this case, the resize() function caused the std::vector to change both its length and capacity. Note that the capacity is guaranteed to be at least as large as the array length (but could be larger), otherwise accessing the elements at the end of the array would be outside of the allocated memory!

More length vs. capacity

Why differentiate between length and capacity? std::vector will reallocate its memory if needed, but like Melville’s Bartleby, it would prefer not to, because resizing an array is computationally expensive. Consider the following:

This produces the following:

length: 5  capacity: 5
length: 3  capacity: 5

Note that although we assigned a smaller array to our vector, it did not reallocate its memory (the capacity is still 5). It simply changed its length, so it knows that only the first 3 elements are valid at this time.

Array subscripts and at() are based on length, not capacity

The range for the subscript operator ([]) and at() function is based on the the vector’s length, not the capacity. Consider the array in the previous example, which has length 3 and capacity 5. What happens if we try to access the array element with index 4? The answer is that it fails, since 4 is greater than the length of the array.

Note that a vector will not resize itself based on a call to the subscript operator or at() function!

Stack behavior with std::vector

If the subscript operator and at() function are based on the array length, and the capacity is always at least as large as the array length, why even worry about the capacity at all? Although std::vector can be used as a dynamic array, it can also be used as a stack. To do this, we can use 3 functions that match our key stack operations:

  • push_back() pushes an element on the stack.
  • back() returns the value of the top element on the stack.
  • pop_back() pops an element off the stack.

This prints:

(cap 0 length 0)
5 (cap 1 length 1)
5 3 (cap 2 length 2)
5 3 2 (cap 3 length 3)
top: 2
5 3 (cap 3 length 2)
5 (cap 3 length 1)
(cap 3 length 0)

Unlike array subscripts or at(), the stack-based functions will resize the std::vector if necessary. In the example above, the vector gets resized 3 times (from a capacity of 0 to 1, 1 to 2, and 2 to 3).

Because resizing the vector is expensive, we can tell the vector to allocate a certain amount of capacity up front using the reserve() function:

This program prints:

(cap 5 length 0)
5 (cap 5 length 1)
5 3 (cap 5 length 2)
5 3 2 (cap 5 length 3)
top: 2
5 3 (cap 5 length 2)
5 (cap 5 length 1)
(cap 5 length 0)

We can see that the capacity was preset to 5 and didn’t change over the lifetime of the program.

7.11 -- Recursion
Index
7.9 -- The stack and the heap

60 comments to 7.10 — std::vector capacity and stack behavior

  • Dohn jose

    Hi Alex,

    I am getting an unexpected output from the below test case

    My expectation was array and vector size(memory used) should be same, but it seems different.
    why this behavior, can you please tell me in perspective of memory allocation happened for both array and vector

    • Alex

      sizeof() only returns the size of statically allocated memory. std::array uses all statically allocated memory, and has no overhead costs, so the sizeof a std::array is exactly what you’d expect. std::vector is different. First, std::vector uses dynamically allocated memory to store the elements, which is not considered when using sizeof(). Second, std::vector has overhead costs (a pointer to point at the dynamic memory, an integer to hold the size of the array, and apparently a few other things that add up to 20 bytes). Taking the sizeof a std::vector will likely report 20 on your machine regardless of how many elements are in the vector and regardless of the type of those elements.

  • Rohit

    Just like in python we have list to store values and we can mutate those, do we have anything like that in c++, in which we can store values and mutate them no matter what their index is, such as remove a particular index value or change some particular elements type?

    One more question - How to find the number of digits in an integer without creating a loop? Can we cast it to some other type and find its length? Please tell.

    • Alex

      std::vector will let you add/remove elements, though std::list might be better depending if you are adding/removing from the head or tail. In C++, there’s no easy way to mutate types -- C++’s strong type system discourages this.

      If you don’t want to use a loop to figure out how many digits are in an integer, you could convert (not cast) an integer to a string and then get the string length. That’s probably less performant than just using a loop.

      • Rohit

        how to convert an int to a string? The best one?
        I found this on google but this isn’t working.
        #include<iostream>
        #include<string>
        using namespace std;
        int main() {
            int n = 77;
            string s = to_string(n);
            cout<<s;
            return 0;
        }

      • Rohit

        In which section std::list topic is discussed?
        one more quesition:
        Suppose I have an array or a vector [1,2,3,4,5,6,7] and I need to remove 4 from it. How will I do that?and how to i find the index of a particular element let’s say 5 without using a loop?

        • Alex

          I haven’t written a tutorial on std::list yet.

          To remove an element from a vector, you’d typically use an iterator to find the value to erase, then call the erase() function. See vector::erase() for more information. It has an example of doing something similar.

  • Milos

    Alex man, if i have vector of type int and its length is 3 but capacity is 5, what is his size in bytes? 12 bytes or 20 bytes?

  • The Long

    Hi, Alex.
    I used for-each loop to print out the stack along with its length and capacity. Every thing went well until the last resize: the cap is always 1 value higher than length while they are supposed to be equal. what is wrong ?Thanks in advance.

    #include "stdafx.h"
    #include <vector>    

    void print_stack(const std::vector <int> &stack)        
    {
        for (auto number:stack) printf("%d ", number);
        printf("\ncap %d length %d\n", stack.capacity(), stack.size());
    }

    void handle_stack(std::vector <int> &stack)            
    {
        print_stack(stack);                                            
        for (int i=0; i<5; i++)
        {
            stack.push_back(5-i);                        
            print_stack(stack);                            
        }
        printf("\n");
        for (int i=0; i<3; i++)
        {
            stack.pop_back();                            
            print_stack(stack);                        
        }
    }

    int main()
    {                                
        std::vector <int> stack;                        
        handle_stack(stack);
        return 0;
    }
    this prints out
    cap 0 length 0
    5
    cap 1 length 1
    5 4
    cap 2 length 2
    5 4 3
    cap 3 length 3
    5 4 3 2
    cap 4 length 4
    5 4 3 2 1
    cap 6 length 5

    5 4 3 2
    cap 6 length 4
    5 4 3
    cap 6 length 3
    5 4
    cap 6 length 2

    • Alex

      I don’t see anything wrong here. When you perform an action that causes the vector to resize, the vector is free to pick a larger capacity than needed, for efficiency reasons.

      You can see this in your example. When your array is capacity 4, you add one element, and the capacity jumps to 6. This is the vector assuming you may want to add more elements later, and preallocating extra space now so it doesn’t have to do another resize if you add one more element.

  • SimonJoel

    Hello Alex, I write a program that
    1. takes an input from user and
    2.create a vector with value from 2 to that value,
    3.delete any member that is divided by 2
    4, print out the vector
    but the 3 part always throw an error, I am on a windows mechine, the error code is : Process returned -1073741819<0xc0000005>
    here is the code:

    is that because I try to delete value outside the vector? I tried to use ma-2, ma-3, on the for loop, by it still throw the same error.
    please help thank you.

    • Alex

      Using vector::erase() literally removes the element from the vector, which shifts all your elements beyond that point up by 1. So if your vector was originally 10 elements, after you erase 1, it is now 9 elements, then 8, then 7, etc…

      Your second for loop is trying to erase elements that are no longer inside the vector because the vector has shrunk due to previous erases.

  • Aman

    Hi Alex,
    why the capacity of the vector does not grow linearly as in the example here, where you are pushing elements 5,3,2 on the stack my compiler shows capacity growing as 0-1,1-2,2-4 ,why this is done in c++?

    • Alex

      It’s more performant. When you expand the capacity of a vector, it assumes you may want to add more than one element, but it doesn’t know how many, so it has to make an assumption. The more elements you already have in your vector, the more elements it assumes you might want to add. So rather than a linear increment, it takes an exponential approach.

      Note that the strategy used by std::vector in terms of growth isn’t defined by the C++ standard and may vary per compiler.

      • Elithrion

        To add a little, that expansion behavior guarantees amortized O(1) time for addition, so most of the time, you really don’t need to worry about reserving additional space manually.

        And if you do it poorly, you could in fact reduce performance. E.g.

        Would likely lead to worse performance than not having the reserve line if the function is called more than a few times on a single vector.

        (All this assuming the compiler actually has a competent expansion behavior.)

  • 123sak

    You forgot to add #include <iostream> on More length vs. capacity and you are using std::cout.

  • zukkus

    tbh I found using ‘array’ as the variable name of a vector a little confusing, considering its such a fundamental data structure and std class.  I recommend making all variable ‘myBankAccount’ or ‘studentGrades’ or something.  Seeing std::vector array sent me out to google for 5 minutes, i could have swore array was a keyword in java and c++, obv not.

    • Alex

      Yeah, it’s a common assumption that array is a keyword in C++ -- it’s not, because C++ only supports arrays through the standard library (std::array), not as a native part of the language.

      Same with stack -- std::stack is a part of the standard library.

  • Anddo

    Hey Alex,

    Nor sure of this, maybe I’m just assuming but I did search and got a bit confused. std::vector or std::array are using dynamically allocated memory by default ? So I don’t need to do, for example, this

    Whatever the answer, may I ask, if I did do this then how would I call the elements either assigning values (many at once) or changing a value or printing a value ?

    Lastly, as I knew (a bit early I guess) that std::vector and std::array are classes so your answer will clear things up for me to know how to dynamically allocate memory for classes and when should I do it. But for standard library classes (or any library) how do I know if I should use the code above and dynamically allocate memory or not ? Anyway other than a reference or documentation ?. Of course this question depends on the answer of the above.

    Sorry for the long questions and very great tutorials.

    • Alex

      std::vector does dynamic memory allocation internally, std::array does not.

      You can allocate a std::vector dynamically if you want (as per your example above), but more generally these container classes are allocated on the stack. This is analogous to how you can allocate a single integer dynamically if you want, but you generally wouldn’t.

      You access the elements inside these classes via operator[] or the at() function.

      So, short answer, allocate these on the stack unless there’s a specific reason not to.

      • Anddo

        Roger that. Since Vector does dynamic memory allocation internally, I might NOT to need to do as the example I mentioned. But since array does NOT then why wouldn’t I ? I mean, when exactly do I actually use dynamic allocation ?. Also, operator[] doesn’t work.

        The error was

        error C2679: binary ‘<<‘: no operator found which takes a right-hand operand of type ‘std::vector<int,std::allocator<_Ty>>’ (or there is no acceptable conversion)

        • Alex

          Oh yes, sorry, I got focused on vector and forgot about array. If you’re allocating a small array, you can do it on the stack. But larger arrays should be allocated dynamically so you don’t eat up all your stack space. How large is “large”? That’s a judgment call, and depends on how much memory your program needs for other things.

          Operator[] will work, but it needs an object to work on (not a pointer to an object). So just like you called (*ararptr).at(2), you’d need to do this: (*ararptr)[2].

          Also, just a reminder, instead of (*ararptr).at(2), you can do this: ararptr->at(2). The -> does an implicit dereference.

  • Rob G.

    Is push/popping the tail-end "LIFO" or is that not an appropriate description?

    • Alex

      Yes, as long as you are pushing and popping from the same end, you’ll get LIFO (last in first out) behavior. Doesn’t matter if it’s tail end or head end.

  • Rob G.

    Hi Alex was reviewing older material. From what I understand a vector can behave like a stack or an array. How do you load the stack with numbers versus loading vector array with numbers? Don’t thee contents appear they same?

  • Nyap

    I’m finding it hard to concentrate now that I know what the next chapter is about ._.

  • J3ANP3T3R

    The way i understand size and capacity is like a barrel of fuel or a drum of water.

    the drum capacity is for example 20 liters anything above that would just overflow.

    size here would then be the current contents of the drum for example 4 liters.

    im saying this because if i somehow expand the drum to accommodate more water it would increase the capacity of the drum. not its content.

    so i was kind of expecting that if we re-size the vector to a higher capacity it should not increase its size. C++ seems to behave differently as resizing 0 1 2 to 0 1 2 0 0 would add those last two elements as a used elements ( active elements ) which to me would not make any sense because then if a vector is created with a capacity of 5 elements and is initialized with 0 1 2 then the size of that vector should be 5 as well even before we call resize().

    i love linking programming logic to everyday life logic because it helps me learn faster when they make sense in real life. but for the sake of simplicity i will just remember that after a resize() is requested all the elements in the vector array will be considered used and thus every after resize() its constant that size = capacity. am i correct ?

    similar to this i would have thought that 10 is the capacity but the size is 5 because 5 is physically in use. but then the size is 10 and naturally the capacity is also 10 … not making much of a distinct difference. but hey if C++ says so it will be so. 🙂

    • Alex

      Increasing an array’s capacity should not increase its size -- however, increasing its size should increase its capacity.

      Is there somewhere I said or implied that increasing capacity would increase size as well? Like with your drum, if you make the drum bigger, the amount of contents inside shouldn’t change (unless the size of the drum is changing in response to you adding more contents).

      • J3ANP3T3R

        oh no no not from you i meant C++ and i think i meant size although i keep saying capacity because to me increasing the size of the array vector did not make any sense give that "size is how many elements are being used in the array, whereas capacity is how many elements were allocated." and so i got the two twisted in my head.

        why tell the computer that in 0 1 2 … 0 1 2 0 0 is in use ( even though did not use the last 2 elements). however, increasing the capacity would make sense. tell the computer that in 0 1 2 i am planning on adding more elements to the array so please increase the maximum capacity to say 7 more elements.

        instead we are given resize() which increases the number of elements in use (size) thus also increasing the capacity. so if i have a reason to resize 0 1 2 then it would then be 0 1 2 0 0 then i wanted to get the actualy size of the array it would then give 5 and i would be expecting 3 because i wasn’t using the last two elements yet.

        however if we are given the option to only increase the capacity of the array then if we have 0 1 2 and we plan on adding more elements later we can use addcapacity() (or something >_<) and it would then increase the capacity of the array to say 7 but the size would then be accurate and remain as 3… so in 0 1 2 changed to 0 1 2 0 0 0 0 it would return size as 3 and capacity as 7 …

        i guess just confused or complaining that re-sizing the array to allow for more elements later would automatically mark those new elements as used and added to the size of the array.

  • Ola Sh

    I am loving your tutorials. In your first example of std::vector, do you need to include the const declaration after the auto type? I thought this is not necessary since references are implicitly constant. They can point(refer) to only one value. Thanks.

    • Alex

      No, we don’t need the const, but making the reference const ensures we don’t use the reference to alter the original array (since we’re intending to use it for read-only purposes). Without the const, we could assign something to the element and change the underlying array.

  • Evaldas

    Hey,
    I really like this website and respect your work towards it, it has really helped me learn C++ and the tutorials are amazing!
    I have one question. Why does my cap go from 0, 1, 2 and then instantly to 4? I tried pushing in another element, and it still went like this:
    Cap 0 Size 0
    Cap 1 Size 1
    Cap 2 Size 2
    Cap 4 Size 3
    Cap 4 Size 4

    • Alex

      C++ doesn’t specify how std::vector should grow, so it’s up to the implementers to determine. Someone clearly thought it made more sense to grow non-linearly (and I agree).

  • Narasimha

    The range for the subscript operator ([]) and at() function is based on the the vector’s size, not the capacity. Consider the array in the previous example, which has size 3 and capacity 5. What happens if we try to access the array element with index 4?

    My observation is different here:
    As there is no bound check in C++ for subscript operator([]),we can still access element with index4, Output in my system is 4
    array = { 0, 1, 2, 3, 4 }; // okay, array length = 5
    New array = { 9, 8, 7 }; // okay, array length is now 3!
    Array capacity is still 5, only first 3 elements are replaced in memory and last 2 elements(3,4) remain as such in memory

    at() will internally check the size of vector before accessing a element. So, we can expect a run-time error while accessing the 4th element as size of vector is 3

    • Alex

      You _can_ access element 4 via the subscript operator, but this is considered an improper access since only elements up to the size of the vector are considered as valid elements (note I use the word valid, not allocated).

      In fact, in debug mode, Visual Studio 2015 will throw an assertion if you try to do this.

  • Sławomir Błauciak

    I think it’s also worth noting that the amount of memory being added to std::vector’s capacity isn’t always linear. It will eventually start increasing the capacity by more than 1.
    I presume it’s mostly compiler implmentation dependent, right?

  • Espen Andreas Larsen

    Wow, wouldn’t this stack functionality be perfect for a deck of cards like in the blackjack program? I’m gonna test it.

  • Mekacher Anis

    Alex thank’s for your amazing lessons
    I wanna ask :
    -is the name of the vector a pointer to memory on the heap or stack ?
    - where does the vector keep his capacity and size information as long as it’s name is just a pointer to the beginning of an array ?

    • Alex

      Nope.

      Consider the following struct:

      names is the name of the variable of type MyVector. Inside of MyVector is a member pointer that could be used to allocate memory, and an integer that can be used to hold the size. We’d access the memory allocated by this My Vector as names.ptr.

      std::vector works almost identically. When we say “std::vector names”, names refers to the std::vector object. Inside the std::vector object, there is a variable that is used for dynamically allocating memory, and another one that holds the vector’s size. However, unlike our MyVector struct, with std::vector, we can’t access the pointer and size variables directly. We have to use the provided functions (such as size()).

      Once we cover classes, you’ll better understand how std::vector is set up.

      • Mekacher Anis

        thank’s alot , and I know the basic things about OOP , so if I’m right the is a pointer to an instance of class "vector" and due to encapsulation we can’t access it only throw member functions ? and again thanks .

        • Alex

          The vector itself can be a normal object or a pointer depending on how it’s allocated:

          And yes, due to encapsulation you can’t access the internal members of vector directly -- you have to use the member functions provided.

  • Anton

    I’m a little confused here. So whereas declarations of dynamic arrays using the "new" operator are taken place in the heap, declarations using std::vector are taken place in the stack.

    Doesn’t that imply that if one wants to declare huge arrays, they better be done so with the "new" operator whereas smaller arrays better be declared with the std::vector statement? So when is it more suitable to use "new" and when is it more suitable to use constructs such as std::vector? What are the space limitations of std::vector compared to "new? Wouldn’t too big arrays declared with std::vector lead to stack overflows?

    • Alex

      Not quite right.
      Fixed arrays (e.g. int x[4]) and objects of type std::array allocate memory on the stack. Allocating huge fixed arrays or std::array objects could cause a stack overflow.
      Dynamic arrays (e.g. int *ptr = new int[4]) and objects of type std::vector allocate memory on the heap (note: the pointer or object itself is on the stack, but the memory it uses is on the heap).

      So, for large arrays, it’s fine to use vectors. In almost all cases, you should use std::vector instead of new, as std::vector will clean up after itself, and remembers its size.

  • Hello, alex. I’m not a expert of C++, but have 2 year programing experience with this phenomenal computer programing language. I read this above post and its very nice. I need your help to understand what is the use of "  std" in your above program, please help me.

    • Sean Kelly

      "std::" is telling the compiler that the function he is calling is part of the standard library.

      • Alex

        More specifically, std:: is telling the compiler that the object I am referencing lives inside a namespace named std. It is true that all of the functionality in the standard library lives inside namespace std.

  • MPreloaded

    "Unlike array subscripts or at(), the stack-based functions will resize the std::vector if necessary. In the example above, the vector gets resized 4 times (from a capacity of 0 to 1, 1 to 2, 2 to 3, and 3 to 4)."

    But in your output there was a maximum capacity of 3. Am i missing something?

  • Reaversword

    Hmmm… as dynamic memory a vector is, I have a question.

    How vector does reallocate memory, but is able to conserve their original memory address?

    For example:

    I was expecting, looking at memory address of a "reserved" vector (with a variation in their capacity), to find a different memory address (as it is randomly assigned), but vector is able to maintain it, so my theory is: "reserve new memory on the heap, copy vector elements, delete original vector" can’t be true except if the name of the vector is a pointer to a pointer that point to the real first element memory address of the vector. And the cout revealing the first element address appears to confirm my theory.

    When you say that resize vector is expensive, what happens exactly behind the scenes? Am I right?.

    2 question in 1 comment!. Sorry!.

    By the way, if I’m not wrong, when we change the capacity of a vector, pointers pointing to vector elements would become dangling pointers, isn’t it?. Or vector is even able to manage something like that (it would surprise me)?.

    • Alex

      Consider a pointer and some made up memory addresses:

      When we dynamically allocate memory, the address of that memory (0x0022222) is assigned to the pointer. But the address of the pointer itself doesn’t change (&ptr still is 0x00111111). std::vector works similarly in that &newVector will always return the address of the vector variable itself, not the dynamic memory it points to.

      When I say that resizing a vector is expensive, here’s what happens when we resize a dynamic array:
      * We need to allocate a new array of the correct size (relatively inexpensive)
      * We then need to copy all of the elements from the old array to the new array (this can take a lot of time)
      * We then need to delete the old array (usually inexpensive unless the object being deleted has special cleanup requirements)

      The step of copying the elements from the old array to the new array is typically expensive from a performance perspective, particularly if the array is large.

      And yes, if you change the capacity of a vector, all pointers pointing to vector elements would become dangling, because those elements are getting copied to a new location when the vector is resized. Very good use of deductive reasoning.

  • Naga Pushkal

    Hi Alex,

    I am bit confused with inserting elements into vector array using initializer list and stack functions(push_back() function). What is the difference between these two?
    And also I am getting following compiler error if I try to initialize vector array using initializer list.

    error C2552: ‘array2’ : non-aggregates cannot be initialized with initializer list
    1>          ‘std::vector<_Ty>’ : Types with a base are not aggregate
    1>          with
    1>          [
    1>              _Ty=int
    1>          ]

    I am using VS2010.

    • Alex

      Initializer lists are used to initialization or assign the vector into a particular state (e.g. you’re specifying all of the values in the vector at once). Stack functions allow you to push and pop values as you desire. Note that these aren’t exclusive: you can use an initializer list to set the vector into an initial state, and then pop the values off one-by-one (or push others on).

      VS2010 doesn’t have much support for initializer lists. Upgrade to 2013 or newer and they will work better.

Leave a Comment

Put C++ code inside [code][/code] tags to use the syntax highlighter