Search

7.10 — std::vector capacity and stack behavior

In lesson 6.16 -- An introduction to std::vector, we introduced std::vector and talked about how std::vector can be used as a dynamic array that both remembers its length and can be dynamically resized as required.

Although this is the most useful and commonly used part of std::vector, std::vector has some additional attributes and capabilities that make it useful in some other capacities as well.

Length vs capacity

Consider the following example:

We would say that this array has a length of 10, even though we’re only using 5 of the elements that we allocated.

However, what if we only wanted to iterate over the elements we’ve initialized, reserving the unused ones for future expansion? In that case, we’d need to separately track how many elements were “used” from how many elements were allocated. Unlike a built-in array or a std::array, which only remembers its length, std::vector contains two separate attributes: length and capacity. In the context of a std::vector, length is how many elements are being used in the array, whereas capacity is how many elements were allocated in memory.

Taking a look at an example from the previous lesson on std::vector:

The length is: 5
0 1 2 0 0

In the above example, we’ve used the resize() function to set the vector’s length to 5. This tells variable array that we’re intending to use the first 5 elements of the array, so it should consider those in active use. However, that leaves an interesting question: what is the capacity of this array?

We can ask the std::vector what its capacity is via the capacity() function:

On the authors machine, this printed:

The length is: 5
The capacity is: 5

In this case, the resize() function caused the std::vector to change both its length and capacity. Note that the capacity is guaranteed to be at least as large as the array length (but could be larger), otherwise accessing the elements at the end of the array would be outside of the allocated memory!

More length vs. capacity

Why differentiate between length and capacity? std::vector will reallocate its memory if needed, but like Melville’s Bartleby, it would prefer not to, because resizing an array is computationally expensive. Consider the following:

This produces the following:

length: 5  capacity: 5
length: 3  capacity: 5

Note that although we assigned a smaller array to our vector, it did not reallocate its memory (the capacity is still 5). It simply changed its length, so it knows that only the first 3 elements are valid at this time.

Array subscripts and at() are based on length, not capacity

The range for the subscript operator ([]) and at() function is based on the the vector’s length, not the capacity. Consider the array in the previous example, which has length 3 and capacity 5. What happens if we try to access the array element with index 4? The answer is that it fails, since 4 is greater than the length of the array.

Note that a vector will not resize itself based on a call to the subscript operator or at() function!

Stack behavior with std::vector

If the subscript operator and at() function are based on the array length, and the capacity is always at least as large as the array length, why even worry about the capacity at all? Although std::vector can be used as a dynamic array, it can also be used as a stack. To do this, we can use 3 functions that match our key stack operations:

  • push_back() pushes an element on the stack.
  • back() returns the value of the top element on the stack.
  • pop_back() pops an element off the stack.

This prints:

(cap 0 length 0)
5 (cap 1 length 1)
5 3 (cap 2 length 2)
5 3 2 (cap 3 length 3)
top: 2
5 3 (cap 3 length 2)
5 (cap 3 length 1)
(cap 3 length 0)

Unlike array subscripts or at(), the stack-based functions will resize the std::vector if necessary. In the example above, the vector gets resized 3 times (from a capacity of 0 to 1, 1 to 2, and 2 to 3).

Because resizing the vector is expensive, we can tell the vector to allocate a certain amount of capacity up front using the reserve() function:

This program prints:

(cap 5 length 0)
5 (cap 5 length 1)
5 3 (cap 5 length 2)
5 3 2 (cap 5 length 3)
top: 2
5 3 (cap 5 length 2)
5 (cap 5 length 1)
(cap 5 length 0)

We can see that the capacity was preset to 5 and didn’t change over the lifetime of the program.

Vectors may allocate extra capacity

When a vector is resized, the vector may allocate more capacity than is needed. This is done to provide some “breathing room” for additional elements, to minimize the number of resize operations needed. Let’s take a look at this:

On the author’s machine, this prints:

size: 5  cap: 5
size: 6  cap: 7

When we used push_back() to add a new element, our vector only needed room for 6 elements, but allocated room for 7. This was done so that if we were to push_back() another element, it wouldn’t need to resize immediately.

If, when, and how much additional capacity is allocated is left up to the compiler implementer.

7.11 -- Recursion
Index
7.9 -- The stack and the heap

116 comments to 7.10 — std::vector capacity and stack behavior

  • Jonathan

    Just want to share my code to delete vector element. Correct me if there are any suggestions!!

    • - Limit your lines to 80 characters in length for better readability on small displays.
      - Don't pass 32767 to @std::cin.ignore. Pass @std::numeric_limits<std::streamsize>::max().
      - You're using different iterator names everywhere. Variables are local, you can and should re-use names.
      - Pass fundamental types (int) by value.
      - Line 21+: `tempVector[tempVector.size()]` doesn't exist. This causes undefined behavior.
      - Inconsistent formatting. Use your editor's auto-formatting feature.
      - Line 69 has no effect. All elements are default-initialized.
      - Line 78, 83: Duplicate comparison.
      - Line 83: `return (yn == 'y')`.
      - Line 100: Should be a do-while loop.
      - If your program prints anything, the last thing it prints should be a line feed.
      - Lots of signed/unsigned comparisons and conversions.
      - You're not resizing the vector. This is a waste of memory and now you vector only works in combination with `length`.
      - Line 80+: Should be a do-while loop.

      Your idea is somewhat correct, but the implementation is rather rough.
      For one, you could've used `std::vector::erase`. Since this replaces your entire code, I suppose you either didn't know about it or wanted to try implementing it yourself.
      There's no need to create a new vector. All you need to do is shift every element after the one that's being deleted to the left.

      Here's a version of your program I wrote. Try to understand it. If you don't understand something or why I did something, please ask.
      I used `std::vector::empty` and `std::vector::resize`. C++ is well documented, you can look up classes on cppreference ( https://en.cppreference.com/w/cpp/container/vector ) to see what they can do and how to use them.

      • Jonathan

        Thank you so much for writing those codes, I understand most of the code but I have a few questions:
        1. Why don't we just write vecInput[i] in line 45?
        2. How is type "std::vector<int>::size_type" different from type "unsigned int" ?
        3. line 108 shouldn't it be

        instead? If length = 3 which holds 3 elements (0, 1, 2) the threshold "1 <= length <= 4" which passes to "readInput()"accept iDelete (0, 1, 2, 3)

        I also changed my code base on your code!!

        • 1/2.
          `std::vector::operator[]` wants an `std::vector::size_type` ( https://en.cppreference.com/w/cpp/container/vector/operator_at ).
          The only thing we know about `std::vector::size_type` is that it's an unsigned integer type ( https://en.cppreference.com/w/cpp/container/vector ). We don't know which type exactly. In your standard implementation, it might be an `unsigned int`, but that's not what it is everywhere.

          3.
          Yes it should, well spotted!

          > I also changed my code base on your code!!
          - Line 45: That's a nice idea.
          - Line 13: This can (very likely) cause an integer underflow. Wrap line 15-20 in an `else` block instead.
          - Line 52: Should be `char yn{}`, because you're not using the initial value.
          - You can use `shouldContinue()` as the loop's condition, then you don't need `cont`.

          Looks good now, keep on posting your solutions :-)

  • Nguyen

    Hi nascardriver,

    This lesson seems tough for me.  I was wondering if I can safely skip it for now and move to the next lessons or chapter 8?

    Thanks, Have a great day

    • You can skip it for now (and continue at 7.11), but return to it later.

    • Matt

      Nguyen,

      For me it is helpful to think of the std::vector like a bucket of water.

      capacity() tells you how big the bucket is (thus, how much water it can possibly hold)
      size() tells you how much water is currently in the bucket
      resize() tells the bucket that there is more water in it, and so the bucket will automatically get larger so it can hold all that extra water (maybe some extra)
      push_back() adds some water to the bucket, if the bucket is full it will automatically resize() the bucket to make room
      back() takes a look in the bucket and sees what the water at the top looks like, but it doesn't pour the water out
      pop_back() pours some water out of the bucket, the bucket will not necessarily get smaller just because we poured water out, though

      I'm not an expert but this is the way my brain works, hope it helps!
      -Matt

  • Lakshya Malhotra

    Hi,
    So according to the discussion on the topic above: "More length vs. capacity", it was discussed that "Note that although we assigned a smaller array to our vector, it did not reallocate its memory (the capacity is still 5). It simply changed its length, so it knows that only the first 3 elements are valid at this time."
    But when I try to do something like this:

    I would assume that this line

    would give me errors becuase the compiler knows only the first three elements are valid this time. But I am getting output :
    ```
    length: 5   capacity: 5
    length: 3   capacity: 5
    2nd element: 7
    4th element: 3
    9
    7
    8
    ```
    Can you explain what is going on here? Thanks in advance!!

    • @std::vector::operator[] doesn't perform bounds checks. As long as you don't leave the capacity, there shouldn't be any errors.
      If you don't trust yourself with keeping track of the indexes and length, you can use @std::vector::at, which does perform a bounds check.

  • NXPY

    Hi Alex !

    I just want to point out that you forgot to include the iostream header file in the last code .

    Also in this code :

    The capacity changes to 10 when I resize it .
    The capacity or the length of the vector v does not change when I access v[11] . So how is it stored ? Also if I try to access v[12] the program stops responding even though it had no problem with v[11]. Why is this so ?

    • > The capacity changes to 10 when I resize it
      @std::vector::push_back can increase the capacity as much as is likes, there's no standard.

      > The capacity or the length of the vector v does not change when I access v[11]
      Access operations on vectors don't create new elements. @std::vector::operator[] doesn't perform any bound checks on the given index. You're accessing memory which you don't own, causing undefined behavior. Both with v[11] and v[12].

  • Aqua

    Thank you for these tutorials and explanations. std::vector<> is such an incredible tool. My only question would be: If needed is it possible to, more efficient, if we resized the capacity ourselves via reserve() if we hit the maximum capacity? So far I understand vector is dynamic. Resizing a vector is costly. vector has stack like qualities? Although since a vector is dynamic it is allocated on the heap? hmmm. I guess those are also questions to better my understanding. new is the only call to allocate memory on the heap and everything else uses the stack?

    • Hi!

      Use of heap and stack isn't standardized. However, it's common for everything allocated dynamically (eg. using @new or @malloc), to be allocated on the heap and everything else on the stack.

      If you know beforehand how big your vector is going to be, you should reserve space.
      If you don't know how big the vector is going to be, there's no benefit in resizing it manually.

  • jasonguo

    So, the capacity is basically the largest length the stack had? Is it not possible to shorten the capacity?

    • Alex

      It is possible to shorten the capacity. Maybe the best way to do so is via the shrink_to_fit function (introduced in C++11), which is a non-binding (meaning it may or may not work) method to reduce the capacity down to the size.

  • Ahmed

    If you let me, What's the point of using std::vector with the stack while I can use built-in arrays or std::array?
    Is it just for performance reasons? that it doesn't resize (or I can control the capacity)which it is expensive for the CPU.

  • Hi,

    The stack commands - pop(), push(), peek() - sounded awfully familiar.  pop and push, I suppose, are similar to their assembly language counterparts but peek took a little more thought.  I don't suppose it is related in any way to the PEEK command used in early BASIC (home computers, UNIX systems) from the 80's?  This, however, would then require a POKE to insert a character into a specific memory address?  It was a long time ago since I played with BASIC (Commodore Amiga I seem to recall).

    • Hi Nigel!

      push and pop in assembly are used to control the stack, so yes, they should sound familiar. I don't know BASIC, but from your description PEEK sounds like a function to access random memory, whereas here it's used to read the top of the stack without popping.

  • Liam

    Why is it that some functions are formatted like "array.function()" whereas others are formatted like "function(array)" ?

    I know that the answer to this question is written somewhere in these tutorials, but I can't find it.

  • Peter Baum

    It was only later in the lessons that I realized that there is something fundamental that I didn't understand at this point.  std::array was introduced and that used the stack and then "new" was used and that used the heap.  But suppose I want to use std::array and make sure it uses the heap.  How would I do that?  An example that does this and allows access from a function would be very helpful.

    Also I saw a comment at stackoverflow awhile back about the stack and heap in which someone said that from the point of view of CPP there is no stack or heap.  In trying to find that discussion (which I didn't) I ran across https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap which is very good, and learned that individual threads get their own stack.  In that discussion (down a ways) I found "you are free to implement a compiler that doesn't even use a stack or a heap, but instead some other storage mechanisms." and that implies the disconnect between CPP and these storage areas/techniques.  In any case, as a practical matter, IMHO in these lessons, you really can't easily and should not in any case, separate CPP from important OS considerations like this.

    • nascardriver

      Hi Peter!

      > But suppose I want to use std::array and make sure it uses the heap.  How would I do that?
      You pretty much answered your own question in your quote "you are free to implement a compiler that doesn't even use a stack or a heap, but instead some other storage mechanisms.".
      The standard doesn't require compiler developers to use the stack for X and the heap for Y, they're free to do whatever they want.
      If you need control over heap and stack you'll have to use compiler/OS/CPU specific solutions.

    • Alex

      If you want std::array to use the heap, then you need to dynamically allocate it on the heap in the first place:

      However, I'd advise against this, as using std::vector is a better choice in such a case (as std::vector will clean itself up, whereas std::array on the heap relies on you to remember to do so).

      • Peter Baum

        Hi Alex,
        I tried something like that and ran into a problem with Visual Studio Community 2017.  Maybe I'm just doing something stupid, but this was the code:

        The at() function worked but not the usual array element access method with square brackets.

        Also, I was avoiding vector because of the behind the scenes garbage collection and I was, after all, trying to do timing.

        • nascardriver

          @array is a pointer, you can't use pointer operators like that.

          • Peter Baum

            Hi nascardriver,

            I certainly hadn't tried that!!! Where do you come up with this stuff!?!?  Did I miss a lesson along the way?

            I am still confused though, because we saw in a previous lesson that we could do

            and here "array" is also a pointer.  So what is the difference?

            • nascardriver

              > Where do you come up with this stuff!?!?  Did I miss a lesson along the way?
              I've been coding for a couple of years, learncpp doesn't cover every tiny piece of cpp.

              > So what is the difference?
              All pointers have the operator[], but it treats the underlying data as an array.

              Let 'a' be an int pointer (or array), 'b' an integer.
              The following rules apply (The casts make it look more complicated than it is)

              Applying this to your code

              • Peter Baum

                Hi nascardriver,

                I followed what you said and it is helpful with respect to your alternative ways of accessing the array element.  However, what I don't understand is why, in my original post, you get a compiler error with

                but in the printit() routine it is valid to ask for cout to output element array[2].

                • nascardriver

                  @array in @printit is an @std::array, @array in @main is an @std::array pointer.

                  std::array::operator[] returns an int, but
                  std::array pointer operator[] returns an std::array.

                • Peter Baum

                  Ahah... I now see what my problem is... we were taught that an array parameter would not be copied but passed as a pointer.  Apparently this is not what happens with a standard array.

                  What an annoying inconsistency on the part of CPP.  I could see adding functions like .at() that provide a special service, but why would the designers of CPP make the array aspect of standard array behave so differently from ordinary arrays?

                • nascardriver

                  std::array isn't a C-style array so it shouldn't behave like one, it should behave like a custom data type, because that's what it is.

  • Mireska

    When I try to access a vector array with an index smaller than it's capacity but larger than its length, I get a runtime error. I know it's bad procedure, but I'd like to understand what's going on; what happens to those indices when the length is shortened? They're in its capacity, so I would assume the vector still has 'control' over them'; I expected undefined behaviour, or maybe it those indices returning the value they once had.

    So my question: Why exactly does trying that make it crash? Does the compiler 'limit/restrain' itself from accessing that, even though it technically could, because it knows it's unintended (that's my guess as to why it gives an 'assertion' error)

    Many thanks!

    • nascardriver

      Hi Mireska!

      > Why exactly does trying that make it crash?
      It doesn't crash, it's a controlled termination to let you know you're doing something wrong.

      > Does the compiler 'limit/restrain' itself from accessing that, even though it technically could, because it knows it's unintended
      Yes

      Assertions are covered in Lesson 7.12a

Leave a Comment

Put all code inside code tags: [code]your code here[/code]