17.3 — std::string length and capacity

Once you’ve created strings, it’s often useful to know how long they are. This is where length and capacity operations come into play. We’ll also discuss various ways to convert std::string back into C-style strings, so you can use them with functions that expect strings of type char*.

Length of a string

The length of the string is quite simple -- it’s the number of characters in the string. There are two identical functions for determining string length:

size_type string::length() const
size_type string::size() const

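  • Both of these functions return the current number of characters in the string, excluding the null terminator. Each char counts as one character, so text stored in a multi-byte encoding such as UTF-8 can report a larger count than the number of characters displayed.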

Sample code:

Output:

9

Although it’s possible to use length() to determine whether a string has any characters or not, it’s clearer to use the empty() function, which states the intent directly:

bool string::empty() const

  • Returns true if the string has no characters, false otherwise.

Sample code:

Output:

false
true

There is one more size-related function that you will probably never use, but we’ll include it here for completeness:

size_type string::max_size() const

  • Returns the maximum number of characters that a string is allowed to have.
  • This value will vary depending on operating system and system architecture.

Sample code:

Output:

4294967294

Capacity of a string

The capacity of a string reflects how much memory the string allocated to hold its contents. This value is measured in characters, excluding the null terminator. For example, a string with capacity 8 could hold 8 characters.

size_type string::capacity() const

  • Returns the number of characters a string can hold without reallocation.

Sample code:

Output:

Length: 8
Capacity: 15

Note that the capacity is higher than the length of the string! Although our string was length 8, the string actually allocated enough memory for 15 characters! Why was this done?

The important thing to recognize here is that if a user wants to put more characters into a string than the string has capacity for, the string has to be reallocated to a larger capacity. For example, if a string had both length and capacity of 8, then adding any characters to the string would force a reallocation. By making the capacity larger than the length, the string gives the user some buffer room to expand it before a reallocation needs to be done.

As it turns out, reallocation is bad for several reasons:

First, reallocating a string is comparatively expensive. New memory has to be allocated, then each character in the string has to be copied to the new memory (which can take a long time if the string is big), and finally the old memory has to be deallocated. If you are doing many reallocations, this process can slow your program down significantly.

Second, whenever a string is reallocated, the contents of the string move to a new memory address. This means all references, pointers, and iterators to the string’s characters become invalid!

Note that it’s not always the case that strings will be allocated with capacity greater than length. Consider the following program:

This program outputs:

Length: 15
Capacity: 15

(Results may vary depending on the compiler and standard library.)

Let’s add one character to the string and watch the capacity change:

This produces the result:

Length: 15
Capacity: 15
Length: 16
Capacity: 31

Finally, the reserve() function lets you request a capacity explicitly:

void string::reserve()
void string::reserve(size_type unSize)

  • The second flavor of this function sets the capacity of the string to at least unSize (it can be greater). Note that this may require a reallocation to occur.
  • Before C++20, if the first flavor of the function is called, or the second flavor is called with unSize less than the current capacity, the function will try to shrink the capacity to match the length. This is a non-binding request.
  • Since C++20, the first flavor is deprecated and the second flavor never reduces the capacity. Use shrink_to_fit() to make the (still non-binding) shrink request instead.

Sample code:

Output:

Length: 8
Capacity: 15
Length: 8
Capacity: 207
Length: 8
Capacity: 207

This example shows two interesting things. First, although we requested a capacity of 200, we actually got a capacity of 207. The capacity is always guaranteed to be at least as large as your request, but may be larger. Second, we requested that the capacity shrink to fit the string. This request was ignored, as the capacity did not change.

If you know in advance that you’re going to be constructing a large string by doing lots of string operations that will add to the size of the string, you can avoid having the string reallocated multiple times by immediately setting the string to its final capacity:

The result of this program will change each time, but here’s the output from one execution:

wzpzujwuaokbakgijqdawvzjqlgcipiiuuxhyfkdppxpyycvytvyxwqsbtielxpy

Rather than having to reallocate sString multiple times, we set the capacity once and then fill the string up. This can make a huge difference in performance when constructing large strings via concatenation.


8 comments to 17.3 — std::string length and capacity

  • Lukas Linhart

    Hello Alex,

    I have been playing with string::size() and string::length() functions a bit and they both return number of bytes the string occupies, not the number of characters (g++ 4.6.3). I have discovered this only because our (Czech) alphabet has all kinds of weird characters, which take two bytes each. Anyway, GREAT tutorial, have learned a lot from it, thanks!!!

    Lukas

  • Matt

    Typo ("it’s" should be "its"): "…by immediately setting the string to it’s final capacity:"

  • Sheeplie

    Hello. It’s probably really simple, but what is the syntax for

    ? I don’t understand the size_type or const being before and after the member function.

    Also, is there any way to make an account on this website? I see a lot of people with profile pictures.

    • Alex

      Yup, really easy. size_type is the return type of the function. What is a size_type? It’s an integer typedef that is set to ensure that the length() function can return an integer of the appropriate size so overflow doesn’t occur if length() is really large. The const after the function means the function itself is const, and guarantees that it won’t change the members of the class. This also allows it to be called on const objects of class string.

      • Sheeplie

        Ah, I see. Thanks for refreshing me on using const after functions. Does size_type itself adjust when needed, or does it simply start off large?

        • Alex

          It’s a typedef, so it’s defined in some #include file somewhere, and fixed at compile time. So it starts off large.

          I’ve turned off accounts on this website, since bots were making tons of spam accounts. If you want an avatar, you can do so by setting up a gravatar (see gravatar.com).

    • Ola Sh

      size_type is a type definition that resolves to an unsigned integral type, like unsigned int. The const declaration indicates that the function does not change the value of member variables, in this case the value of the string. Declaring a member function as const makes the function accessible to const objects of the class (since the function would not attempt to change the member variables of such const objects).

      I don’t know how to create accounts on this site, but Alex can help you with that.
