Once you’ve created strings, it’s often useful to know how long they are. This is where length and capacity operations come into play. We’ll also discuss various ways to convert std::string back into C-style strings, so you can use them with functions that expect strings of type char*.
Length of a string
The length of the string is quite simple -- it’s the number of characters in the string. There are two identical functions for determining string length:
size_type string::length() const
size_type string::size() const
Sample code:
Output: 9
Although it’s possible to use length() to determine whether a string has any characters or not, it’s clearer (and never slower) to use the empty() function, which states the intent directly:
bool string::empty() const
Sample code:
Output:
false
true
There is one more size-related function that you will probably never use, but we’ll include it here for completeness:
size_type string::max_size() const
Sample code:
Output: 4294967294
Capacity of a string
The capacity of a string reflects how much memory the string has allocated to hold its contents. This value is measured in characters, excluding the null terminator. For example, a string with capacity 8 could hold 8 characters.
size_type string::capacity() const
Sample code:
Output:
Length: 8
Capacity: 15
Note that the capacity is higher than the length of the string! Although our string was length 8, the string actually allocated enough memory for 15 characters! Why was this done?
The important thing to recognize here is that if a user wants to put more characters into a string than the string has capacity for, the string has to be reallocated to a larger capacity. For example, if a string had both length and capacity of 8, then adding any characters to the string would force a reallocation. By making the capacity larger than the actual string, this gives the user some buffer room to expand the string before reallocation needs to be done.
As it turns out, reallocation is bad for several reasons:
First, reallocating a string is comparatively expensive. New memory has to be allocated, then each character in the string has to be copied to the new memory (which can take a long time if the string is big), and finally the old memory has to be deallocated. If you are doing many reallocations, this process can slow your program down significantly.
Second, whenever a string is reallocated, the contents of the string move to a new memory address. This means all references, pointers, and iterators to the string’s characters become invalid! (References and pointers to the std::string object itself remain valid.)
Note that it’s not always the case that strings will be allocated with capacity greater than length. Consider the following program:
    string sString("0123456789abcde");
    cout << "Length: " << sString.length() << endl;
    cout << "Capacity: " << sString.capacity() << endl;
This program outputs:
Length: 15
Capacity: 15
(Results may vary depending on compiler).
Let’s add one character to the string and watch the capacity change:
    string sString("0123456789abcde");
    cout << "Length: " << sString.length() << endl;
    cout << "Capacity: " << sString.capacity() << endl;

    // Now add a new character
    sString += "f";
    cout << "Length: " << sString.length() << endl;
    cout << "Capacity: " << sString.capacity() << endl;
This produces the result:
Length: 15
Capacity: 15
Length: 16
Capacity: 31
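void string::reserve()
void string::reserve(size_type unSize)
The second form asks the string to make its capacity at least unSize, which may require a reallocation. The first form, called with no argument, asks the string to shrink its capacity to fit its contents; the implementation is free to ignore that request.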
Sample code:
Output:
Length: 8
Capacity: 15
Length: 8
Capacity: 207
Length: 8
Capacity: 207
This example shows two interesting things. First, although we requested a capacity of 200, we actually got a capacity of 207. The capacity is always guaranteed to be at least as large as your request, but may be larger. We then requested the capacity change to fit the string. This request was ignored, as the capacity did not change.
If you know in advance that you’re going to be constructing a large string by doing lots of string operations that will add to the size of the string, you can avoid having the string reallocated multiple times by immediately setting the string to its final capacity:
    #include <iostream>
    #include <string>
    #include <cstdlib> // for rand() and srand()
    #include <ctime> // for time()

    using namespace std;

    int main()
    {
        std::srand(std::time(nullptr)); // seed random number generator

        string sString{}; // length 0
        sString.reserve(64); // reserve 64 characters

        // Fill string up with random lower case characters
        for (int nCount{ 0 }; nCount < 64; ++nCount)
            sString += 'a' + std::rand() % 26;

        cout << sString;
    }
The result of this program will change each time, but here’s the output from one execution:
wzpzujwuaokbakgijqdawvzjqlgcipiiuuxhyfkdppxpyycvytvyxwqsbtielxpy
Rather than having to reallocate sString multiple times, we set the capacity once and then fill the string up. This can make a huge difference in performance when constructing large strings via concatenation.
1. Feedback on pedagogy
This method of teaching seems much better as you explain each member function with an example.
I can just copy-paste each part into 1 main function.
2. Feedback on endl
Should we use endl or '\n'?
3. std::srand(std::time(nullptr));
It will return error C4244 conversion from 'time_t' to 'unsigned int'.
Hello Alex and nascardriver,
Could you add std:: prefixes before cout, string and so on?
And why do you use std::endl here instead of '\n'?
Hubert
Hi
this lesson doesn't add any value to the tutorials, I've marked it for being removed.
Decided to find out whether or not there is a common ratio for string growth. I was correct!
Why, when using resize(50), even though name is empty, am I getting this result:
"name is not empty"
`resize` appends characters to the string to get to the new size. All those characters are null characters ('\0'), so you don't see anything when you print the name.
Thanks nascardriver for all
your efforts
What does exactly size_type do? Is it an unsigned int? Is this the standard of C++ library?
`size_type` is used as a generic type for anything with a size in the standard library. Its type isn't the same for all containers. For `std::string`, it's `std::size_t` (which is an unsigned integer, but not necessarily an `unsigned int`).
I see. Thanks.
"...but not necessarily an 'unsigned int'"
Do you mean to say there's some difference between 'unsigned integer' and 'unsigned int'?
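Yes, they're different. `unsigned int` is one specific type, whereas an unsigned integer is any integer type that is unsigned, e.g. `unsigned char`, `unsigned short`, `unsigned int`, `unsigned long`, or `unsigned long long`.
`std::size_t` is typically one of those, but which one depends on the implementation.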
Oh it just slipped off my mind! Thanks for reminding!
My understanding is that if you request a larger capacity, then the run-time will try to increase the size of the memory block which may or may not succeed depending on whether the next memory is free or being used. If it is free, then the increase works without reallocation. If it is not free or not enough of it is free, then a reallocation is needed. If you create a string and the next memory operation is to increase the size of the string, there is a very good chance that the increase can be made into free memory, whereas if you create several strings, then try increasing the size of one of the first strings, that increase will trigger a reallocation.
Trevor
Not quite. When you create a vector, it might reserve more space than required. Say you create a vector with 10 elements, the vector will actually allocate memory for 16 (Varies depending on the implementation) elements. When you go to push the next element, the memory has already been allocated beforehand, so no reallocation is necessary. When you get to insert the 17th element, the vector has to reallocate. Then it will allocate memory for eg. 32 elements, copy the 16 elements over, and insert the 17th.
Another approach, one that is common for strings, is to allocate the exact amount of memory needed (Because it's rare that a string grows). When you append something, a reallocation happens. When you remove something from the string, no reallocation is made. The original memory is still reserved, but not used. When you append something after you shortened the string, that memory can be reused without a reallocation (Unless the string after appending is too long).
EDIT: I thought I received a notification for your comment and didn't look at the post date. I'll leave this reply for other people.
Hello and thanks for the lesson!
I'm wondering about reserve(). You say that if you use this function without entering a parameter you ask the compiler to shrink the capacity to match the length of the string, but the compiler may choose to ignore that request. Why does the compiler sometimes ignore the request? Is there any way to predict in which situations the compiler will ignore the request and in which situations it will honor the request? I'm a bit confused that the compiler is given free rein to do what it wants in a seemingly random way.
I presume it's to give the internal implementation some leeway in how it deals with memory allocation or optimization.
I don't think there's any way to predict what it will do.
Classes such as these are designed to make generalized tradeoffs between time and space. If you require absolute control over one or the other, you might be better off writing your own class in that case.
Each memory block that is allocated also has a small header that holds the housekeeping information for the allocation software. This might be 8-16 bytes. Normally the starting address of these headers will be aligned to multiples of their size, and the actual amount of memory allocated will be a multiple of this header size too. Alex's machine would appear to allocate memory in multiples of 16 bytes. (Note that the amount of memory allocated for the string has to include the terminating zero, so the 8 character string needs 9 bytes rounded up to 16, resulting in a capacity of 15 after the space for the terminating zero is allowed for.)
However I have no explanation for why Jan's machine came up with a capacity of 200.
One of the design considerations for memory management is trying to avoid too much fragmentation of the memory, and I suspect this is why Alex's machine didn't shrink the capacity of the string down from 207 to 15.
Trevor
"Second, whenever a string is reallocated, the contents of the string change to a new memory address. This means all references, pointers, and iterators to the string become invalid!"
I tried this but cannot make the string reallocate.
Hi Pointer!
Thanks nascardriver!
I also tested it with reference and iterator and the iterator indeed became invalid.
"Second, whenever a string is reallocated, the contents of the string change to a new memory address. This means all references, pointers, and iterators to the string become invalid!"
should be changed to
"Second, whenever a string is reallocated, the contents of the string change to a new memory address. This means all iterators to the string become invalid!".
Pointers and references to the string object seem to remain valid. Iterators point to the actual C-string and become invalid when the string is reallocated.
Alex, I've tried to run this code:
string sString("01234567");
cout << "Length: " << sString.length() << endl;
cout << "Capacity: " << sString.capacity() << endl;
sString.reserve(200);
cout << "Length: " << sString.length() << endl;
cout << "Capacity: " << sString.capacity() << endl;
sString.reserve();
cout << "Length: " << sString.length() << endl;
cout << "Capacity: " << sString.capacity() << endl;
and I got this:
Length: 8
Capacity: 15
Length: 8
Capacity: 200
Length: 8
Capacity: 15
instead of yours (see the last line Capacity 207):
Length: 8
Capacity: 15
Length: 8
Capacity: 207
Length: 8
Capacity: 207
I presume by doing sString.reserve(); you actually tried to lower the capacity to 0. But since there is some content in the string object still present ("01234567" in this case) the actual size reserved gets to 15, which you would get by default (when simply storing "01234567" in the first place).
Am I right or am I left ?
As the lesson indicates, calling reserve() with no parameter tries to shrink the capacity to fit the contents. So this was a request to fit the capacity to the string "01234567". However, the compiler is free to do whatever it wants with this request. It could shrink the capacity to 8, to 15, or ignore the request and leave it at 200 or 207 or whatever it was. In my case, the compiler ignores the request. In your case, the compiler honors your request but opts to leave some extra capacity for future appending.
Hi Alex,
I tried something like this :-
In this snippet, the length and capacity are both 15 at first. No problem! But when I add a new character ('f') to the string, the capacity changes, which means a reallocation, yes?
So, according to your lesson text, shouldn't the memory address of the string change?
Waiting for your reply.....
Thanks in Advance :)
Yes, if the capacity changes, that means a reallocation happened.
std::string is a class, and memory is allocated via one of the class members. The address held by that pointer member changes when there is a reallocation. The address for the string as a whole does not change.
Hi Alex,
In your last program, line 17 you have:
sString += 'a' + rand() % 26;
I don't understand this line. For instance if I write:
cout << 'a' + 5 << endl;
I get the (numerical) answer of 102 which is the value of ascii a + 5.
How do you get a char value instead of a numerical value.
Thanx Jason.
The expression 'a' + 5 evaluates to the integer 102, which std::cout dutifully prints as an integer.
However, std::string's operator+= doesn't have an overload that takes an int, so sString.operator+=(102) can't match one directly. It does have an overload that takes a char, and not finding a better match, the compiler implicitly converts the int to char and chooses that one. So 102 is treated as an ASCII code, and the letter 'f' is added to the string.
Thanks Alex for this explanation. I had the same query.
thanks for this. I had the same query.
Hello. It's probably really simple, but what is the syntax for
? I don't understand the size_type or const being before and after the member function.
Also, is there any way to make an account on this website? I see a lot of people with profile pictures.
Yup, really easy. size_type is the return type of the function. What is a size_type? It's an integer typedef that is set to ensure that the length() function can return an integer of the appropriate size so overflow doesn't occur if length() is really large. The const after the function means the function itself is const, and guarantees that it won't change the members of the class. This also allows it to be called on const objects of class string.
Ah, I see. Thanks for refreshing me on using const after functions. Does size_type itself adjust when needed, or does it simply start off large?
It's a typedef, so it's defined in some #include file somewhere, and fixed at compile time. So it starts off large.
I've turned off accounts on this website, since bots were making tons of spam accounts. If you want an avatar, you can do so by setting up a gravatar (see gravatar.com).
size_type is a type definition that resolves to an unsigned integral type, like unsigned int. The const declaration indicates that the function does not change the value of member variables, in this case the value of the string. Indicating a member function as constant makes the function accessible to constant objects of the class (since the function would not attempt to change the member variables of such const objects).
I don't know how to create accounts on this site, but Alex can help you with that.
Typo ("it's" should be "its"): "...by immediately setting the string to it’s final capacity:"
Fixed. Thanks!
Hello Alex,
I have been playing with string::size() and string::length() functions a bit and they both return number of bytes the string occupies, not the number of characters (g++ 4.6.3). I have discovered this only because our (Czech) alphabet has all kinds of weird characters, which take two bytes each. Anyway, GREAT tutorial, have learned a lot from it, thanks!!!
Lukas