Author’s note
Some of the content of this lesson was moved into the introduction to std::string_view lesson (4.18 -- Introduction to std::string_view). As a result, this lesson contains some duplicative content that has not been cleaned up yet. This will be addressed when this chapter is rewritten (soon).
In the previous lesson, we talked about C-style strings and the dangers of using them. C-style strings are fast, but they’re neither as easy to use nor as safe as std::string.
But std::string (which we covered in lesson 4.17 -- Introduction to std::string) has some downsides of its own, particularly when it comes to const strings.
Consider the following example:
#include <iostream>
#include <string>
int main()
{
char text[]{ "hello" };
std::string str{ text };
std::string more{ str };
std::cout << text << ' ' << str << ' ' << more << '\n';
return 0;
}
As expected, this prints
hello hello hello
Internally, main copies the string “hello” 3 times, so 4 copies exist in total. First, there is the string literal “hello”, which is known at compile-time and stored in the binary. One copy is created when we create the char[]. Each of the two std::string objects then creates one more copy of the string.
So why does std::string
make a copy of the string used to initialize it? A std::string
provides some useful guarantees about the string data it manages:
- The string data will be valid for as long as the string object is alive, and cleaned up when the string object dies.
- The string’s value can only be modified by the string object.
- The string’s value can be modified without affecting other objects (for non-const strings).
A std::string
object has no control over what kind of string is used to initialize it (that is the caller’s responsibility) -- the caller could destroy or modify the initialization string immediately after the std::string
object is initialized. Post-initialization, the string object can not rely on the initializer in any way, otherwise the above guarantees could be violated.
By making a copy of the initialization string (that only it has access to), the string object can ensure that the value and lifetime of the string data is independent from other objects (including the initialization string).
However, in some cases, we don’t need these benefits (particularly for const std::string objects, which can’t modify their value). Consider the case where we know a std::string won’t outlive its initialization string, and that the initialization string won’t be modified. In such cases, we’re paying a high cost (making a copy of the initialization string) for benefits we don’t need (independence from the initialization string).
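The independence guarantee described above can be sketched as follows. This is a minimal illustration, and the function name copyIsIndependent is a hypothetical helper introduced here, not part of the lesson's code:

```cpp
#include <string>

// Hypothetical helper: returns true if a std::string keeps its own value
// after the array used to initialize it is modified.
bool copyIsIndependent()
{
    char text[]{ "hello" };
    std::string str{ text }; // str copies the characters of text

    text[0] = 'j'; // modify the initializer after str has been created

    return str == "hello"; // str still holds its own, unmodified copy
}
```

The copy is what makes this safe, and it is also the cost we pay even when the initializer is never modified.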
Introducing std::string_view
Consider a window in your house, looking at a car sitting on the street. You can look through the window and see the car, but you can’t touch or move the car. Your window just provides a view to the car, which is a completely separate object.
C++17 introduces another way of using strings, std::string_view
, which lives in the <string_view> header.
Unlike std::string
, which keeps its own copy of the string, std::string_view
provides a view of a string that is defined elsewhere.
We can re-write the above code to use std::string_view
by replacing every std::string
and C-style string
with std::string_view
.
#include <iostream>
#include <string_view>
int main()
{
std::string_view text{ "hello" }; // view the text "hello", which is stored in the binary
std::string_view str{ text }; // view of the same "hello"
std::string_view more{ str }; // view of the same "hello"
std::cout << text << ' ' << str << ' ' << more << '\n';
return 0;
}
The output is the same, but no more copies of the string “hello” are created. text
is only a view onto the string “hello”, so no copy has to be created. When we copy a std::string_view
, the new std::string_view
observes the same string as the copied-from std::string_view
is observing. This means that neither str
nor more
create any copies of the string. They are views onto the existing string “hello”.
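One way to check that copied views observe the same characters is to compare the pointers they hold. The sketch below assumes a hypothetical helper named copiesShareCharacters and uses the data() member function, which is covered later in this lesson:

```cpp
#include <string_view>

// Hypothetical helper: reports whether a copied view points at the same
// characters as the view it was copied from.
bool copiesShareCharacters()
{
    std::string_view text{ "hello" };
    std::string_view more{ text }; // copies the view, not the characters

    // data() returns a pointer to the first viewed character.
    return text.data() == more.data() && text.size() == more.size();
}
```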
std::string_view
is not only fast, but has many of the functions that we know from std::string
.
#include <iostream>
#include <string_view>
int main()
{
std::string_view str{ "Trains are fast!" };
std::cout << str.length() << '\n'; // 16
std::cout << str.substr(0, str.find(' ')) << '\n'; // Trains
std::cout << (str == "Trains are fast!") << '\n'; // 1
// Since C++20
std::cout << str.starts_with("Boats") << '\n'; // 0
std::cout << str.ends_with("fast!") << '\n'; // 1
std::cout << str << '\n'; // Trains are fast!
return 0;
}
Because std::string_view
doesn’t create a copy of the string, if we change the viewed string, the changes are reflected in the std::string_view
.
#include <iostream>
#include <string_view>
int main()
{
char arr[]{ "Gold" };
std::string_view str{ arr };
std::cout << str << '\n'; // Gold
// Change 'd' to 'f' in arr
arr[3] = 'f';
std::cout << str << '\n'; // Golf
return 0;
}
We modified arr
, but str
appears to be changing as well. That’s because arr
and str
share their string. When you use a std::string_view, it’s best to avoid modifications to the underlying string for the remainder of the std::string_view’s life to prevent confusion and errors.
Best practice
Prefer std::string_view
over std::string
and C-style strings when you only need read-only access to a string (unless you can’t guarantee the string being viewed will stay alive for the lifetime of the std::string_view
, in which case you should prefer std::string
).
View modification functions
Back to our window analogy, consider a window with curtains. We can close either the left or right curtain to reduce what we can see. We don’t change what’s outside, we just reduce the visible area.
Similarly, std::string_view
contains functions that let us manipulate the view of the string. This allows us to change the view without modifying the viewed string.
The functions for this are remove_prefix
, which removes characters from the left side of the view, and remove_suffix
, which removes characters from the right side of the view.
#include <iostream>
#include <string_view>
int main()
{
std::string_view str{ "Peach" };
std::cout << str << '\n';
// Ignore the first character.
str.remove_prefix(1);
std::cout << str << '\n';
// Ignore the last 2 characters.
str.remove_suffix(2);
std::cout << str << '\n';
return 0;
}
This program produces the following output:
Peach
each
ea
Unlike real curtains, a std::string_view
cannot be opened back up. Once you shrink the area, the only way to re-widen it is to reset the view by reassigning the source string to it again.
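Resetting a shrunken view might look like the sketch below. The function name shrinkThenReset and the variable original are hypothetical names chosen for illustration:

```cpp
#include <string_view>

// Hypothetical helper: shrinks a view, then restores it by reassigning
// a view of the original string.
std::string_view shrinkThenReset()
{
    constexpr std::string_view original{ "Peach" };

    std::string_view str{ original };
    str.remove_prefix(1); // str now views "each"
    str.remove_suffix(2); // str now views "ea"

    str = original; // reassign to view the whole string again
    return str;     // views "Peach" (a string literal, so it stays valid)
}
```

Keeping a second view of the full string around is inexpensive, since neither view owns or copies the characters.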
std::string_view works with non-null-terminated strings
Unlike C-style strings and std::string
, std::string_view
doesn’t use null terminators to mark the end of the string. Rather, it knows where the string ends because it keeps track of its length.
#include <iostream>
#include <iterator> // For std::size
#include <string_view>
int main()
{
// No null-terminator.
char vowels[]{ 'a', 'e', 'i', 'o', 'u' };
// vowels isn't null-terminated. We need to pass the length manually.
// Because vowels is an array, we can use std::size to get its length.
std::string_view str{ vowels, std::size(vowels) };
std::cout << str << '\n'; // This is safe. std::cout knows how to print std::string_view.
return 0;
}
This program prints:
aeiou
Converting a std::string_view
to a C-style string
Some old functions (such as the old strlen function) still expect C-style strings. To convert a std::string_view to a C-style string, we first convert it to a std::string:
#include <cstring>
#include <iostream>
#include <string>
#include <string_view>
int main()
{
std::string_view sv{ "balloon" };
sv.remove_suffix(3);
// Create a std::string from the std::string_view
std::string str{ sv };
// Get the null-terminated C-style string.
auto szNullTerminated{ str.c_str() };
// Pass the null-terminated string to the function that we want to use.
std::cout << str << " has " << std::strlen(szNullTerminated) << " letter(s)\n";
return 0;
}
This prints:
ball has 4 letter(s)
However, creating a std::string
every time we want to pass a std::string_view
as a C-style string is expensive, so this should be avoided if possible.
Prefer std::string_view
function parameters (over const std::string&
)
One question that comes up often in modern C++: when writing a function that has a string parameter, should the type of the parameter be const std::string&
or std::string_view
?
In most cases, std::string_view
is the better choice, as it can handle a wider range of argument types efficiently. We’ll explore why this is the case in the next section.
void doSomething(const std::string&);
void doSomething(std::string_view); // prefer this in most cases
There are a few cases where using a const std::string&
parameter may be more appropriate. First, if you’re using C++14 or older, std::string_view
isn’t available. Second, if your function needs to call some other function that takes a C-style string or std::string
parameter, then const std::string&
may be a better choice, as std::string_view
is not guaranteed to be null-terminated (something that C-style string functions expect) and does not efficiently convert back to a std::string
.
Best practice
Prefer passing strings using std::string_view
(by value) instead of const std::string&
, unless your function calls other functions that require C-style strings or std::string
parameters.
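As a rough sketch of this practice, the function below takes its string by std::string_view and accepts all three common argument types. The names countSpaces and countSpacesInAll are hypothetical and exist only for this illustration:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical example function: counts the spaces in a string.
// It only reads the string, so a std::string_view parameter fits.
std::size_t countSpaces(std::string_view text)
{
    std::size_t count{ 0 };
    for (char c : text)
    {
        if (c == ' ')
            ++count;
    }
    return count;
}

// Hypothetical caller showing the three common argument types.
std::size_t countSpacesInAll()
{
    std::string s{ "Trains are fast!" };
    std::string_view sv{ s };

    return countSpaces(s)                   // std::string argument
         + countSpaces(sv)                  // std::string_view argument
         + countSpaces("Trains are fast!"); // string literal argument
}
```

None of the three calls copies the characters of the string being passed.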
Author’s note
Many examples in future lessons were written prior to the introduction of std::string_view
, and still use const std::string&
for function parameters when std::string_view
should be preferred. We’ve converted most of these to use std::string_view
, but if you find any others, please leave a comment on the relevant lesson.
Why std::string_view
parameters are more efficient than const std::string&
(optional reading)
In C++, a string argument will typically be a std::string
, a std::string_view
, or a C-style string/string literal.
As reminders:
- If the type of an argument does not match the type of the corresponding parameter, the compiler will try to implicitly convert the argument to match the type of the parameter.
- Converting a value creates a temporary object of the converted type.
- Creating (or copying) a std::string_view is inexpensive, as std::string_view does not make a copy of the string it is viewing.
- Creating (or copying) a std::string can be expensive, as each std::string object makes a copy of the string.
Here’s a table showing what happens when we try to pass each type:
Argument Type | std::string_view parameter | const std::string& parameter |
---|---|---|
std::string | Inexpensive conversion | Inexpensive reference binding |
std::string_view | Inexpensive copy | Won’t implicitly convert; explicit conversion to std::string is expensive |
C-style string / literal | Inexpensive conversion | Expensive conversion |
With a std::string_view
value parameter:
- If we pass in a std::string argument, the compiler will convert the std::string to a std::string_view, which is inexpensive, so this is fine.
- If we pass in a std::string_view argument, the compiler will copy the argument into the parameter, which is inexpensive, so this is fine.
- If we pass in a C-style string or string literal, the compiler will convert these to a std::string_view, which is inexpensive, so this is fine.
As you can see, std::string_view
handles all three cases inexpensively.
With a const std::string&
reference parameter:
- If we pass in a std::string argument, the parameter will reference bind to the argument, which is inexpensive, so this is fine.
- If we pass in a std::string_view argument, the compiler will refuse to do an implicit conversion, and produce a compilation error. We can use static_cast to do an explicit conversion (to std::string), but this conversion is expensive (since std::string will make a copy of the string being viewed). Once the conversion is done, the parameter will reference bind to the result, which is inexpensive. But we’ve made an expensive copy to do the conversion, so this isn’t great.
- If we pass in a C-style string or string literal, the compiler will implicitly convert this to a std::string, which is expensive. So this isn’t great either.
Thus, a const std::string&
parameter only handles std::string
arguments inexpensively.
The same, in code form:
#include <iostream>
#include <string>
#include <string_view>
void printSV(std::string_view sv)
{
std::cout << sv << '\n';
}
void printS(const std::string& s)
{
std::cout << s << '\n';
}
int main()
{
std::string s{ "Hello, world" };
std::string_view sv { s };
// Pass to `std::string_view` parameter
printSV(s); // ok: inexpensive conversion from std::string to std::string_view
printSV(sv); // ok: inexpensive copy of std::string_view
printSV("Hello, world"); // ok: inexpensive conversion of C-style string literal to std::string_view
// pass to `const std::string&` parameter
printS(s); // ok: inexpensive bind to std::string argument
// printS(sv); // compile error (commented out so the program builds): cannot implicitly convert std::string_view to std::string
printS(static_cast<std::string>(sv)); // bad: expensive creation of std::string temporary
printS("Hello, world"); // bad: expensive creation of std::string temporary
return 0;
}
Additionally, we need to consider the cost of accessing the parameter inside the function. Because a std::string_view parameter is a normal object, it can be accessed directly. Accessing a const std::string& parameter can be slightly more expensive, because the reference must be dereferenced to get to the object before the object can be accessed.
Ownership issues
A std::string_view’s lifetime is independent of that of the string it is viewing (meaning the string being viewed can be destroyed before the std::string_view object). If this happens, then accessing the std::string_view will cause undefined behavior.
The string that a std::string_view
is viewing has to have been created somewhere else. It might be a string literal that lives as long as the program does, or a std::string
, in which case the string lives until the std::string
decides to destroy it or the std::string
dies.
std::string_view
can’t create any strings on its own, because it’s just a view.
Here’s an example of a program that has an ownership issue:
#include <iostream>
#include <string>
#include <string_view>
std::string_view askForName()
{
std::cout << "What's your name?\n";
// Use a std::string, because std::cin needs to modify it.
std::string name{};
std::cin >> name;
// We're switching to std::string_view for demonstrative purposes only.
// If you already have a std::string, there's no reason to switch to
// a std::string_view.
std::string_view view{ name };
std::cout << "Hello " << view << '\n';
return view;
} // name dies, and so does the string that name created.
int main()
{
std::string_view view{ askForName() };
// view is observing a string that already died.
std::cout << "Your name is " << view << '\n'; // Undefined behavior
return 0;
}
What's your name?
nascardriver
Hello nascardriver
Your name is ��
In function askForName()
, we create name
and fill it with data from std::cin
. Then we create view
, which can view that string. At the end of the function, we return view
, but the string it is viewing (name
) is destroyed, so view
is now pointing to deallocated memory. The function returns a dangling std::string_view
.
Accessing the returned std::string_view
in main
causes undefined behavior, which on the author’s machine produced weird characters.
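One way to avoid the dangling view is to return the std::string itself, so the caller owns the data. The sketch below is an assumption-laden variant of askForName(): the names askForNameOwned and demoAskForName are hypothetical, and the input stream is passed as a parameter (rather than reading std::cin directly) so that the example can run with canned input:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variant of askForName() that returns the std::string itself.
// Taking the stream as a parameter is an assumption made for this sketch.
std::string askForNameOwned(std::istream& in)
{
    std::string name{};
    in >> name;
    return name; // the caller receives a string that it owns
}

// Hypothetical usage with canned input instead of std::cin.
std::string demoAskForName()
{
    std::istringstream input{ "nascardriver" };
    std::string name{ askForNameOwned(input) };
    return name; // name is still valid here, because it owns its characters
}
```

Returning by value costs a move (or nothing, if the compiler elides it), which is a small price for a string that cannot dangle.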
The same can happen when we create a std::string_view
from a std::string
and then modify the std::string
. Modifying a std::string
can cause its internal string to die and be replaced with a new one in a different place. The std::string_view
will still look at where the old string was, but it’s not there anymore.
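If the std::string must be modified, a defensive pattern is to create a fresh view afterwards instead of reusing the old one. The following is a minimal sketch; the function name viewRecreatedAfterModification is hypothetical:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: modifies a std::string, then re-creates the view
// before using it again.
bool viewRecreatedAfterModification()
{
    std::string s{ "Hello" };
    std::string_view view{ s }; // views "Hello"

    // Appending may reallocate the string's storage. After this line,
    // the old view must not be used, because it may be dangling.
    s += ", world and everyone else in it";

    view = s; // create a fresh view of the modified string before using it
    return view == "Hello, world and everyone else in it";
}
```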
Warning
Make sure that the underlying string viewed with a std::string_view
does not go out of scope and isn’t modified while using the std::string_view
.
The data() function and non-null-terminated strings
The string being viewed by a std::string_view can be accessed by using the data() function, which returns a pointer to the first character of the viewed string (a const char*). This provides fast access to the string being viewed (as a C-string). But it should only be passed to code that expects a C-style string in cases where we know the string being viewed is null-terminated.
In the following example, std::strlen
expects a C-style string, so we need to pass it str.data()
:
#include <cstring> // For std::strlen
#include <iostream>
#include <string_view>
int main()
{
std::string_view str{ "balloon" };
std::cout << str << '\n';
// We use std::strlen because it's simple, this could be any other function
// that needs a null-terminated string.
// It's okay to use data() because we haven't modified the view, and the
// string is null-terminated.
std::cout << std::strlen(str.data()) << '\n';
return 0;
}
balloon
7
In the next example, we’ll show a case where the std::string_view
is modified so that it is no longer viewing a null-terminated string:
#include <cstring>
#include <iostream>
#include <string_view>
int main()
{
std::string_view str{ "balloon" };
// remove the "b"
str.remove_prefix(1);
// remove the "oon"
str.remove_suffix(3);
// Remember that the above doesn't modify the string, it only changes
// the region that str is observing.
std::cout << str << " has " << std::strlen(str.data()) << " letter(s)\n";
std::cout << "str is " << str << '\n';
std::cout << "str.data() is " << str.data() << '\n';
return 0;
}
all has 6 letter(s)
str is all
str.data() is alloon
Clearly this isn’t what we’d intended. First, we remove the prefix “b”, so str
is viewing substring “alloon”. This substring is still null-terminated, so no problems yet. Next, we remove the suffix “oon”, so str
is now viewing substring “all”. This substring is no longer null-terminated.
Thus, when we call the data()
function, it returns a pointer to the first character in the substring being viewed, which is C-style string “alloon”. Thus when std::strlen()
counts the number of characters, it returns 6
(“alloon”) instead of the expected 3
(“all”).
Warning
Only use std::string_view::data() as a C-style string if the std::string_view’s view hasn’t been modified and the string being viewed is null-terminated. Passing the std::string_view::data() of a non-null-terminated string to a function that expects a C-style string can cause undefined behavior.
Otherwise, convert your std::string_view
to a std::string
and call std::string::data()
, which is guaranteed to be null-terminated.
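Applied to the “balloon” example, that conversion could be sketched as below. The names lengthViaCString and demoLength are hypothetical, and std::strlen again stands in for any function that needs a null-terminated string:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: measures a view with std::strlen by copying it
// into a std::string first.
std::size_t lengthViaCString(std::string_view sv)
{
    std::string copy{ sv };           // copies only the viewed characters
    return std::strlen(copy.c_str()); // c_str() is null-terminated
}

// Hypothetical usage: the shrunken view from the example above.
std::size_t demoLength()
{
    std::string_view str{ "balloon" };
    str.remove_prefix(1); // str views "alloon"
    str.remove_suffix(3); // str views "all"
    return lengthViaCString(str);
}
```

Here demoLength() yields 3 rather than 6, at the cost of one string copy.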
Incomplete implementation
Being a relatively recent feature, std::string_view
isn’t implemented as well as it could be.
#include <iostream>
#include <string>
#include <string_view>
int main()
{
    std::string s{ "hello" };
    std::string_view v{ "world" };
    // Doesn't work (compile error before C++26), so these are commented out:
    // std::cout << (s + v) << '\n';
    // std::cout << (v + s) << '\n';
    // Potentially unsafe, or not what we want, because we're treating
    // the std::string_view as a C-style string.
    std::cout << (s + v.data()) << '\n';
    std::cout << (v.data() + s) << '\n';
    // Ok, but ugly and wasteful because we have to construct a new std::string.
    std::cout << (s + std::string{ v }) << '\n';
    std::cout << (std::string{ v } + s) << '\n';
    std::cout << (s + static_cast<std::string>(v)) << '\n';
    std::cout << (static_cast<std::string>(v) + s) << '\n';
    return 0;
}
There’s no reason why the two lines marked “Doesn’t work” shouldn’t work. Overloads of operator+ that combine a std::string with a std::string_view have been accepted into the C++26 draft, so these lines should compile once your compiler and standard library support that version.