11.7 — std::string_view (part 2)

Author’s note

Some of the content of this lesson was moved into the introduction to std::string_view lesson (4.18 -- Introduction to std::string_view). As a result, this lesson contains some duplicative content that has not been cleaned up yet. This will be addressed when this chapter is rewritten (soon).

In the previous lesson, we talked about C-style strings, and the dangers of using them. C-style strings are fast, but they’re not as easy to use and as safe as std::string.

But std::string (which we covered in lesson 4.17 -- Introduction to std::string), has some of its own downsides, particularly when it comes to const strings.

Consider the following example:

#include <iostream>
#include <string>

int main()
{
  char text[]{ "hello" };
  std::string str{ text };
  std::string more{ str };

  std::cout << text << ' ' << str << ' ' << more << '\n';

  return 0;
}

As expected, this prints

hello hello hello

Internally, main copies the string “hello” 3 times, resulting in 4 copies. First, there is the string literal “hello”, which is known at compile-time and stored in the binary. One copy is created when we create the char[]. The following two std::string objects create one copy of the string each.

So why does std::string make a copy of the string used to initialize it? A std::string provides some useful guarantees about the string data it manages:

  • The string data will be valid for as long as the string object is alive, and cleaned up when the string object dies.
  • The string’s value can only be modified by the string object.
  • The string’s value can be modified without affecting other objects (for non-const strings).

A std::string object has no control over what kind of string is used to initialize it (that is the caller’s responsibility) -- the caller could destroy or modify the initialization string immediately after the std::string object is initialized. Post-initialization, the string object can not rely on the initializer in any way, otherwise the above guarantees could be violated.

By making a copy of the initialization string (that only it has access to), the string object can ensure that the value and lifetime of the string data is independent from other objects (including the initialization string).

However, in some cases, we don’t need these benefits (particularly for const std::string objects, which can’t modify their value). Consider the case where we know a std::string won’t outlive it’s initialization string, and that the initialization string won’t be modified. In such cases, we’re paying a high cost (making a copy of the initialization string) for benefits we don’t need (independence from the initialization string).

Introducing std::string_view

Consider a window in your house, looking at a car sitting on the street. You can look through the window and see the car, but you can’t touch or move the car. Your window just provides a view to the car, which is a completely separate object.

C++17 introduces another way of using strings, std::string_view, which lives in the <string_view> header.

Unlike std::string, which keeps its own copy of the string, std::string_view provides a view of a string that is defined elsewhere.

We can re-write the above code to use std::string_view by replacing every std::string and C-style string with std::string_view.

#include <iostream>
#include <string_view>

int main()
{
  std::string_view text{ "hello" }; // view the text "hello", which is stored in the binary
  std::string_view str{ text }; // view of the same "hello"
  std::string_view more{ str }; // view of the same "hello"

  std::cout << text << ' ' << str << ' ' << more << '\n';

  return 0;
}

The output is the same, but no more copies of the string “hello” are created. text is only a view onto the string “hello”, so no copy has to be created. When we copy a std::string_view, the new std::string_view observes the same string as the copied-from std::string_view is observing. This means that neither str nor more create any copies of the string. They are views onto the existing string “hello”.

std::string_view is not only fast, but has many of the functions that we know from std::string.

#include <iostream>
#include <string_view>
 
int main()
{
  std::string_view str{ "Trains are fast!" };
 
  std::cout << str.length() << '\n'; // 16
  std::cout << str.substr(0, str.find(' ')) << '\n'; // Trains
  std::cout << (str == "Trains are fast!") << '\n'; // 1
 
  // Since C++20
  std::cout << str.starts_with("Boats") << '\n'; // 0
  std::cout << str.ends_with("fast!") << '\n'; // 1
 
  std::cout << str << '\n'; // Trains are fast!
 
  return 0;
}

Because std::string_view doesn’t create a copy of the string, if we change the viewed string, the changes are reflected in the std::string_view.

#include <iostream>
#include <string_view>

int main()
{
  char arr[]{ "Gold" };
  std::string_view str{ arr };

  std::cout << str << '\n'; // Gold

  // Change 'd' to 'f' in arr
  arr[3] = 'f';

  std::cout << str << '\n'; // Golf

  return 0;
}

We modified arr, but str appears to be changing as well. That’s because arr and str share their string. When you use a std::string_view, it’s best to avoid modifications to the underlying string for the remainder of the std::string_view‘s life to prevent confusion and errors.

Best practice

Prefer std::string_view over std::string and C-style strings when you only need read-only access to a string (unless you can’t guarantee the string being viewed will stay alive for the lifetime of the std::string_view, in which case you should prefer std::string).

View modification functions

Back to our window analogy, consider a window with curtains. We can close either the left or right curtain to reduce what we can see. We don’t change what’s outside, we just reduce the visible area.

Similarly, std::string_view contains functions that let us manipulate the view of the string. This allows us to change the view without modifying the viewed string.

The functions for this are remove_prefix, which removes characters from the left side of the view, and remove_suffix, which removes characters from the right side of the view.

#include <iostream>
#include <string_view>

int main()
{
  std::string_view str{ "Peach" };

  std::cout << str << '\n';

  // Ignore the first character.
  str.remove_prefix(1);

  std::cout << str << '\n';

  // Ignore the last 2 characters.
  str.remove_suffix(2);

  std::cout << str << '\n';

  return 0;
}

This program produces the following output:

Peach
each
ea

Unlike real curtains, a std::string_view cannot be opened back up. Once you shrink the area, the only way to re-widen it is to reset the view by reassigning the source string to it again.

std::string_view works with non-null-terminated strings

Unlike C-style strings and std::string, std::string_view doesn’t use null terminators to mark the end of the string. Rather, it knows where the string ends because it keeps track of its length.

#include <iostream>
#include <iterator> // For std::size
#include <string_view>

int main()
{
  // No null-terminator.
  char vowels[]{ 'a', 'e', 'i', 'o', 'u' };

  // vowels isn't null-terminated. We need to pass the length manually.
  // Because vowels is an array, we can use std::size to get its length.
  std::string_view str{ vowels, std::size(vowels) };

  std::cout << str << '\n'; // This is safe. std::cout knows how to print std::string_view.

  return 0;
}

This program prints:

aeiou

Converting a std::string_view to a C-style string

Some old functions (such as the old strlen function) still expect C-style strings. To convert a std::string_view to a C-style string, we can do so by first converting to a std::string:

#include <cstring>
#include <iostream>
#include <string>
#include <string_view>

int main()
{
  std::string_view sv{ "balloon" };

  sv.remove_suffix(3);

  // Create a std::string from the std::string_view
  std::string str{ sv };

  // Get the null-terminated C-style string.
  auto szNullTerminated{ str.c_str() };

  // Pass the null-terminated string to the function that we want to use.
  std::cout << str << " has " << std::strlen(szNullTerminated) << " letter(s)\n";

  return 0;
}

This prints:

ball has 4 letter(s)

However, creating a std::string every time we want to pass a std::string_view as a C-style string is expensive, so this should be avoided if possible.

Prefer std::string_view function parameters (over const std::string&)

One question that comes up often in modern C++: when writing a function that has a string parameter, should the type of the parameter be const std::string& or std::string_view?

In most cases, std::string_view is the better choice, as it can handle a wider range of argument types efficiently. We’ll explore why this is the case in the next section.

void doSomething(const std::string&);
void doSomething(std::string_view);   // prefer this in most cases

There are a few cases where using a const std::string& parameter may be more appropriate. First, if you’re using C++14 or older, std::string_view isn’t available. Second, if your function needs to call some other function that takes a C-style string or std::string parameter, then const std::string& may be a better choice, as std::string_view is not guaranteed to be null-terminated (something that C-style string functions expect) and does not efficiently convert back to a std::string.

Best practice

Prefer passing strings using std::string_view (by value) instead of const std::string&, unless your function calls other functions that require C-style strings or std::string parameters.

Author’s note

Many examples in future lessons were written prior to the introduction of std::string_view, and still use const std::string& for function parameters when std::string_view should be preferred. We’ve converted most of these to use std::string_view, but if you find any others, please leave a comment on the relevant lesson.

Why std::string_view parameters are more efficient than const std::string& (optional reading)

In C++, a string argument will typically be a std::string, a std::string_view, or a C-style string/string literal.

As reminders:

  • If the type of an argument does not match the type of the corresponding parameter, the compiler will try to implicitly convert the argument to match the type of the parameter.
  • Converting a value creates a temporary object of the converted type.
  • Creating (or copying) a std::string_view is inexpensive, as std::string_view does not make a copy of the string it is viewing.
  • Creating (or copying) a std::string can be expensive, as each std::string object makes a copy of the string.

Here’s a table showing what happens when we try to pass each type:

Argument Type std::string_view parameter const std::string& parameter
std::string Inexpensive conversion Inexpensive reference binding
std::string_view Inexpensive copy Won’t implicitly convert
Expensive explicit conversion to std::string
C-style string / literal Inexpensive conversion Expensive conversion

With a std::string_view value parameter:

  • If we pass in a std::string argument, the compiler will convert the std::string to a std::string_view, which is inexpensive, so this is fine.
  • If we pass in a std::string_view argument, the compiler will copy the argument into the parameter, which is inexpensive, so this is fine.
  • If we pass in a C-style string or string literal, the compiler will convert these to a std::string_view, which is inexpensive, so this is fine.

As you can see, std::string_view handles all three cases inexpensively.

With a const std::string& reference parameter:

  • If we pass in a std::string argument, the parameter will reference bind to the argument, which is inexpensive, so this is fine.
  • If we pass in a std::string_view argument, the compiler will refuse to do an implicit conversion, and produce a compilation error. We can use static_cast to do an explicit conversion (to std::string), but this conversion is expensive (since std::string will make a copy of the string being viewed). Once the conversion is done, the parameter will reference bind to the result, which is inexpensive. But we’ve made an expensive copy to do the conversion, so this isn’t great.
  • If we pass in a C-style string or string literal, the compiler will implicitly convert this to a std::string, which is expensive. So this isn’t great either.

Thus, a const std::string& parameter only handles std::string arguments inexpensively.

The same, in code form:

#include <iostream>
#include <string>
#include <string_view>

void printSV(std::string_view sv)
{
    std::cout << sv << '\n';
}

void printS(const std::string& s)
{
    std::cout << s << '\n';
}

int main()
{
    std::string s{ "Hello, world" };
    std::string_view sv { s };

    // Pass to `std::string_view` parameter
    printSV(s);              // ok: inexpensive conversion from std::string to std::string_view
    printSV(sv);             // ok: inexpensive copy of std::string_view
    printSV("Hello, world"); // ok: inexpensive conversion of C-style string literal to std::string_view

    // pass to `const std::string&` parameter
    printS(s);              // ok: inexpensive bind to std::string argument
    printS(sv);             // compile error: cannot implicit convert std::string_view to std::string
    printS(static_cast<std::string>(sv)); // bad: expensive creation of std::string temporary
    printS("Hello, world"); // bad: expensive creation of std::string temporary

    return 0;
}

Additionally, we need to consider the cost of accessing the parameter inside the function. Because a std::string_view parameter is a normal object, it can be accessed directly. Accessing a std::string& parameter is more expensive because the reference must be dereferenced to get to the object before the object can be accessed.

Ownership issues

A std::string_view‘s lifetime is independent of that of the string it is viewing (meaning the string being viewed can be destroyed before the std::string_view object). If this happens, then accessing the std::string_view will cause undefined behavior.

The string that a std::string_view is viewing has to have been created somewhere else. It might be a string literal that lives as long as the program does, or a std::string, in which case the string lives until the std::string decides to destroy it or the std::string dies.

std::string_view can’t create any strings on its own, because it’s just a view.

Here’s an example of a program that has an ownership issue:

#include <iostream>
#include <string>
#include <string_view>

std::string_view askForName()
{
  std::cout << "What's your name?\n";

  // Use a std::string, because std::cin needs to modify it.
  std::string name{};
  std::cin >> name;

  // We're switching to std::string_view for demonstrative purposes only.
  // If you already have a std::string, there's no reason to switch to
  // a std::string_view.
  std::string_view view{ name };

  std::cout << "Hello " << view << '\n';

  return view;
} // name dies, and so does the string that name created.

int main()
{
  std::string_view view{ askForName() };

  // view is observing a string that already died.
  std::cout << "Your name is " << view << '\n'; // Undefined behavior

  return 0;
}
What's your name?
nascardriver
Hello nascardriver
Your name is �[email protected][email protected]

In function askForName(), we create name and fill it with data from std::cin. Then we create view, which can view that string. At the end of the function, we return view, but the string it is viewing (name) is destroyed, so view is now pointing to deallocated memory. The function returns a dangling std::string_view.

Accessing the returned std::string_view in main causes undefined behavior, which on the author’s machine produced weird characters.

The same can happen when we create a std::string_view from a std::string and then modify the std::string. Modifying a std::string can cause its internal string to die and be replaced with a new one in a different place. The std::string_view will still look at where the old string was, but it’s not there anymore.

Warning

Make sure that the underlying string viewed with a std::string_view does not go out of scope and isn’t modified while using the std::string_view.

The data() function and non-null-terminated strings

The string being viewed by a std::string_view can be accessed by using the data() function, which returns a C-style string. This provides fast access to the string being viewed (as a C-string). But it should only be used in cases where we know the string being viewed is null-terminated.

In the following example, std::strlen expects a C-style string, so we need to pass it str.data():

#include <cstring> // For std::strlen
#include <iostream>
#include <string_view>

int main()
{
  std::string_view str{ "balloon" };

  std::cout << str << '\n';

  // We use std::strlen because it's simple, this could be any other function
  // that needs a null-terminated string.
  // It's okay to use data() because we haven't modified the view, and the
  // string is null-terminated.
  std::cout << std::strlen(str.data()) << '\n';

  return 0;
}
balloon
7

In the next example, we’ll show a case where the std::string_view is modified so that it is no longer viewing a null-terminated string:

#include <cstring>
#include <iostream>
#include <string_view>

int main()
{
  std::string_view str{ "balloon" };


  // remove the "b"
  str.remove_prefix(1);
  // remove the "oon"
  str.remove_suffix(3);
  // Remember that the above doesn't modify the string, it only changes
  // the region that str is observing.

  std::cout << str << " has " << std::strlen(str.data()) << " letter(s)\n";
  std::cout << "str is " << str << '\n';
  std::cout << "str.data() is " << str.data() << '\n';

  return 0;
}
all has 6 letter(s)
str is all
str.data() is alloon

Clearly this isn’t what we’d intended. First, we remove the prefix “b”, so str is viewing substring “alloon”. This substring is still null-terminated, so no problems yet. Next, we remove the suffix “oon”, so str is now viewing substring “all”. This substring is no longer null-terminated.

Thus, when we call the data() function, it returns a pointer to the first character in the substring being viewed, which is C-style string “alloon”. Thus when std::strlen() counts the number of characters, it returns 6 (“alloon”) instead of the expected 3 (“all”).

Warning

Only use std::string_view::data() if the std::string_view‘s view hasn’t been modified and the string being viewed is null-terminated. Using std::string_view::data() of a non-null-terminated string can cause undefined behavior.

Otherwise, convert your std::string_view to a std::string and call std::string::data(), which is guaranteed to be null-terminated.

Incomplete implementation

Being a relatively recent feature, std::string_view isn’t implemented as well as it could be.

std::string s{ "hello" };
std::string_view v{ "world" };

// Doesn't work
std::cout << (s + v) << '\n';
std::cout << (v + s) << '\n';

// Potentially unsafe, or not what we want, because we're treating
// the std::string_view as a C-style string.
std::cout << (s + v.data()) << '\n';
std::cout << (v.data() + s) << '\n';

// Ok, but ugly and wasteful because we have to construct a new std::string.
std::cout << (s + std::string{ v }) << '\n';
std::cout << (std::string{ v } + s) << '\n';
std::cout << (s + static_cast<std::string>(v)) << '\n';
std::cout << (static_cast<std::string>(v) + s) << '\n';

There’s no reason why line 5 and 6 shouldn’t work. They will probably be supported in a future C++ version.

guest
Your email address will not be displayed
Correction-related comments will be deleted after processing to help reduce clutter. Thanks for helping to make the site better for everyone!
Avatars from https://gravatar.com/ are connected to your provided email address.
Notify me about replies:  
58 Comments
Newest
Oldest Most Voted
Inline Feedbacks
View all comments