22.1 — std::string and std::wstring

The standard library contains many useful classes -- but perhaps the most useful is std::string. std::string (and std::wstring) is a string class that provides many operations to assign, compare, and modify strings. In this chapter, we’ll look into these string classes in depth.

Note: C-style strings will be referred to as “C-style strings”, whereas std::string (and std::wstring) will be referred to simply as “strings”.

Motivation for a string class

In a previous lesson, we covered C-style strings, which use char arrays to store a string of characters. If you’ve tried to do anything with C-style strings, you’ll very quickly come to the conclusion that they are a pain to work with, easy to mess up, and hard to debug.

C-style strings have many shortcomings, primarily revolving around the fact that you have to do all the memory management yourself. For example, if you want to assign the string “hello!” into a buffer, you have to first dynamically allocate a buffer of the correct length:

Don’t forget to account for an extra character for the null terminator!

Then you have to actually copy the value in:

Hopefully you made your buffer large enough so there’s no buffer overflow!

And of course, because the string is dynamically allocated, you have to remember to deallocate it properly when you’re done with it:

Don’t forget to use array delete instead of normal delete!
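
The snippets that originally accompanied these three steps are not reproduced here. As a minimal sketch of the same sequence (the helper names copyCString and releaseCString are hypothetical, chosen only for illustration), the allocate, copy, and deallocate steps might look like this:

```cpp
#include <cstddef>  // std::size_t
#include <cstring>  // std::strlen, std::strcpy

// Hypothetical helper (not part of the lesson): returns a heap-allocated
// copy of a C-style string. The caller owns the buffer and must release
// it with delete[].
char* copyCString(const char* source)
{
    // Step 1: allocate room for every character plus the null terminator.
    const std::size_t bufferSize{ std::strlen(source) + 1 };
    char* buffer{ new char[bufferSize] };

    // Step 2: copy the characters, including the terminator, into the buffer.
    std::strcpy(buffer, source);

    return buffer;
}

// Step 3 is the caller's job: array delete, not plain delete.
void releaseCString(char* buffer)
{
    delete[] buffer;
}
```

Calling copyCString("hello!") allocates 7 chars (6 characters plus the terminator). Forgetting the matching releaseCString() call leaks the buffer, which is exactly the kind of bookkeeping std::string takes off your hands.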

Furthermore, many of the intuitive operators that C provides to work with numbers, such as assignment and comparisons, simply don’t work with C-style strings. Sometimes these will appear to work but actually produce incorrect results -- for example, comparing two C-style strings using == will actually do a pointer comparison, not a string comparison. Assigning one C-style string to another using operator= will appear to work at first, but is actually doing a pointer copy (shallow copy), which is not generally what you want. These kinds of things can lead to program crashes that are very hard to find and debug!
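
To make the comparison pitfall concrete, here is a small sketch. The function names are hypothetical and exist only to contrast the three behaviours:

```cpp
#include <cstring>  // std::strcmp
#include <string>

// Compares addresses only: true just when both pointers refer to the same
// buffer, regardless of the characters stored there.
bool samePointer(const char* a, const char* b)
{
    return a == b;
}

// Compares the characters themselves, the way strcmp() is meant to be used.
bool sameCharacters(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}

// With std::string, == already compares contents.
bool sameString(const std::string& a, const std::string& b)
{
    return a == b;
}
```

Given two separate char arrays that both hold "hello", samePointer() reports false because the arrays live at different addresses, while sameCharacters() and sameString() both report true.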

The bottom line is that working with C-style strings requires remembering a lot of nit-picky rules about what is safe/unsafe, memorizing a bunch of functions that have funny names like strcat() and strcmp() instead of using intuitive operators, and doing lots of manual memory management.

Fortunately, C++ and the standard library provide a much better way to deal with strings: the std::string and std::wstring classes. By making use of C++ concepts such as constructors, destructors, and operator overloading, std::string allows you to create and manipulate strings in an intuitive and safe manner! No more memory management, no more weird function names, and a much reduced potential for disaster.

Sign me up!

String overview

All string functionality in the standard library lives in the <string> header file. To use it, simply include the string header:
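
A minimal sketch of what that looks like in practice follows. The function makeGreeting is a hypothetical example, not something the header provides:

```cpp
#include <string>  // std::string, std::wstring

// Hypothetical helper: builds a greeting without any manual memory management.
std::string makeGreeting(const std::string& name)
{
    std::string greeting{ "Hello, " };  // the constructor allocates as needed
    greeting += name;                   // operator+= grows the string
    greeting += '!';
    return greeting;                    // no delete[] needed anywhere
}
```

For example, makeGreeting("Alex") yields the string "Hello, Alex!".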

There are 3 string types of interest in the string header. The first is a class template named basic_string<>:
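
The original declaration listing is not reproduced here. As a sketch, the comment below shows the general shape of the declaration (parameter names differ between implementations), and the static_assert checks that the second and third template parameters really do have defaults:

```cpp
#include <memory>       // std::allocator
#include <string>       // std::basic_string, std::char_traits
#include <type_traits>  // std::is_same

// Shape of the declaration in <string> (parameter names vary by implementation):
//
//   namespace std
//   {
//       template<class charT,
//                class traits = char_traits<charT>,
//                class Allocator = allocator<charT>>
//       class basic_string;
//   }

// Because of the defaults, naming only the character type gives the same
// type as spelling out all three template arguments.
static_assert(
    std::is_same<std::basic_string<char>,
                 std::basic_string<char, std::char_traits<char>, std::allocator<char>>>::value,
    "the traits and allocator parameters have defaults");
```

Template parameters can have default values just like function parameters, which is why only the character type normally needs to be supplied.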

You won’t be working with this class directly, so don’t worry about what traits or an Allocator is for the time being. The default values will suffice in almost every imaginable case.

There are two commonly used flavors of basic_string<> provided by the standard library:
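
The original listing is not reproduced here. The sketch below shows the two aliases in a comment and verifies them at compile time; characterCount is a hypothetical helper added only to show one function working with both flavors:

```cpp
#include <cstddef>      // std::size_t
#include <string>
#include <type_traits>  // std::is_same

// The two aliases, in the form the standard describes them:
//
//   namespace std
//   {
//       typedef basic_string<char> string;
//       typedef basic_string<wchar_t> wstring;
//   }

static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is basic_string<wchar_t>");

// Hypothetical helper: the same member function works for either alias
// because both come from the one class template.
template <typename CharT>
std::size_t characterCount(const std::basic_string<CharT>& text)
{
    return text.length();
}
```

Passing either std::string{ "hello" } or std::wstring{ L"hello" } to characterCount() returns 5.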

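These are the two classes that you will actually use. std::string is used for standard ASCII and UTF-8 strings. std::wstring is used for wide-character strings; the size and encoding of wchar_t are implementation-defined (typically UTF-16 on Windows and UTF-32 on Linux and most other Unix-like systems). Since C++11, the standard library also provides std::u16string and std::u32string for UTF-16 and UTF-32 text, so there is no need to extend your own class from basic_string<> for those.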

Although you will directly use std::string and std::wstring, all of the string functionality is implemented in the basic_string<> class template. std::string and std::wstring have access to that functionality because they are simply type aliases for basic_string<char> and basic_string<wchar_t>. Consequently, all of the functions presented will work for both string and wstring. However, because basic_string is a class template, the compiler will often produce horrible-looking template errors when you do something incorrect with a string or wstring. Don’t be intimidated by these errors; they look far worse than they are!

Here’s a list of the main functions in the string class. Most of these functions have multiple flavors to handle different types of inputs, which we will cover in more depth in the next lessons.

Function                     Effect

Creation and destruction
(constructor)                Create or copy a string
(destructor)                 Destroy a string

Size and capacity
capacity()                   Returns the number of characters that can be held without reallocation
empty()                      Returns a boolean indicating whether the string is empty
length(), size()             Returns the number of characters in string
max_size()                   Returns the maximum string size that can be allocated
reserve()                    Expand or shrink the capacity of the string (since C++20, reserve() can only expand it)

Element access
[], at()                     Accesses the character at a particular index

Modification
=, assign()                  Assigns a new value to the string
+=, append(), push_back()    Concatenates characters to end of the string
insert()                     Inserts characters at an arbitrary index in string
clear()                      Delete all characters in the string
erase()                      Erase characters at an arbitrary index in string
replace()                    Replace characters at an arbitrary index with other characters
resize()                     Expand or shrink the string (truncates or adds characters at end of string)
swap()                       Swaps the value of two strings

Input and Output
>>, getline()                Reads values from the input stream into the string
<<                           Writes string value to the output stream
c_str()                      Returns the contents of the string as a NULL-terminated C-style string
copy()                       Copies contents (not NULL-terminated) to a character array
data()                       Same as c_str(). The non-const overload allows writing to the returned string.

String comparison
==, !=                       Compares whether two strings are equal/unequal (returns bool)
<, <=, >, >=                 Compares whether two strings are less than / greater than each other (returns bool)
compare()                    Compares two strings (returns a negative value, 0, or a positive value)

Substrings and concatenation
+                            Concatenates two strings
substr()                     Returns a substring

Searching
find()                       Find index of first character/substring
find_first_of()              Find index of first character from a set of characters
find_first_not_of()          Find index of first character not from a set of characters
find_last_of()               Find index of last character from a set of characters
find_last_not_of()           Find index of last character not from a set of characters
rfind()                      Find index of last character/substring

Iterator and allocator support
begin(), end()               Forward-direction iterator support for beginning/end of string
get_allocator()              Returns the allocator
rbegin(), rend()             Reverse-direction iterator support for beginning/end of string
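
To give a feel for how a few of these functions combine, here is a short sketch. The helper names (demoModification, contains, firstWord) are hypothetical and serve only as illustration; later lessons cover each function properly:

```cpp
#include <string>

// Hypothetical helper that chains several modification functions.
std::string demoModification()
{
    std::string text{ "std::string" };
    text.append(" rocks");                         // "std::string rocks"
    text.insert(0, "Yes, ");                       // "Yes, std::string rocks"
    text.replace(text.find("rocks"), 5, "works");  // "Yes, std::string works"
    text.erase(0, 5);                              // "std::string works"
    return text;
}

// find() returns std::string::npos when nothing matches.
bool contains(const std::string& text, const std::string& needle)
{
    return text.find(needle) != std::string::npos;
}

// substr(start, count) copies up to count characters beginning at index start.
// If no space is found, find() returns npos and substr() copies to the end.
std::string firstWord(const std::string& text)
{
    return text.substr(0, text.find(' '));
}
```

The two arguments to substr() are a starting index and a character count, so firstWord("hello world") returns "hello".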


While the standard library string classes provide a lot of functionality, there are a few notable omissions:

  • Regular expression support
  • Constructors for creating strings from numbers
  • Capitalization / upper case / lower case functions
  • Case-insensitive comparisons
  • Tokenization / splitting string into array
  • Easy functions for getting the left or right hand portion of string
  • Whitespace trimming
  • Formatting a string sprintf style
  • Conversion from utf-8 to utf-16 or vice-versa

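For most of these, you will have to either write your own functions, or convert your string to a C-style string (using c_str()) and use the C functions that offer this functionality. Newer standards fill some of these gaps outside the string classes themselves: for example, std::to_string() (C++11) converts numbers to strings, the <regex> header (C++11) adds regular expressions, and std::format() (C++20) provides formatted output.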
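
As a rough illustration of the "write your own functions" route, the sketch below fills two of the gaps listed above. Both helpers are hypothetical, handle only plain ASCII text, and treat only spaces and tabs as whitespace:

```cpp
#include <algorithm>  // std::transform
#include <cctype>     // std::toupper
#include <cstddef>    // std::size_t
#include <string>

// Hypothetical helper: returns an upper-case copy of an ASCII string.
std::string toUpper(std::string text)
{
    // std::toupper expects a value representable as unsigned char,
    // so the lambda takes its argument as unsigned char.
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::toupper(ch)); });
    return text;
}

// Hypothetical helper: strips leading and trailing spaces and tabs.
std::string trim(const std::string& text)
{
    const char* whitespace{ " \t" };
    const std::size_t first{ text.find_first_not_of(whitespace) };
    if (first == std::string::npos)
        return "";  // empty, or nothing but whitespace
    const std::size_t last{ text.find_last_not_of(whitespace) };
    return text.substr(first, last - first + 1);
}
```

Note that the result of find_first_not_of() is stored in a std::size_t and compared against std::string::npos; storing it in an int instead can hide the "not found" case.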

In the next lessons, we will look at the various functions of the string class in more depth. Although we will use string for our examples, everything is equally applicable to wstring.


92 comments to 22.1 — std::string and std::wstring

  • yeokaiwei

    Hi Alex,
    I've come across libraries that use strcpy and I've read that string copy is bad.

    "Second, they could exploit a strcpy bug in the car’s Bluetooth interface provided to support hands-free dialing. Exploiting this bug requires pairing an attacking device to the car, a process that can be brute-forced at the rate of eight to nine PINs per minute. "

    How do we replace/rewrite library code that uses strcpy?

  • yeokaiwei

    1. Feedback on Chapter Ordering
    I'm a little confused over the ordering of Chapters.

    I've done the course in sequential order from Chapter 1 to 21 and I think we start using these functions even before they are explained here.

    It seems like a good idea to read the explanations/definitions before using the functions.

    It explains the benefits of C++ std::string vs C-style strings.

  • pooyan

    I think gcc on Linux uses UTF32BE for wchar_t and thus std::wstring.
    On Windows, UTF16LE is used for wchar_t and thus std::wstring.
    The implementation is allowed to encode wide characters in any encoding.
    Thanks for the great tutorial.

  • Jim

    Do I need to know object-oriented programming in order to use the std::string functions? And in general, which chapters can be skipped to learn only traditional programming?

  • Thank you

    Hi nascardriver! hope you're having a great day.
    can you explain how the following line works?
    I didn't see anything like this in previous tutorials on templates before

    • nascardriver

      Template parameters can have default values

  • Ayrton Fithiadi Sedjati

    Hi Alex (or nascardriver),

    Just below the first paragraph you wrote:

    "Note: C-style strings will be referred to as “C-style strings”, whereas std::string*s* (and std::wstring) will be referred to simply as “strings”."

    Is the "std::string*s*" a typo or is it intentional?

  • very informative thank you bro

  • Louis Cloete

    After reading this chapter and searching online, I got the idea that std::basic_string doesn't inherit from std::vector. Any reason why not? I would've thought this was an ideal case of vector specialization.

    • Louis Cloete

      I am going to answer myself: std::basic_string shouldn't inherit from std::vector; if it has anything to do with vectors, it should have a std::vector member and wrap thinly around it for similar methods. It would seem, however, that all the functionality similar to std::vector is reimplemented specifically, so I would guess that there are performance benefits to be had from tailoring the std::basic_string implementation specifically rather than inheriting from std::vector.

  • Louis Cloete

    Hi Alex!

    Two things:

    1) ASCII and UTF-8 are not synonyms. ASCII has only 128 code points. UTF-8 can encode the full Unicode range of more than 100 000 characters. UTF-8 just drops between 1 and 3 bytes from the char if it is 0x00 (or something like that). So a Unicode char needing four bytes to represent is represented by four bytes in a UTF-8 string. A Unicode char needing 1 byte only uses 1 byte in a UTF-8 string. The thing is, a UTF-8 string can contain any Unicode char. The char width just isn't the same for every char.

    2) Some methods/functions in the table are missing parentheses after them (c_str() and all in the "searching" section).

  • Constructor

    Wait, we don't *have* to dynamically allocate a C-style string, right?
    (as this chapter complains about C-style strings' memory management)

    We can just use a static one \/

    Or do they have to be memory managed as well (for some reason not mentioned before)?
    I'm a bit confused.

  • they’ve been unbelievably useful for the app I’m working on at the moment. Just one thing I noticed, the Function/Effect tables Size and Capacity section seems to be a little muddled up.

  • sam

    Typo: "which using char arrays to store" --> "which uses char arrays to store"

  • XnossisX

    I have been trying out some code that tries to find the spot where a sequence of spaces stop.
    I've tried to use find_first_not_of(' '), but when I run the program, it outputs the 32-bit limit, and I don't know why.
    Specifically, I've been using:
    int x=something.std::string::find_first_not_of(' ');

    • Hi XnossisX!

      The following code prints
      First letter is at 3

      • XnossisX

        I tried that, but it still outputted the size_t limit.
        Does the space character work when you're inputting a string from the user?
        I'm also using Visual Studio to test this.

        • If you're getting the input string via @std::cin::operator>>, leading whitespace will be skipped. @std::string::find_first_not_of should return 0 in that case. std::string::npos should only be returned if there are no characters that _don't_ match the searched sequence.
          Print your string to the console after reading it to make sure input is working correctly. If you're still having trouble, please post all the relevant code.

          • XnossisX

            I'm getting it through std::getline(std::cin,str) and the input does work correctly.
            The code is:

            std::string str;
            long index = 0;
            char thing() {
                return str[index];
            void dashSpace() {
                while (thing()==' ')

            • The following code works. Compare it to what you have to find out what's wrong.

              2 sample runs

              Unless you post your code (ready to compile) there's nothing more I can do for you.

  • R310

    I have a question about substr.
    You say it returns a substring. But what do the figures inside the parentheses mean?
