6.6 — C-style strings

In lesson 4.4b -- An introduction to std::string, we defined a string as a collection of sequential characters, such as “Hello, world!”. Strings are the primary way in which we work with text in C++, and std::string makes working with strings in C++ easy.

Modern C++ supports two different types of strings: std::string (as part of the standard library), and C-style strings (natively, as inherited from the C language). It turns out that std::string is implemented using C-style strings. In this lesson, we’ll take a closer look at C-style strings.

C-style strings

A C-style string is simply an array of characters that uses a null terminator. A null terminator is a special character (‘\0’, ASCII code 0) used to indicate the end of the string. More generically, a C-style string is called a null-terminated string.

To define a C-style string, simply declare a char array and initialize it with a string literal:

Although “string” only has 6 letters, C++ automatically adds a null terminator to the end of the string for us (we don’t need to include it ourselves). Consequently, myString is actually an array of length 7!

We can see the evidence of this in the following program, which prints out the length of the string, and then the ASCII values of all of the characters:

This produces the result:

string has 7 characters.
115 116 114 105 110 103 0

That 0 is the ASCII code of the null terminator that has been appended to the end of the string.

When declaring strings in this manner, it is a good idea to use [] and let the compiler calculate the length of the array. That way if you change the string later, you won’t have to manually adjust the array length.

One important point to note is that C-style strings follow all the same rules as arrays. This means you can initialize the string upon creation, but you cannot assign values to it using the assignment operator after that!

Since C-style strings are arrays, you can use the [] operator to change individual characters in the string:

This program prints:

spring

When printing a C-style string, std::cout prints characters until it encounters the null terminator. If you accidentally overwrite the null terminator in a string (e.g. by assigning something to myString[6]), you’ll not only get all the characters in the string, but std::cout will keep reading adjacent memory until it happens to hit a 0. Reading past the end of the array is undefined behavior, so the program may print garbage or crash.

Note that it’s fine if the array is larger than the string it contains:

In this case, the string “Alex” will be printed, and std::cout will stop at the null terminator. The rest of the characters in the array are ignored.

C-style strings and std::cin

There are many cases where we don’t know in advance how long our string is going to be. For example, consider the problem of writing a program where we need to ask the user to enter their name. How long is their name? We don’t know until they enter it!

In this case, we can declare an array larger than we need:

In the above program, we’ve allocated an array of 255 characters to name, guessing that the user will not enter this many characters. Although this is commonly seen in C/C++ programming, it is poor programming practice, because the array can hold only 254 characters plus the null terminator, and nothing is stopping the user from entering more than that (either unintentionally, or maliciously).

The recommended way of reading strings using cin is as follows:

This call to cin.getline() will read up to 254 characters into name (leaving room for the null terminator!). Any excess characters are not stored in the array: they remain in the input stream, and the stream is put into a failed state. In this way, we guarantee that we will not overflow the array!

Manipulating C-style strings

C++ provides many functions to manipulate C-style strings as part of the <cstring> library. Here are a few of the most useful:

strcpy() allows you to copy a string to another string. More commonly, this is used to assign a value to a string:

However, strcpy() can easily cause array overflows if you’re not careful! In the following program, dest isn’t big enough to hold the entire string, so array overflow results.

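Some compilers offer strcpy_s(), which adds a parameter that defines the size of the destination. However, strcpy_s() is not part of standard C++: it comes from an optional annex of the C11 standard, not all compilers support it, and some require you to define __STDC_WANT_LIB_EXT1__ with integer value 1 before you can use it. std::strcpy() itself is not deprecated by the C++ standard; Visual Studio marks it as deprecated, but that is specific to that compiler. A portable alternative is std::strncpy(), which takes a maximum number of characters to copy, although it does not add a null terminator when the source is too long.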

Another useful function is the strlen() function, which returns the length of the C-style string (without the null terminator).

The above example prints:

My name is: Alex
Alex has 4 letters.
Alex has 20 characters in the array.

Note the difference between strlen() and sizeof(). strlen() returns the number of characters before the null terminator, whereas sizeof() returns the size of the entire array, regardless of what’s in it.

Other useful functions:
strcat() -- Appends one string to another (dangerous)
strncat() -- Appends one string to another (appending at most a given number of characters)
strcmp() -- Compare two strings (returns 0 if equal)
strncmp() -- Compare two strings up to a specific number of characters (returns 0 if equal)

Here’s an example program using some of the concepts in this lesson:

Don’t use C-style strings

It is important to know about C-style strings because they are used in a lot of code. However, now that we’ve explained how they work, we’re going to recommend that you avoid them altogether whenever possible! Unless you have a specific, compelling reason to use C-style strings, use std::string (defined in the <string> header) instead. std::string is easier, safer, and more flexible.

Rule: Use std::string instead of C-style strings

115 comments to 6.6 — C-style strings

  • Hi Alex!

    About @strcpy_s. It's not a standard feature of C++ nor is @std::strcpy deprecated. Visual Studio might complain about its use, but that's not reason enough to drop cross-compiler compatibility. I think the mention of @strcpy_s should be removed in favor of @std::strncpy, which is not implementation specific and offers at least some of the safety of @strcpy_s.
    I saw you posting about using @std::string rather than using C-String manipulation. Please don't remove the paragraph in question, C-String manipulation is still relevant.

    References
    @std::strncpy is standard - N4762 § 20.5.3
    @strcpy_s is implementation-defined - N4762 § 15.5.1.2 (9)

    • Alex

      Thanks for the feedback as always. I saw that Visual Studio had marked std::strcpy as deprecated and assumed it was -- but it turns out this is a Visual Studio specific thing. Thanks Microsoft!

      I'll add this to my to-do to come back and clean up. This article is overdue for a revision anyway.

  • Kio

    Maybe worth adding, @Alex: when using strncpy_s, the destination array needs enough space for the copied characters plus the '\0'.

  • Nguyen

    In the above program, we’ve allocated an array of 255 characters to name, guessing that the user will not enter this many characters.

    Even if the user does not enter this many characters, it can produce undesirable output.  For example, I enter: AAA Nguyen (less than 255 characters), it will print out AAA.

    • Hi Nguyen!

      That's the expected behavior.
      For one, you're not initializing @name, so whether or not there is anything more to output is undefined. Furthermore, @std::basic_istream::operator>> inserts a null terminator at name[3] (for input "AAA"), which tells @std::basic_ostream::operator<< to stop printing.
      If you want to print the full array you need to fill it with some data and override the null terminator.

      It'd be better to call @std::fill after we know how long the string input by the user is. But I feel like this is easier to understand.

  • Nguyen

    Hi

    "If you accidentally overwrite the null terminator in a string, you’ll not only get all the characters in the string, but std::cout will just keep printing everything in adjacent memory slots until it happens to hit a 0!"

    I wanted to see what std::cout would just keep printing everything in adjacent memory slots....

    Output:

    Enter your name: Nguyen
    You entered: Nguyen
    Press any key to continue . . .

    OK, I got all the characters I had entered but there was nothing in adjacent memory slots.  Could you please explain?

    Thanks, have a great day.

    • Hi Nguyen!

      @name contained "Ng". "uyen" was in trailing memory which should've been inaccessible. @std::basic_istream::operator>> places a null-terminator, which you didn't override.

      When built and run in debug mode, this should print

  • seriouslysupersonic

    Hi!

    Congratulations for the very well-written tutorials!
    I have a few questions regarding overflowing char arrays with strcpy. I am using g++ version 7.3.0 with -std=c++17 (-Wall -Wextra -Werror -pedantic) and the following program

    results in

    source        : supercalifragilisticexpialidocious

    strcpy()
    source        : ragiliagilisticexpialidocious
    destination    : superragiliagilisticexpialidocious

    While debugging
    @line 7  - (gdb) x source      > 0x7fff2a01f970: 0x65707573
    @line 12 - (gdb) x destination    > 0x7fff2a01f96b: 0x00555d89
    @line 15 - (gdb) x source      > 0x7fff2a01f970: 0x69676172
    @line 15 - (gdb) x destination    > 0x7fff2a01f970: 0x65707573

    1) So I assume strcpy is changing destination to &source[0]. What I am not sure is why is the destination output correct at the end. Is it because std::cout will only stop outputting when it encounters '\0', and because @source is null terminated and strcpy only changed where @destination is pointing to, it will eventually hit the region in memory where '\0' was written to when initializing @source?

    2) If this is the case, the @destination size should still be @tinyBuffer but destination[tinyBuffer - 1] == 'r'?

    3) Why does &destination[0] change?

    • Hi!

      > strcpy
      Use @std::strcpy. @strcpy isn't guaranteed to be declared in the global namespace.

      > So I assume strcpy is changing destination to &source[0]
      No, @std::strcpy is not copying pointers, it's copying the stored chars.
      Also,
      &source[0] == &(source[0]) == source

      > Is it because std::cout will only stop outputting when it encounters '\0'
      Yes. Either that or your program crashes, because @std::cout is trying to access invalid memory.

      > strcpy only changed where @destination is pointing to
      No.

      2) @tinyBuffer is the size of the memory that has been reserved for @destination. This doesn't necessarily mean that there's no memory behind it. But the memory behind it isn't owned by @tinyBuffer, you're overriding memory which is not supposed to be accessible through @destination. In your case, the memory behind @destination is memory owned by @source. When you printed @source after the call to @std::strcpy, @source started with "califrag[...]". The first 5 characters went missing. That's because @destination was located in memory before @source and the "califrag[...]" you're seeing is the "califrag[..]" that's part of @destination.

      3) I don't know what gdb did there. The addresses @source and @destination are pointing to don't change. You can try this by printing them from the program itself.

  • Jack

    I've always seen

    as opposed to

    What are the differences between the two?

    Also, when concatenating strings or copying strings, would the following be a safe method? I was thinking about it and wanted a way so that if I change the size of dest (see below code) then the remaining room is automatically updated, meaning I do not have to change it everywhere. The below works, but is it good practice?

    Changing dest to be a larger value (12, in the below case) means more characters can be concatenated to dest:

    Changing dest so that it only has room for "Hello, " plus one extra character (terminating character) means that nothing is concatenated to the string:

    • Alex

      strncpy copies n chars from the source into the destination. strcpy_s copies all characters (including the null terminator) from the source into the destination (the size here is used to indicate the size of the destination, not the number of chars to copy).

      But really, you shouldn't be manipulating C-style strings at all -- if you need to do string manipulation, use std::string. It's much safer.

  • ApoRed

    Can you declare a C-Style string like this?

  • Pashka2107

    Hi Alex,
    I have one question.
    Consider following lines:

    I have written a function which transforms one char* into another (the main point is that this char* changes). When I declared str as in the first line and passed it to that function, I got an access violation error. It confused me a lot. I then tried declaring str as in the second line, and it compiled just fine!

    It looks like I've violated constness of something, but I have no idea, constness of what. So
    where is the difference between this line

    and this one?

    • Alex

      char *str is a pointer to a string literal. String literals are generally treated as const, and are often placed in read-only memory. If you try to change one of these, you'll likely get an access violation.
      char str[] = "string" is a char array in normal memory that is initialized with the string "string". Because this resides in normal memory, you can modify it as you see fit.

  • Tommy Gaudreau

    In this section you're using sizeof to get the length of an array, but isn't sizeof supposed to return the size of the array instead? weren't we supposed to do "sizeof(array) / sizeof(array[0])"?

    • Alex

      Yes, but this array is a char array, and sizeof(char) is guaranteed to be 1. Therefore, sizeof(array[0]) must be 1 as well, making the divisor of the equation unnecessary. Put another way, with char arrays, the size and the length are identical.

      However, this isn't obvious, and it would cause the code to break if we ever changed the array type, so I've added the divisor in as a good defensive coding mechanism.

  • Lamont Peterson

    Alex,

    I've found several references online which state that strcpy_s () (and all the other *_s () functions) are a Microsoft only extension and are not available anywhere else.  Indeed, many people seem to be saying that it's a bad idea to use them, period, though, I rather think that statement to be a bit harsh since it doesn't include a proper context.  I think it should be, "It is un-wise to use strcpy_s () or other *_s () functions as their use is a non-portable practice."

    I wanted to share this with you, since throughout your lessons, you have done a really good job of keeping things portable and I felt that you would want to know and perhaps do something different in there about that.

  • Max Red

    Here:

    # One important point to note is that C-style strings follow all the same rules as arrays. This means you can
    # initialize the string upon creation, but you can not assign values to it using the assignment operator after
    # that!
    #
    # char myString[] = "string"; // ok
    # myString = "rope"; // not ok!
    #
    # This would be the conceptual equivalent of the following nonsensical example:
    #
    # int array[] = { 3, 5, 7, 9 }; // ok
    # array = 8; // what does this mean?

    I think the last line should be:

    # array = {8}

    To make the two lines be "conceptually equivalent".

    • Alex

      Agreed, and upon reflection, I've removed the paragraph, because with std::array and std::vector, you can assign values to arrays just like that and it works fine... I appreciate the feedback.

  • Khang

    Hi Alex
    Shouldn't mystring be replaced with myString? I feel it will be better with the suggestion in 1.10b

  • Nguyen

    Hi Alex,

    Use std::getline() to input text

    To read a full line of input into a string, we’re better off using the std::getline() function instead. std::getline() takes two parameters: the first is std::cin, and the second is our string variable.

    Here’s the same program as above using

    std::cout << "Enter your full name: ";
    std::string name;
    std::getline(std::cin, name); // read a full line of text into name
    std::cout << "Your name is " << name;

    Looks like we don't have to use std::getline() to read a full line of text when using C-style string???

    Thanks, Have a great day

  • antiriad7

    Hello, I have several questions.
    1)Is there any differences between C-style strings and char arrays? If not, can I use these <cstring> functions on other type arrays?
    2)"When declaring strings in this manner, it is a good idea to use [] and let the compiler calculate the length of the array. That way if you change the string later, you won’t have to manually adjust the array length." Does this mean that a C-style string is a dynamic width array?
    3)In this lesson, we didn't cover the syntax of strcpy_s. Are we gonna cover it later?
    4)"This call to cin.getline() will read up to 254 characters into name (leaving room for the null terminator!). Any excess characters will be discarded. In this way, we guarantee that we will not overflow the array!" Will these characters be discarded or buffered?
    5)What is syntax of getline()? We used it in different ways with different parameter types.

    Thank you for amazing tutorials once again!

    • Alex

      1) C-style strings are null terminated, char arrays are not necessarily. You can't use the cstring functions on other types of arrays as far as I'm aware.
      2) No, it's still a fixed width array, you're just enlisting the compiler's help to determine how long the array should be. A dynamic array would be one whose length is determined at runtime.
      3) I don't think I have a tutorial on this, because you're better off not using these functions at all (and using std::string instead, which I do cover in a lot more detail)
      4) I believe it actually buffers excess chars in the stream so you can extract them later. I'll update the text.
      5) There are two different getline functions, one part of std:: and one part of std::istream (accessible via std::cin). It's confusing, especially since we haven't covered classes at this point in the tutorials.

  • Hi again. Quick question about your example program which counts the number of spaces. Given the code

    I get this compiler warning

    Could this just be because in Visual Studio 2017 C-style strings are generally a bad idea, or have I missed something obvious when I was typing in the code?

    • Alex

      strlen returns an unsigned integer, but your loop variable is signed. The compiler is warning you about the mismatch. Since buffer is limited to 255 chars, this shouldn't be a problem, but if you want to make the warning go away, make index an unsigned int.

      • Jon uk

        Ah that makes sense now, thank you, I should've looked up the full description for strlen, I try to get rid of errors and warnings straight away because I have found that in doing so, I learn a little bit more about the programming language.

  • jenifer

    Output:
    hi

    Output:
    hi\0Alex
    hi\0Alex
    I expected the second code snippet to print "hi".

    • Alex

      In the top section, the characters '\0' are being interpreted by the compiler as the single character with ASCII code 0. This null terminates the string at this point, so nothing after that character prints.

      With scanf, the '\' and '0' characters are being treated literally, as separate characters. So they print literally.

  • C++

    What do u mean by "an assert" in the line "an assert will be thrown in debug mode"?
    Is It warning or something else?

  • Alex

    So I accidentally used the wrong format for strcpy_s and did not include the size of the destination. My code still copied the entirety of "Copy this!" It looks like the size is optional. It looks like the compiler is going off of the size 50 I gave it above, since the string will overflow if I reduce the size in the declaration.

    • Alex

      Side-note, I realized I included <string> instead of <cstring>. It looks like strcpy_s() actually lives in the <iostream> header!

    • Alex

      Another note, strcpy_s() will shoot me an error message if I do not define a size in the declaration (ie. char dest[ ]). Since that's the case, what is the use of including the destination size?

      It doesn't seem to make any difference. The above code will place "Copy this!" in a size 50 array. If I change that to a small number like 1, and then include a destination size of 1, my program crashes and if I do the following, it still crashes as it'll continue to take the size of 1 instead of 50. Maybe I'm just unclear as to what that middle integer does in strcpy_s().

      • Alex

        The middle argument to strcpy_s is the size of the destination buffer. strcpy_s won't copy more than this even if the source string is longer, to prevent buffer overflow.

  • Soumil

    Hi!

    This gives the following compilation error:
    prog.cpp: In function ‘int main()’:
    prog.cpp:7:29: error: ‘strcpy_s’ was not declared in this scope
         strcpy_s(dest, 5, source); // An assert will be thrown in debug mode

    even though I'm using a compiler following C++14 guidelines. Help!

    • Alex

      It looks like some compilers require you to define __STDC_WANT_LIB_EXT1__ with integer value 1 to use strcpy_s. Other compilers simply don't support it. I've updated the lesson to indicate this.

  • Michael

    Hello.
    You said: "In the above program, we’ve allocated an array of 255 characters to name, guessing that the user will not enter this many characters. Although this is commonly seen in C/C++ programming, it is poor programming practice, because nothing is stopping the user from entering more than 255 characters (either unintentionally, or maliciously)."

    So we've found a solution not to overflow and to force the user to insert a string with MAX LENGTH = 255

    My question is: is there a way to give our user no-characters limit? I mean, how many he wants!

    Thanks in advance!

    • Alex

      There are ways to do this, but most of them are a pain. Much easier is to use std::string, which will handle this kind of thing for you (and resize itself to hold whatever the user enters).

      • Michael

        Thanks for the reply.
        I know I can do it with std::string, but I need to be addressed to a way to do it from 'scratch' since I'm making my own string class and I'm struggling with the operator>> overload!

        • Alex

          One way: Start with a small dynamically allocated char array of size 255. If the user fills the array, then allocate a larger array and copy the data from the smaller array into the larger array. Do this as many times as necessary. Then allocate a string of exactly the correct length, and copy the string into that.

          There may be a smarter way to do this via the C++ streams, but C++ I/O isn't my specialty.

  • Luat

    If I'm not wrong, at line 8 and 9 of the last piece of code (which counts the number of space character in string entered by user), it should be "std::cout" (line 8) and "std::cin" (line 9) instead of "cout" and "cin".

  • Avencherus

    Small typo in the middle.  This should be "string" rather than "spring".  X)

    This program prints:

    spring

    • Alex

      No, it's correct as written.

      This line:

      overwrites the 't' with a 'p', changing "string" to "spring".
