9.12 — C-style string symbolic constants

In a previous lesson, we discussed how you could create and initialize a C-style string, like this:

C++ also supports a way to create C-style string symbolic constants using pointers:

While the two programs above behave identically and produce the same results, C++ handles the memory allocation for each slightly differently.

In the fixed array case, the program allocates memory for a fixed array of length 5, and initializes that memory with the string “Alex\0”. Because memory has been specifically allocated for the array, you’re free to alter the contents of the array. The array itself is treated as a normal local variable, so when the array goes out of scope, the memory used by the array is freed up for other uses.

In the symbolic constant case, how the compiler handles this is implementation-defined. What usually happens is that the compiler places the string “Alex\0” into read-only memory somewhere, and then sets the pointer to point to it. Because this memory may be read-only, and because modifying a string literal is undefined behavior, best practice is to declare the pointer as const char* so the string cannot be modified through it.

For optimization purposes, multiple string literals may be consolidated into a single value. For example:

These are two different string literals with the same value. The compiler may opt to combine these into a single shared string literal, with both name1 and name2 pointing to the same address. Thus, if name1 were not const, making a change through name1 could also affect name2 (which might not be expected).

As a result of string literals being stored in a fixed location in memory, string literals have static duration rather than automatic duration (that is, they die at the end of the program, not the end of the block in which they are defined). That means that when we use string literals, we don’t have to worry about scoping issues. Thus, the following is okay:

In the above code, getName() will return a pointer to the C-style string “Alex”. If this function were returning any other local variable by address, the variable would be destroyed at the end of getName(), and we’d return a dangling pointer back to the caller. However, because string literals have static duration, “Alex” will not be destroyed when getName() terminates, so the caller can still successfully access it.

C-style strings are used in a lot of old or low-level code, because they have a very small memory footprint. Modern code should favor the use of std::string and std::string_view, as those provide safe and easy access to the string.

std::cout and char pointers

At this point, you may have noticed something interesting about the way std::cout handles pointers of different types.

Consider the following example:

On the author’s machine, this printed:

003AF738
Hello!
Alex

Why did the int array print an address, while the character arrays printed strings?

The answer is that std::cout makes some assumptions about your intent. If you pass it a non-char pointer, it will simply print the contents of that pointer (the address that the pointer is holding). However, if you pass it an object of type char* or const char*, it will assume you’re intending to print a string. Consequently, instead of printing the pointer’s value, it will print the string being pointed to instead!

While this is great 99% of the time, it can lead to unexpected results. Consider the following case:

In this case, the programmer is intending to print the address of variable c. However, &c has type char*, so std::cout tries to print this as a string! On the author’s machine, this printed:

Q╠╠╠╠╜╡4;¿■A

Why did it do this? Well, it assumed &c (which has type char*) was a string. So it printed the ‘Q’, and then kept going. Next in memory was a bunch of garbage. Eventually, it ran into some memory holding a 0 value, which it interpreted as a null terminator, so it stopped. What you see may be different depending on what’s in memory after variable c.

This case is somewhat unlikely to occur in real life (as you’re not likely to actually want to print memory addresses), but it is illustrative of how things work under the hood, and how programs can inadvertently go off the rails.


185 comments to 9.12 — C-style string symbolic constants

  • EternalSkid

    Hello Alex and nascardriver, there's a part I do not understand about this code. Are you able to explain this to me? Much appreciated.

    • nascardriver

      `myName` points to a constant char, the pointer is non-const.

  • koe

    I'm still a bit confused by this, but it seems that to have a global symbolic C-style string constant, you need to combine constexpr and const like this:

    Not including the 'constexpr' causes linker errors if your symbolic string is in a header that gets included more than once.

    I am using these constants to get directories from CMake for resource files, in a 'project_dir_config.h.in' generator header.

    • nascardriver

      The `constexpr` applies to the `char*`. This is a const pointer to a non-const `char`. It can't be used for string literals.
      Marking what applies to what with braces

      Use `std::string_view` for string literals

  • Waldo Lemmer

    Wow, this lesson was very interesting. Thanks for teaching us about the inner workings of c++ :)

    > In the lesson 6.6 -- C-style strings, we discussed how you could create and initialize a C-style string, like this:

    This link should be updated, it should be:

    Instead of

    > string literals have static duration rather than automatic duration (that is, they die at the end of the program, not the end of the block in which they are defined)

    Does this mean a program with more string literals will consume more memory, even if those string literals are only used once?

    • nascardriver

      String literals reside in the binary. The entire binary gets loaded into RAM when you run your program. Using a string literal doesn't load or copy the string literal, it's just accessing something that's already there. It doesn't matter if you access a string literal 0 or 100 times. The more string literals you have, the more memory you need to store them.

      If you have the same string literal multiple times, eg.

      It's the compiler's choice to use a single string literal or generate one for every appearance. This example could result in anything from 1 to 5 literals being stored.

  • Patrick

    So I was playing around a bit with C-style string symbolic constants.
    Just to check my understanding, for this code -

    I'm thinking that since "Hello" is of type const char [6] (as shown in Visual Studio), it's an array. Now, because of that, I'm thinking that "Hello" decays to a pointer that points to 'H', and this pointer's type would be const char*. Therefore, strPtr is a pointer that stores the pointer to 'H'. Is all of that correct?

    Alright, sorry for all the questions and this being quite long, but I'm curious about one more thing.
    At the very beginning, when you guys introduced C-style strings, the syntax looked like this:

    Based on my previous understanding of how "Hello" decays to a pointer that points to 'H', I'm thinking that the code will look like this:

    which makes no sense to me. Maybe I'm misunderstanding how that syntax really works.

    • nascardriver

      `strPointer` is a pointer that holds the address of the `H` of the "Hello" array.

      `str` is an array, it only decays when you use it.

      • Patrick

        Ok, thanks for the reply. For my second question, I was thinking that the string literal "Hello" in

        decayed to the pointer that points to 'H'. Thus, if this were the case, how does C++ initialize the char array with a pointer to the first element?

        This other syntax makes complete sense to me:

  • Patrick

    For a regular C-style string such as:

    When we try to std::cout it,

    Will the string decay to the pointer of the first element, 'h'? And from there, since that pointer's type is char*, std::cout just prints the whole string?
