6.8b — C-style string symbolic constants

C-style string symbolic constants

In lesson 6.6 -- C-style strings, we discussed how to create and initialize a C-style string, like this:

C++ also supports a way to create C-style string symbolic constants using pointers:

Although the two programs above behave identically and produce the same results, C++ handles the memory allocation for each slightly differently.

In the fixed array case, the program allocates memory for a fixed array of length 5, and initializes that memory with the string “Alex\0”. Because memory has been specifically allocated for the array, you’re free to alter the contents of the array. The array itself is treated as a normal local variable, so when the array goes out of scope, the memory used by the array is freed up for other uses.

In the symbolic constant case, how the compiler handles the string literal is implementation-defined. What usually happens is that the compiler places the string “Alex\0” into read-only memory somewhere, and then sets the pointer to point to it. Because this memory may be read-only, and because modifying a string literal is undefined behavior, best practice is to declare the pointer as a pointer to const char.

For optimization purposes, multiple string literals may be consolidated into a single value. For example:

These are two different string literals with the same value. The compiler may opt to combine these into a single shared string literal, with both name1 and name2 pointing to the same address. Thus, if name1 were not const, making a change through name1 could also impact name2 (which might not be expected).

As a result of string literals being stored in a fixed location in memory, string literals have static duration rather than automatic duration (that is, they die at the end of the program, not the end of the block in which they are defined). That means that when we use string literals, we don’t have to worry about scoping issues. Thus, the following is okay:

In the above code, getName() will return a pointer to the C-style string “Alex”. If this function were returning a local variable by address, the variable would be destroyed at the end of getName(), and we’d return a dangling pointer back to the caller. However, because string literals have static duration, “Alex” will not be destroyed when getName() terminates, so the caller can still successfully access it.

To summarize, use a non-const char array when you need a string variable that you can modify later. Use a pointer to a const string literal when you need a read-only string literal.

Rule: Feel free to use C-style string symbolic constants if you need read-only strings in your program, but always make them const!

std::cout and char pointers

At this point, you may have noticed something interesting about the way std::cout handles pointers of different types.

Consider the following example:

On the author’s machine, this printed:

003AF738
Hello!
Alex

Why did the int array print an address, but the character arrays printed strings?

The answer is that std::cout makes some assumptions about your intent. If you pass it a non-char pointer, it will simply print the contents of that pointer (the address that the pointer is holding). However, if you pass it an object of type char* or const char*, it will assume you’re intending to print a string. Consequently, instead of printing the pointer’s value, it will print the string being pointed to instead!

While this is great 99% of the time, it can lead to unexpected results. Consider the following case, in which a char variable c is initialized with 'Q' and the program then executes std::cout << &c;

In this case, the programmer is intending to print the address of variable c. However, &c has type char*, so std::cout tries to print this as a string! On the author’s machine, this printed:

Q╠╠╠╠╜╡4;¿■A

Why did it do this? Well, it assumed &c (which has type char*) was a string. So it printed the ‘Q’, and then kept going. Next in memory was a bunch of garbage. Eventually, it ran into some memory holding a 0 value, which it interpreted as a null terminator, so it stopped. What you see may be different depending on what’s in memory after variable c.

This case is somewhat unlikely to occur in real life (as you’re not likely to actually want to print memory addresses), but it is illustrative of how things work under the hood, and how programs can inadvertently go off the rails.

124 comments to 6.8b — C-style string symbolic constants

  • Liam

    Hello,

    When I'm trying to create a symbolic constant using constexpr, like so:

    I'm getting a compiler warning:
    "warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]"

    While using const instead doesn't trigger a warning. Why so?

    • Hi Liam!

      Here comes the slightly confusing part, constexpr refers to the pointer, not the elements.

      • Liam

        Thank you nascardriver!

        I would recommend to include this part in the tutorial since I would never have guessed this without your explanation! First, I didn't realize how to create a const pointer, instead of a pointer to a const (I just found out that this is covered in a future lesson, that I didn't go through yet). Second, now I see that the constexpr pointer doesn't even follow that syntax and worth separate explanation (took some head-scratching to figure out what's going on, must be worth mentioning constexpr in the 6.10 "Pointers and const" lesson).

  • Nguyen

    Hi nascardriver,

    Unlike 6.6 - C-style strings, C-style string symbolic constants using pointers can not have value based on user input.  
    Is it right?

    Thanks.

  • I'm not sure, but shouldn't "Use a const pointer to a string literal" be "Use a pointer to a const string literal"? The pointer can be changed to point to other read-only strings, but the string literal itself can't be changed.

  • Ishak

    I can do this:

    but I can't do this? If so, why?

    [code]const int *value = 5;[/code]

    as always thanks for the help

    • nascardriver

      Hi Ishak!

      5 is neither a pointer nor a string. You need to either add quotation marks or a cast.

      • Conor

        This still doesn't make sense to me. This is a pointer declaration/assignment, but it's being assigned a string value. Should it not be something like &someChar, previously defined?

        • Whenever you write a string literal such as "Alex" in C++, it's an array of const char that decays to a const char *, not an @std::string. @std::string just has a constructor that accepts a const char *, so you can use a const char * in places where an @std::string is required.

        • Conor McGuigan

          I think I understand it now ("Alex" is placed in memory and myName points to it); it's just confusing syntax. Why doesn't "const int *myNumber = 5" work? It's the same concept. Does the compiler not put the value 5 in some memory location, with myNumber pointing to it as in the previous case?

          • The type of "Alex" is const char * (after array-to-pointer decay).
            The type of 5 is int.

            You can store a const char * in a const char *, but you cannot store an int in a const int *.

            • Conor McGuigan

              I did a bit of messing around and I'm starting to get it a bit better. I just don't like the varying consistency here, where you use pointer declaration/initialisation in a different context as well as the strange behaviour when you want to print the address of a char*. but it's something you won't be doing regularly so I can not get too annoyed about it. Thanks for replies

              • Keep in mind that in most expressions an array decays to a pointer to its first element.
                This syntax might suit you better

                @std::cout was programmed to print the array value rather than the address when it's passed a const char *. You can print the address by casting the const char * to a const void * or using @std::printf.

  • Xin Shi

    hi, Alex
    How can I get the address of a char variable when I really want to do that?

    • nascardriver

      Hi Xin Shi!

  • John Kennedy

    "In the above code, getName() will return a pointer to C-style string “Alex”. This is okay since “Alex” will not go out of scope when getName() terminates, so the caller can still successfully access it."

    I don't get those lines. What do you mean by saying ""Alex" will not go out of scope"?

    • Alex

      When returning a pointer (or reference) from a function, you have to be careful that the object being pointed to (or referenced) does not get destroyed when the function ends. For example:

      This is bad. x is a local variable, and it will get destroyed at the end of the function. The pointer passed back to the caller will be left dangling, and accessing it will result in undefined behavior.

      However, because string literals have special handling, they aren't destroyed at the end of the function. So it's safe to return pointers to them.

  • Peter Baum

    You might consider making the first section "C-style string symbolic constants" clearer by separating the two issues:

    1. using pointers or using an explicit array declaration
    2. the use of const or not

    Because of the previous lesson, I was thinking mostly about 1. and had to go back and discovered that the topic was actually mostly about 2.

    Overall... doing a great job explaining pointers.

    • Alex

      The two considerations are linked:
      * If you use the pointer method, you should always use const
      * If you use the array method, you choose whether to const or not (but if you're going to const, then you might as well use the pointer method)

      I'll add a summary to the end of the section to make it more clear what the takeaways are. Appreciate the feedback!

  • ASP

    A small typo:
    Thus, if name1 (were) not const, making a change to name1 could also impact name2.
    were -> was

  • erad

    Hi Alex,

    I am sorely confused about this segment in this lesson!:

    "For optimization purposes, multiple string literals with the same content may point to the same location. For example:

    These are two different string literals with the same value."

    I thought the string literals are the quoted expressions(i.e. "Alex") on the right of the assignment operator while the objects doing the 'pointing' (i.e. pointers name1 and name2) are the ones on the left.

    1. So why did you write that "multiple string literals ... may point to ... "? String literals are essentially 'values' which theoretically do not point to other objects, right?

    2. With respect to the last line of the quote, did you mean VARIABLES in place of the LITERALS you wrote? Because then it makes more logical sense that the VARIABLES are "different" but could contain the same value. In my view, 'literal' and 'value' are somewhat synonymous. Am I wrong?

  • Alex

    Assuming c is a char, &c will get the address of the char. It's just that you can't print this, because it'll print as a string rather than an address. If you want to print the address, you could try casting it to a void pointer and see if that works -- though if so, I'm not sure whether that behavior is standard across all compilers.

  • Joe

    Hi Alex,

    I think there is a typo in the below lines.

    const char name1 = "Alex";
    const char name2 = "Alex";

    Aren't above lines supposed to be

    const char name1[] = "Alex";
    const char name2[] = "Alex";

  • Kushagra

    Please explain this

    In the symbolic constant case, how the compiler handles this is implementation defined. What usually happens is that the compiler places the string “Alex\0” into read-only memory somewhere, and then sets the pointer to point to it. Multiple string literals with the same content may point to the same location. Because this memory may be read-only, and because making a change to a string literal may impact other uses of that literal, best practice is to make sure the string is const.

    • Alex

      What part of the paragraph do you need clarification on? There's a lot going on there. 🙂

      • Kushagra

        From multiple string literals.

        • Alex

          Gotcha. By multiple string literals, I mean something like this:

          These are two different string literals with the same value. The compiler may opt to combine these into a single shared string literal, with both name1 and name2 pointed at the same address.

  • Lim Che Ling

    Hi, can you enlighten me on this stupid question?
    I can do this:
        char str[] = "myString\n";
        char* ptr = str;
        cout << ptr;  //print 'myString'

    But when I do this, I get the warning "ISO C++11 does not allow conversion from string literal to 'char*'":
       char *ptr ="myString";

    Why is it so?

  • Astronoid

    Hey Alex,
    Thank you for your efforts; I am confused about many points in that lesson.

    The first one is why the compiler treats "&c" as a string, not a pointer; the second one is how to get the memory address of that pointer. Thanks.

    • Alex

      std::cout was designed to treat objects of type (const) char* as strings since this is the native type of C-style string literals. Most of the time, this is what we want. Occasionally it causes issues.
      I'm not sure what you actually mean by "get the memory address of that pointer". &c returns a pointer to variable c, but it doesn't have an address itself since it's an r-value.

      • Harshit

        I also have the same question.
        How can we get the address of the char variable c, then?

        • Rex Lucas

          Hi Harshit
          I had the same question in sect 6.7. If you want to use cout then Alex said "cast to void*" ie
