
6.8a — Pointer arithmetic and array indexing

Pointer arithmetic

The C++ language allows you to perform integer addition or subtraction operations on pointers. If ptr points to an integer, ptr + 1 is the address of the next integer in memory after ptr. ptr - 1 is the address of the previous integer before ptr.

Note that ptr + 1 does not return the memory address after ptr, but the memory address of the next object of the type that ptr points to. If ptr points to an integer (assuming 4 bytes), ptr + 3 means 3 integers (12 bytes) after ptr. If ptr points to a char, which is always 1 byte, ptr + 3 means 3 chars (3 bytes) after ptr.

When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.

Consider the following program:

On the author’s machine, this output:

0012FF7C
0012FF80
0012FF84
0012FF88

As you can see, each of these addresses differs by 4 (7C + 4 = 80 in hexadecimal). This is because an integer is 4 bytes on the author’s machine.

The same program using short instead of int:

On the author’s machine, this output:

0012FF7C
0012FF7E
0012FF80
0012FF82

Because a short is 2 bytes, each address differs by 2.

Arrays are laid out sequentially in memory

By using the address-of operator (&), we can determine that arrays are laid out sequentially in memory. That is, elements 0, 1, 2, … are all adjacent to each other, in order.

On the author’s machine, this printed:

Element 0 is at address: 0041FE9C
Element 1 is at address: 0041FEA0
Element 2 is at address: 0041FEA4
Element 3 is at address: 0041FEA8

Note that each of these memory addresses is 4 bytes apart, which is the size of an integer on the author’s machine.

Pointer arithmetic, arrays, and the magic behind indexing

In the section above, you learned that arrays are laid out in memory sequentially.

In lesson 6.8 -- Pointers and arrays, you learned that a fixed array can decay into a pointer that points to the first element (element 0) of the array.

Also in a section above, you learned that adding 1 to a pointer returns the memory address of the next object of that type in memory.

Therefore, we might conclude that adding 1 to an array should produce a pointer to the second element (element 1) of the array. We can verify experimentally that this is true:

Note that when dereferencing the result of pointer arithmetic, parentheses are necessary to ensure the operator precedence is correct, since operator * has higher precedence than operator +.

On the author’s machine, this printed:

0017FB80
0017FB80
7
7

It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and dereference! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parentheses).

Using a pointer to iterate through an array

We can use a pointer and pointer arithmetic to loop through an array. Although not commonly done this way (using subscripts is generally easier to read and less error prone), the following example goes to show it is possible:

How does it work? This program uses a pointer to step through each of the elements in an array. Remember that arrays decay to pointers to the first element of the array. So by initializing ptr with name, ptr will also point to the first element of the array. Each element is dereferenced by the switch expression, and if the element is a vowel, numVowels is incremented. Then the for loop uses the ++ operator to advance the pointer to the next character in the array. The for loop terminates when all characters have been examined.

The above program produces the result:

Mollie has 3 vowels

66 comments to 6.8a — Pointer arithmetic and array indexing

  • Russell

    Minor update to the example of iterating through an array using pointers:

    to

    .

  • p.zekri

    hello
    I have a question
    How can I write C++ code that includes a function whose return type is a pointer to an element of an array?

  • Advokat Hadzitonic

    Alex, how do you output addresses in decimal notation ?

    • Advokat Hadzitonic

      What I meant by that is: how do you output a pointer variable’s value (a memory address) in decimal notation, rather than in hex, which is the default, as in this example?
      I tried std::cout<<std::dec; but it doesn’t work with pointers.

      #include <iostream>

      int main()
      {
          int value = 7;
          int *ptr = &value;

          std::cout << ptr << '\n';
          std::cout << ptr+1 << '\n';
          std::cout << ptr+2 << '\n';
          std::cout << ptr+3 << '\n';

          return 0;
      }

    • Alex

      I’m not sure what the best way to do this would be. Why would you want to do this anyway? 🙂

      • Advokat Hadzitonic

        Just for readability. It’s easier to compare addresses in decimal notation for me. In C  printf("adress is %u",ptr); would print in decimal by default. It would be in decimal even if you put %d instead of %u.

        • Advokat Hadzitonic

          I found a solution on: http://en.cppreference.com/w/cpp/language/reinterpret_cast

          Code will look like this:
          #include <cstdint>  // for std::uintptr_t
          #include <iostream>

          int main()
          {
              int value = 7;
              int *ptr = &value;

              std::cout << std::uintptr_t(ptr) << '\n';
              std::cout << std::uintptr_t(ptr+1) << '\n';
              std::cout << std::uintptr_t(ptr+2) << '\n';
              std::cout << std::uintptr_t(ptr+3) << '\n';

              return 0;
          }

          And the output would be in decimal:

          6946552
          6946556
          6946560
          6946564

          And now you can clearly see, by looking at the last two digits (the ones and tens places) of each address, that incrementing a pointer to int by one increases the address by 4 bytes. That was my point 🙂

        • Alex

          You can still use printf() in C++. It’s part of the <cstdio> header. So if that method works, go for it.
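          One hedged way to do that portably: %u expects an unsigned int rather than a pointer, so this sketch first converts the pointer to std::uintptr_t and prints it with the PRIuPTR macro from <cinttypes>. The helper asNumber and the array values are illustrative names, not part of the lesson.

          ```cpp
          #include <cinttypes> // PRIuPTR
          #include <cstdint>   // std::uintptr_t
          #include <cstdio>    // std::printf

          // Hypothetical helper: converts an address to an unsigned integer.
          std::uintptr_t asNumber(const void *address)
          {
              return reinterpret_cast<std::uintptr_t>(address);
          }

          int main()
          {
              int values[2] = { 7, 7 }; // illustrative array
              int *ptr = values;

              // PRIuPTR expands to the correct printf conversion for std::uintptr_t.
              std::printf("address is %" PRIuPTR "\n", asNumber(ptr));
              std::printf("address is %" PRIuPTR "\n", asNumber(ptr + 1));

              return 0;
          }
          ```

          On common byte-addressable platforms the two numbers differ by sizeof(int).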

  • Nurlan

    Hello
    Alex!
    I hope you are doing well! I have a question regarding the size of the pointer itself. The program below prints the size of a pointer on my 64-bit machine. Could you confirm that a pointer uses an 8-byte address regardless of the data type it points to, and that its size is always fixed? I am surprised that a pointer occupies 8 bytes when the value it points to may be only 2 bytes, so I find this a bit confusing.
    Thank you in advance!  
    #include <iostream>

    int main()
    {
        int value = 7;
        int *ptr = &value;

        std::cout << ptr << '\n';             // prints 0x23fe3c
        std::cout << sizeof(ptr) << '\n';     // 8
        std::cout << (ptr + 1) << '\n';       // 0x23fe40
        std::cout << sizeof(ptr + 1) << '\n'; // 8

        short ch(5);
        short *ptrch = &ch;
        std::cout << "pointer to short: " << sizeof(ptrch) << '\n'; // pointer to short: 8
        return 0;
    }

    • Alex

      A 64-bit application will typically have 8-byte pointers. Remember that most modern systems are byte-addressable. Those 8 bytes identify a specific byte address. For multi-byte types (basically everything that isn’t a char), the pointer will point to the start of the value, and the type tells the compiler how many bytes to read.

      So, for example, a pointer to a 4-byte integer would point to the first byte of the integer, and the compiler would know to interpret the 4 bytes starting there as the integer value.
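      A small sketch that separates the two sizes: sizeof applied to the pointer reports the width of the address, while sizeof applied to the pointed-to type reports how many bytes are read through it. The variable names are illustrative, and the numbers printed depend on the platform.

      ```cpp
      #include <iostream>

      int main()
      {
          short small = 5;
          int value = 7;
          double big = 1.5;

          short *ptrSmall = &small;
          int *ptrValue = &value;
          double *ptrBig = &big;

          // Size of the pointer itself (the address), then size of what it points to.
          std::cout << "short*:  " << sizeof(ptrSmall) << " / " << sizeof(*ptrSmall) << '\n';
          std::cout << "int*:    " << sizeof(ptrValue) << " / " << sizeof(*ptrValue) << '\n';
          std::cout << "double*: " << sizeof(ptrBig) << " / " << sizeof(*ptrBig) << '\n';

          return 0;
      }
      ```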

  • gprc

    One of your older comments says that: "if pointer is an integer pointer, then pointer + 1 will move the address by 4 bytes. If you then dereference that pointer, it will interpret the next 4 bytes starting at that address as an integer.".

    Testing in visual studio:

    and the results are somewhat confusing

    The number -858993460 (0xcccccccc in hex) has multiple occurrences here and it is, according to Wikipedia, a known "magic" number in Microsoft compilers: "Used by Microsoft’s C++ debugging runtime library and many DOS environments to mark uninitialized stack memory"

    So it seems that, at least in this case, the number is not the result of interpreting as an integer the arbitrary bits that happened to be there.

    Other numbers are clear too : 1, 2, 3 (elements of array, already valid integers)

    I don’t know how to interpret the rest of numbers. Are these valid integers? (from where?)

    • Alex

      > So it seems that, at least in this case, the number is not the result of interpreting as an integer the arbitrary bits that happened to be there.

      It is exactly that. How variables get laid out in the stack is up to the compiler. The compiler may opt to pad variables for performance reasons (which may account for the 0xcccccccc values). The other values are other bits of data in the stack (other local variables, function parameters, function calls, etc…) being interpreted as an integer.

      You’re essentially asking random memory addresses to print their results as integers. What you get is going to be undefined, based on how the compiler laid out your program and what values happened to already be in those addresses.

  • Mohammad

    hello, on the last example
    First you used name to access the address where the array name starts (line 4), then you used name to access the entire array (line 22). name is used to get 2 different things, and that’s confusing me.

    • Mohammad

      Never mind, it’s because of how std::cout deals with char.

    • Alex

      There’s really no difference. In both cases, name decays into a char pointer that points to the first element of the array. In the top case, we use this as the starting point to increment through the array and count the number of vowels. In the bottom case, we’re passing this pointer to std::cout, which assumes that pointers of type char* should be printed as a string. std::cout increments through the array, printing each character to the console.

  • Slayther

    "If ptr points to an integer (assuming 4 bytes), ptr + 3 means 3 integers after ptr, which is 12 memory addresses after ptr. If ptr points to a char, which is always 1 byte, ptr + 3 means 3 chars after ptr, which is 3 memory addresses after ptr."

    It should be 3 memory addresses and not 12. (Not that a memory address is a defined unit, but it’s fine for an explanation)
    Unless you meant bytes, which should be then corrected as well.

    • Slayther

      Also,

      "ptr + 3 means 3 integers after ptr"

      It might be odd to talk about integers instead of memory addresses here. You might want to find a different way to phrase this, if you agree.

      • Alex

        I think it makes sense to talk about it that way -- because when you do pointer arithmetic, the pointer moves relative to the size of the object being pointed to. If ptr were pointing to a double, ptr+3 would move 3 doubles. Nevertheless, I’ve tweaked the wording slightly, moving away from “memory addresses” to “bytes”, even though they’re essentially synonymous for byte-addressable architectures.

  • Kılıçarslan

    Hey Alex, do you think this is a good alternative for iterating through an array using pointers?

  • Raquib

    Hi Alex,

    I have a simple query-

    I ran the above code and got the o/p as -

    Address of first element using address-of-operator(&): 0x7ffe b215 7f00
    Address of second element using address-of-operator(&): 0x7ffe b215 7f04
    Address of third element using address-of-operator(&): 0x7ffe b215 7f08

    Size f the pointer: 8

    Suggesting I am running on a 64-bit machine, which is true.

    But, when I am looking at the address- why is the address 48-bit??

    In your example you were running on a 32-bit machine, the size of the pointer was 4 byte (32-bit) and the address was something like -  0x 0012 FF7C (32-bit address).

    What am I missing??

    Regards,
    Raquib Buksh

    • Alex

      Are you on an AMD 64-bit machine? As I understand it, the machine has 64 bits available for addressing, but to save on transistors, those CPUs only actually use the lower 48 bits for addressing (the top 16 bits are not used to select memory). Thus, even though you have 64 bits available, you get addresses that only contain 6 bytes. 48 bits is enough to address 256 terabytes of memory, so you won’t be hitting that anytime soon.

  • rtz

    Given the base address of a pointer chain, is it possible to reach the address where the actual value is stored? What I have been trying to achieve is something like this.
        

    how do i implement this?

    • Alex

      I’m not sure I understand what the point of this exercise is. If you already know that base is of type int****, then just dereference base 3 times and you’ll have a pointer to your value. That pointer will be holding the address of your value.

  • Nyap

    so doing pointer+1 doesn’t find the next valid integer in memory, it just moves the memory address in pointer by 4 and the next 4 bytes are interpreted as an integer (so if there’s supposed to be a double there it’s just interpreted as an integer)

    • Alex

      Yep, if pointer is an integer pointer, then pointer+1 will move the address by 4 bytes. If you then dereference that pointer, it will interpret the next 4 bytes starting at that address as an integer.

  • Chris

    Alex,

    I have questions about pointer arithmetic on an integer array. The element addresses are separated by 3 memory addresses. The questions are:

    1. Is it true that 1 memory address can hold just 1 byte of data? So an int variable must occupy 4 memory addresses, and a short variable must occupy 2?

    2. What about the 3 memory addresses in between? How do I access them? What is in there?

    Thank you.

    • Alex

      Yes, most modern machines are byte-addressable, meaning each memory address can be used to access one byte. For variables that need more than one byte, they just use consecutive memory addresses (e.g. a 32-bit integer uses 4 consecutive memory addresses). Therefore, if you have an array of 32-bit integers, each one will be separated by 4 bytes.

      Pointer arithmetic accounts for this. ptr+1 will add one address if ptr is pointing to a 1-byte char, but it will add 4 if ptr is pointing to a 4-byte integer.

      • nikos-13

        Yes, but in this example:

        value has the address 0012FF7C. But 0012FF7C is already 4 bytes! (In lesson 2.8 - Literals you say that a pair of hexadecimal digits can be used to exactly represent a full byte, and value’s address has 4 pairs of hexadecimal digits.) So why is the address of ptr+1 0012FF80 and not 0012FF7D?

        • Alex

          You’re confusing the length of the memory address with the number of bytes used to represent a variable.

          On 32-bit machines, each memory address is 32 bits (4 bytes) long. However, memory is byte-addressable, so each address represents a single byte.

          Thus 0012FF7C represents one byte, and 0012FF7D represents the next sequential byte in memory.

          When you do pointer arithmetic, the +1 doesn’t move one byte -- it moves one “object” forward, based on the sizeof(object). In this case, since ptr is an int pointer, and ints on your machine are 4 bytes, ptr + 1 moves forward 4 bytes.

  • Indorfin

    I thought this was worth posting just to show how I’ve kept including the new material; thanks again for these tutorials!

  • Shiva

    Alex,

    Beneath the last example you wrote, ‘First, we start with a pointer assigned to array, which points to element 0.’ The wording feels a bit odd; surely it should be something like: ‘First, we start with the array assigned to a pointer /* ∵ char *ptr = name; */, which then points to element 0 of the array’, right? Also in your sentence the which-clause seems to refer to ‘array’ when it should refer to ‘a pointer’. Anyway, I’m not a native English speaker, so I could be wrong.

    Regarding the lesson, correct me if I’m wrong:
    > The subscript operator is an operator in C++ that takes a pointer (not an array) and an integer as operands.
    > Thus when it is used on an array variable to obtain one of its elements, the array variable actually decays in to a pointer before the operation is performed.
    > When it comes in an expression, an array variable first decays into a pointer, which then evaluates to the address it is holding, that is to say the base address of the array.

    To summarise, the places where an array identifier shows pointer behaviour are the places where it loses its array-ness and decays into pointer-ness, and indexing is one such place. Is my understanding correct?

    • Alex

      I’ve updated the wording of the sentence you pointed out. The name of the array variable used to be array, but I changed it and missed the fact that it was used in the sentence too. Thanks for pointing that out.

      All of your other sentences appear to be correct to the best of my understanding.

  • lance

    what is the purpose of the break in the switch in the final program. i ran it without it on mollie and Mmmmmm and they both worked.

  • Rob G.

    Thx Alex the exercises and tutorials don’t feel trivial at all. Please keep up the great job.

  • Rob G.

    Alex, when posting, is it more appropriate for your site to post an .hpp file with .cpp -- for function calls -- or post a single .cpp file that includes the functions? Some of the programs are starting to grow.

    Thx Rob G.

    • Alex

      .h files are really only needed if you have declarations you plan to share across multiple files. For trivial exercises and tutorials, it’s mostly not necessary.

  • Rob G.

    I forgot to put that as the adjustment to convert bubble sort to optimized bubble sort above. Sorry!

  • Rob G.

    Optimized bubble sort below

    ->Adjustment:

    . Element num is decremented outside of the inner loop, reducing the outer limit value of the inner loop when it recycles. By correctly reducing the outer limit range, values already analyzed are not analyzed again. Thus the inner loop recycles without a redundancy in scanning the array.

  • Rob G.

    Bubble sort w/ pointers working example below

    Hi Alex, I was inspired by the sections on pointers, so I applied them to the bubble sort from a few exercises back. It took a while, but I think I understand pointers even better now. I had some difficulties at first with "lvalue required as left operand of assignment", which were resolved by correctly assigning value to value, e.g. (pp_ptr+1) = value_1 (error);
    correct: *(pp_ptr+1) = value_1. That is, *ptr is treated the same as value, and ptr is the same as &value. Thanks for such a great site, Alex.

  • Awesome tutorials. This acts as an additional help for me after school. Thanks to the author.

  • Mr D

    Hi Alex,

    I’ve been messing round with the code from this lesson and i can’t work out why:

    prints out:

    Mollie
    ollie
    llie
    lie
    ie
    e

    So when a pointer points to a certain element of an array, it actually points to that element plus everything else after it in the array?

    • Alex

      No, a pointer that points to a certain element of the array only points to that element.

      The issue you are seeing here is a result of the way std::cout works. std::cout treats objects of type char* as a C-style string. So if you pass it a char* pointer, it will print everything starting from that element until it hits a null terminator.
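      A minimal sketch of this behaviour, assuming a char array named name that holds "Mollie". Casting to const void* is one common way to make std::cout print the address instead of the string:

      ```cpp
      #include <iostream>

      int main()
      {
          char name[] = "Mollie";

          // Each char* is printed as a C-style string starting at that element.
          for (char *ptr = name; *ptr != '\0'; ++ptr)
          {
              std::cout << ptr << '\n';
          }

          // A void pointer has no string overload, so the address is printed.
          std::cout << static_cast<const void*>(name) << '\n';

          // Dereferencing yields a single char, so only one character is printed.
          std::cout << *name << '\n';

          return 0;
      }
      ```

      The loop prints Mollie, ollie, llie, lie, ie and e on separate lines, even though each pointer refers to only one element.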

  • kyle

    can you please explain what "ptr < name + arraySize" means? I get for loops, i just don’t understand how you added name and arraysize and what the sum would be. I copied your example into codeblocks and added "std::cout << name + arraySize;", to try to find out but all it does is cause a beeping noise (and no errors).

    • Alex

      Sure. We know that name points to the first element of the array, right? So name maps to index 0. We also know that using pointer arithmetic, name + 1 maps to index 1.

      Based on this, we can draw up a table correlating the index with the memory address, like this:

      Index		Address
      0		name
      1		name + 1
      2		name + 2
      ...		...
      arraySize	name + arraySize
      

      ptr is originally set to the address held in array name, which is the address of element 0. As we loop through the array, we increment ptr by 1 each time, which moves us to the next element. We stop when we get to name + arraySize, because that’s the memory address of the element just beyond the end of the array.

      Basically, instead of iterating through the array based on the left hand column (index), we’re iterating through the array based on the right hand column (memory address).

      Make sense?
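      The table can be reproduced with a short sketch. It assumes the same name and arraySize as the lesson’s example; end and visited are illustrative names. Subtracting name from ptr converts each address back into its index.

      ```cpp
      #include <iostream>

      int main()
      {
          const int arraySize = 7;
          char name[arraySize] = "Mollie";

          // One past the last element: valid to compute and compare, never dereferenced.
          char *end = name + arraySize;

          int visited = 0;
          for (char *ptr = name; ptr < end; ++ptr)
          {
              // ptr - name is the index of the element ptr currently points to.
              std::cout << "index " << (ptr - name) << '\n';
              ++visited;
          }

          std::cout << visited << " elements visited\n"; // indices 0 through 6

          return 0;
      }
      ```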

      • Baubas

        If name + arraySize is a memory address, why can’t we print that address? Right now cout shows something crazy, not the memory address.

        • Alex

          You can get the address. What are you seeing?

          • Baubas

            I tried this: cout<<name + arraySize; and got garbage. But then I tried this: cout<<name + arraySize-1; and got an empty line, and finally I understood what was going on here.

            for (char *ptr = name; ptr < name + arraySize-1; ++ptr)
            I think this would be easier for beginners to understand. The last character prints as nothing (it is the null terminator), so we can add -1; it doesn’t hurt and it is clearer. You can separately cout<< both ptr and name + arraySize-1 and compare the results.

            Until now I kept mixing up how to write the address and the value. When I got that garbage I felt so stupid 😀

            • Alex

              No, that’s definitely not better, and that only works for strings (which have a null terminator), not other kinds of data (e.g. arrays of int).

              Remember that the conditional in the for loop is “ptr < name + arraySize” -- note the less than. That means this loop never executes the iteration where ptr == name + arraySize.

  • 1. "Note that ptr + 1 does not return the memory address after ptr, but the next object of the type that ptr points to"

    shouldn’t it be:

    Note that ptr + 1 does not return the memory address after ptr, but the address of next object of the type that ptr points to"
    or something like that..

    2. "In lesson 6.8 -- Pointers and arrays, you learned that an array can decay into to a pointer that points to the first element (element 0) of the array."

    Note that "into to" part. Remove "to" from the above sentence.

  • Simon

    Alex, possible small issue here:

    The given result above:
    7
    7
    0017FB80
    0017FB80

    Looks to be inverted, given the code example above it.

  • programmer, another one

    Just wanna say that these are great tutorials, and it’s nice to see that you are back.

    also, will there be any tutorials on linked lists anytime soon?

    thanks

    • Alex

      Probably not, since linked lists are more of a data structures topic than a C++ topic.

      I’d like to get there someday, but right now my focus is on the core language concepts.
