
6.8a — Pointer arithmetic and array indexing

Pointer arithmetic

The C++ language allows you to perform integer addition or subtraction operations on pointers. If ptr points to an integer, ptr + 1 is the address of the next integer in memory after ptr. ptr - 1 is the address of the previous integer before ptr.

Note that ptr + 1 does not return the memory address after ptr, but the memory address of the next object of the type that ptr points to. If ptr points to an integer (assuming 4 bytes), ptr + 3 means 3 integers (12 bytes) after ptr. If ptr points to a char, which is always 1 byte, ptr + 3 means 3 chars (3 bytes) after ptr.

When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.

Consider the following program:

On the author’s machine, this output:

0012FF7C
0012FF80
0012FF84
0012FF88

As you can see, each of these addresses differs by 4 (7C + 4 = 80 in hexadecimal). This is because an integer is 4 bytes on the author’s machine.

The same program using short instead of int:

On the author’s machine, this output:

0012FF7C
0012FF7E
0012FF80
0012FF82

Because a short is 2 bytes, each address differs by 2.

Arrays are laid out sequentially in memory

By using the address-of operator (&), we can determine that arrays are laid out sequentially in memory. That is, elements 0, 1, 2, … are all adjacent to each other, in order.

On the author’s machine, this printed:

Element 0 is at address: 0041FE9C
Element 1 is at address: 0041FEA0
Element 2 is at address: 0041FEA4
Element 3 is at address: 0041FEA8

Note that each of these memory addresses is 4 bytes apart, which is the size of an integer on the author’s machine.

Pointer arithmetic, arrays, and the magic behind indexing

In the section above, you learned that arrays are laid out in memory sequentially.

In lesson 6.8 -- Pointers and arrays, you learned that a fixed array can decay into a pointer that points to the first element (element 0) of the array.

Also in a section above, you learned that adding 1 to a pointer returns the memory address of the next object of that type in memory.

Therefore, we might conclude that adding 1 to an array should yield a pointer to the second element (element 1) of the array. We can verify experimentally that this is true:

Note that when dereferencing the result of pointer arithmetic, parentheses are necessary to ensure the operator precedence is correct, since operator * has higher precedence than operator +.

On the author’s machine, this printed:

0017FB80
0017FB80
7
7

It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and dereference! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parentheses).

Using a pointer to iterate through an array

We can use a pointer and pointer arithmetic to loop through an array. Although this is not commonly done (using subscripts is generally easier to read and less error-prone), the following example shows that it is possible:

How does it work? This program uses a pointer to step through each of the elements in an array. Remember that arrays decay to pointers to the first element of the array. So by initializing ptr with name, ptr will also point to the first element of the array. Each element is dereferenced by the switch expression, and if the element is a vowel, numVowels is incremented. Then the for loop uses the ++ operator to advance the pointer to the next character in the array. The for loop terminates when all characters have been examined.

The above program produces the result:

Mollie has 3 vowels

Because counting elements is common, the algorithms library offers std::count_if, which counts elements that fulfill a condition. We can replace the for-loop with a call to std::count_if.

std::begin returns an iterator (pointer) to the first element, while std::end returns an iterator to the element that would be one after the last. The iterator returned by std::end is only used as a marker. Dereferencing it causes undefined behavior, because it doesn’t point to a real element.

std::begin and std::end only work on arrays with a known size. If the array has decayed to a pointer, we can calculate begin and end manually.

Note that we’re calculating name + nameLength, not name + nameLength - 1, because we don’t want the last element, but the pseudo-element one past the last.

Quiz time

Question #1


Why does the following code work?


Question #2


Write a function named find that takes a pointer to the beginning and a pointer to the end (1 element past the last) of an array, as well as a value. The function should search for the given value and return a pointer to the first element with that value, or the end pointer if no element was found. The following program should run:

Tip

For an int array, std::begin and std::end return an int*. The call to find is equivalent to passing arr and arr + std::size(arr) as its first two arguments.




203 comments to 6.8a — Pointer arithmetic and array indexing

  • Abhinandan goya;

    what's the problem in this code

    • nascardriver

      - You're `using namespace`
      - You're not using list initialization
      - Your variables aren't named descriptively
      - You're using postfix++ instead of ++prefix
      - You're using `std::endl` when you don't need it
      - Your warning level is too low or you're ignoring compiler warnings (Lesson 0.10, 0.11)
      - Your code is inconsistently formatted. Use an auto-formatter to save time

  • salah

    Hi, I did not understand this statement well. Could you illustrate it, please?

    When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.

    • Alex

      Basically, if you have a 32-bit integer (4 bytes), when you advance an int pointer by 2, it will multiply 2 by 4 to get 8. That 8 is the number of memory addresses it needs to move forward.

    • nascardriver

      is

      This makes more sense than just adding 2 to the address, because you'd end up somewhere inside the object and you'd read junk. The automatic multiplication by the object's size makes it so that you get to the next object instead.

  • Sam

    Here is my solution for Problem #2 in the quiz.

  • hausevult

    Hello nascardriver, I am trying to make sure I fully understand how to go about Quiz #2 in a very basic way, but I ran into a problem which quite frankly should not be happening. Am I missing something here?

    Unfortunately, this crashes the program and results in the following error:
    Process returned -1073741819 (0xC0000005)

    Going over the lesson, I do not understand what I could have done wrong here. I hope you can help me :)

    • hausevult

      I now see and have resolved my own error. As I was accessing elements in array "arr[8]", I did not take into account the issue caused by accessing and copying out-of-bounds addresses, such as "arr + arrLength", and that it will yield undefined behavior, causing a segmentation fault error... sometimes. Initially I had thought to assign pointer "found" the address that immediately follows the last address in the array, and to check later on whether "found" held this address. But I realized that this was completely unnecessary, and that in line 24 I should instead have used the boolean comparison functionality of null pointers, rather than arbitrarily choosing to set "found" to an out-of-bounds address and testing for that address....

      My fixed code, with no SIGSEGV Segmentation Fault error:

      • nascardriver

        hi!

        Setting `found` to the end (ie. `arr + arrLength`) is not wrong. That's what the standard library does too to indicate that whatever it searched for was not found. The issue in your code was the wrong comparison in line 15, which you already fixed.

        Judging by your code, you know C or you started learning C++ elsewhere. You're writing C++98, doing so will give you a harder time later on. Here are some topics you should catch up on:
        - Brace initialization
        - `std::size`
        - `nullptr`

  • Andrei

    Do we have any reasons to stick asterisk to int instead of begin and end in Solution to Quiz #2?

    Shouldn't it follow your previous advice "When declaring a pointer variable, put the asterisk next to the variable name"?
    Thank you!

  • fxr

    Hello nascardriver, I don't really understand how std::end works. On my machine, when I tried to use this code, end is given some nonsense value. Why is that, and what does std::end do? Shouldn't it just take the last element of the array, like std::begin does with the first element, or is something in my code messing it up?

    • nascardriver

      `std::end` returns a pointer to the element that would be at arr[length(arr)], but it's not a real element; accessing it causes undefined behavior.

      Your `while`-loop is wrong. If `begin == end` and you increase `begin`, you'll never terminate. It should be

      Note also the change of order. If `begin == end` and you access `*begin`, you're invoking undefined behavior.

      You don't need the `if`-statement after that, it doesn't do anything.

  • kavin

    For quiz 2 i did this.

    I saw your solution. Why are you initializing begin to pointer p? We already know begin is a pointer, right? Can't we use begin as an address directly? And can I use begin < end, since we know end is "arr + std::size(arr)", and compare the 2 addresses to loop through them?

    • nascardriver

      > Why are you initializing begin to pointer p?
      Modifying arguments makes code harder to understand. Once you modify `begin`, its name is no longer correct; it's not pointing to the beginning of the data.

      > can i use begin < end
      Yes, there's nothing wrong with it. Using != is more portable, you'll learn more about it in the lesson about iterators.

      • kavin

        Oh ok, thank you. Now i get it. Since we were using variable values in the called function in previous chapters without initializing them to other variable , i thought i could apply it here. So for pointers the best practice is not modify the parameter and initialize them to another pointer and modify that  ?

        And I have had another common doubt for quite some time that I forgot to ask about in previous chapters. Why do you use "return end;" outside of the loop? Can't you put an else and use it like this,

        • nascardriver

          Modifying parameters is never good if it changes the meaning of the variable.
          Your suggested update doesn't work. The function always returns in the first iteration.

          • kavin

            >The function always returns in the first iteration<

            Oh yes! I couldn't figure out this small issue till now :( Thank you @nascardriver.

  • Suyash

    Here's the code to my solution...

  • chai

    [code]
    std::count_if(name, name + nameLength, isVowel)
    [/code]
    What type of functions are compatible with std::count_if? Something similar to bool isVowel(char)? I am guessing that it has to accept the type being counted and return a bool, and count_if counts the bools returned? It is also odd, and my first encounter with a function being passed as an argument without ().

    • nascardriver

      The function has to return a `bool` and take an argument of the array element type. We have an array of `char`s, so the function has to accept a char. It could also accept types that can be created from a char, eg. an `int`.
      If we had an array of `std::string`s, the function would have to accept a `std::string`.

      Algorithms will get more prominent in the tutorials. There will be a short introduction about them once it's clear in which lesson they're first used. It's not too difficult to understand how to use algorithms without knowing how they work, so we're adding them to the tutorials before we explain how they work.

  • chai

    It would be great to have an exercise here.

  • elvis

    In the example where it goes through Mollie to find how many vowels there are, the for loop has this condition in it: ptr < (name + arrayLength). How does it add name and an int? I tried printing "name + arrayLength" and I get random garbage. Do name and arrayLength convert to pointers when they are being compared to ptr?

  • Ged

    Code is missing the <algorithm> library. And why are we using std::size_t? When I try to run it I get an error, and if I change it to an int, it works. It only works with std::size_t if I use static_cast<std::size_t>( ), but why do the extra work? As I understand it, size_t is an unsigned int, which you told us to avoid if we can.

    • nascardriver

      > Code is missing the <algorithm> library
      Added.

      > why are we using std::size_t
      I thought `std::count_if` returned an `std::size_t`. It returns a `std::ptrdiff_t` in this case. Code updated to use `auto`.

      > As I understand size_t is an unsigned int which you told to avoid if you can.
      It's an unsigned integer, but not necessarily an unsigned int. If you don't modify an unsigned integer and don't use it for arithmetic, there's nothing that can go wrong. If `std::count_if` returned a `std::size_t`, we would've needed an extra cast when all we want to do is print the result.

      Thanks!

  • Ged

    It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and dereference! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parentheses).

    Isn't [] used to save a changed value which *() can't do?

  • Jack Overby

    I'm trying to loop through and perform a regex check on each character, rather than switch-10 different cases:

    However, I keep getting the following compiler error message:

    no instance of overloaded function "std::regex_match" matches the argument list

    Any suggestions?

    • nascardriver

      Regex shouldn't be used for this. It has a huge overhead compared to manually solving the task.
      You'll get the best results from using a `switch`. I understand you don't want to do that.
      You can use an `std::set` and `std::count_if` instead.

      If you really wanted to use regex, which you shouldn't, you could use an `std::sregex_iterator` and `std::distance`. `std::sregex_iterator` iterates over all found matches.

  • alfonso

    Here the char array name does not decay or std::cout treats it in a special way.

    And maybe for the same reason, the following code gives me strange results:
