
6.8a — Pointer arithmetic and array indexing

Pointer arithmetic

The C++ language allows you to perform integer addition or subtraction operations on pointers. If ptr points to an integer, ptr + 1 is the address of the next integer in memory after ptr. ptr - 1 is the address of the previous integer before ptr.

Note that ptr + 1 does not return the memory address after ptr, but the memory address of the next object of the type that ptr points to. If ptr points to an integer (assuming 4 bytes), ptr + 3 means 3 integers (12 bytes) after ptr. If ptr points to a char, which is always 1 byte, ptr + 3 means 3 chars (3 bytes) after ptr.

When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.

Consider the following program:

On the author’s machine, this printed:

0012FF7C
0012FF80
0012FF84
0012FF88

As you can see, each of these addresses differs by 4 (7C + 4 = 80 in hexadecimal). This is because an integer is 4 bytes on the author’s machine.

The same program using short instead of int:

On the author’s machine, this printed:

0012FF7C
0012FF7E
0012FF80
0012FF82

Because a short is 2 bytes, each address differs by 2.

Arrays are laid out sequentially in memory

By using the address-of operator (&), we can determine that arrays are laid out sequentially in memory. That is, elements 0, 1, 2, … are all adjacent to each other, in order.

On the author’s machine, this printed:

Element 0 is at address: 0041FE9C
Element 1 is at address: 0041FEA0
Element 2 is at address: 0041FEA4
Element 3 is at address: 0041FEA8

Note that each of these memory addresses is 4 bytes apart, which is the size of an integer on the author’s machine.

Pointer arithmetic, arrays, and the magic behind indexing

In the section above, you learned that arrays are laid out in memory sequentially.

In lesson 6.8 -- Pointers and arrays, you learned that a fixed array can decay into a pointer that points to the first element (element 0) of the array.

Also in a section above, you learned that adding 1 to a pointer returns the memory address of the next object of that type in memory.

Therefore, we might conclude that adding 1 to an array should point to the second element (element 1) of the array. We can verify experimentally that this is true:

Note that when dereferencing the result of pointer arithmetic, parentheses are necessary to ensure the operator precedence is correct, since operator * has higher precedence than operator +.

On the author’s machine, this printed:

0017FB80
0017FB80
7
7

It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and dereference! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parentheses).

Using a pointer to iterate through an array

We can use a pointer and pointer arithmetic to loop through an array. Although not commonly done this way (using subscripts is generally easier to read and less error prone), the following example goes to show it is possible:

How does it work? This program uses a pointer to step through each of the elements in an array. Remember that arrays decay to pointers to the first element of the array. So by initializing ptr with name, ptr also points to the first element of the array. The switch expression dereferences ptr to get the current element, and if that element is a vowel, numVowels is incremented. The for loop then uses the ++ operator to advance the pointer to the next character in the array, and terminates once all characters have been examined.

The above program produces the result:

Mollie has 3 vowels

Because counting elements is common, the algorithms library offers std::count_if, which counts elements that fulfill a condition. We can replace the for-loop with a call to std::count_if.

std::begin returns an iterator (pointer) to the first element, while std::end returns an iterator to the position one past the last element. The iterator returned by std::end is only used as a marker; dereferencing it causes undefined behavior, because it doesn’t point to a real element.

std::begin and std::end only work on arrays with a known size. If the array decayed to a pointer, we can calculate begin and end manually.

Note that we’re calculating name + nameLength, not name + nameLength - 1, because we don’t want the last element, but the pseudo-element one past the last.

Calculating begin and end of an array like this works for all algorithms that need a begin and end argument.

Quiz time

Question #1


Why does the following code work?


Question #2


Write a function named find that takes a pointer to the beginning and a pointer to the end (1 element past the last) of an array, as well as a value. The function should search for the given value and return a pointer to the first element with that value, or the end pointer if no element was found. The following program should run:

Tip

For an array of int, std::begin and std::end return an int*. The call to find is equivalent to




222 comments to 6.8a — Pointer arithmetic and array indexing

  • Periklis

    This is the solution that you presented for quiz 2!

    why you are using p != end instead of p < end ?

    • nascardriver

      As long as we're using pointers, we could use `p < end`. However, the solution shown in the quiz works for all iterators (Iterators are covered later. Pointers into arrays are iterators). Not all iterators support `p < end`, but almost all iterators support `p != end`.

  • RJ

    To summarize this lesson in my own words. Please let me know if i'm wrong.

    When performing operations on non-dereferenced pointers, literals are implicitly converted to their size in bytes and multiplied by the literal itself. Basically the same as using the sizeof() function and multiplying that result by the value passed in as an argument. These values are also treated as hex values.

    • nascardriver

      You got the first part right.
      Hexadecimal is just a representation, it doesn't change the value of anything and "a hex value" has no meaning. It doesn't matter if you call your friend Joseph or Joe, it's still the same person.

  • shawn

    Why do you recommend using *ptr on an earlier lesson but then use int* ptr in solution to question #2?

    If I remember right, the recommendation was to use *ptr function_name, but with variables to use int *variable_name.

    • nascardriver

      Hi!

      I wrote the examples with `type* name` syntax, you'll see more of them. I'm starting to update these examples to use `type *name` as well.

  • Alek

    hey!,I got a question about this part :

    isVowel is a function, yet you are kind of calling it here as just isVowel, with no parentheses "()". It expects a parameter and we're not giving it any, so how does it know to pass each element of begin to it? I struggled a lot but I can't seem to get it. I also checked cppreference, which had a scarier example xD here it is:

    can you tell me what's happening in the count_if part ? the third parameter isn't vivid for me "[](int i)"? what is this,never seen such a syntax.
    can you pls tell me the difference between "count"& "count_if" ? count seems like to work like the count_if you utilized above.
    thanks in advance!

    • nascardriver

      When a function is used without parentheses() it's a function pointer. We cover function pointers later. `std::count_if` is calling `isVowel` with each element (character) of `name`. The example on cppreference uses a lambda (anonymous function). We show lambdas in chapter 7.

  • koe

    In the std::count_if example I get an error using 'auto' for the numVowels. Instead, the type has to be 'long'. Compiler: Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)

    • nascardriver

      What's the error message? If you're still not using C++17, try using direct initialization

      `auto` didn't work well with list initialization before C++17.

  • Gabe

    Quiz 2

    I recommend stepping through with break points with this one. It's interesting to see
    how it works at a low level! :)

  • Carl

    Question 2:
    Have I misunderstood? "....The function should search for the given value and
    return a pointer to the first element with that value,
    or the end pointer if no element was found....."

    The solution as written returns the value 20 not a pointer to the element hence I finished up with:-

    • nascardriver

      The solution returns a pointer to the element with value 20, because that element existed. If there was no element with value 20, it'd return `end`. Have another look at the solution, we're returning an `int*`, not an `int`.

      • Carl

        Being pedantic the question asks for '..return a pointer to the first element with that value.' not the value of the element which was the reason for my confusion. I understood the solution as written. Thanks.

        Can I also add here what a godsend this site has been over the foregoing few weeks. Being basically housebound because of this blasted virus it is a great help to have some 'mind fodder'. Thank you.

        • nascardriver

          Exactly, you wrote in your original comment that "The solution as written returns the value 20". But that's not true. The quiz didn't ask to return a value and it doesn't return a value. It always talks about a pointer. The pointer points to an element with the value 20.

  • Robbas

    Hi Nascar and Alex, I have a problem with the cout. I can't extract the address for each element of the array.
    I tried with the printf and as you can see it works, where am I doing wrong?

    Output:
    name address [0]: 002DFC84 , M
    name address [0]: Mollie , M

    name address [1]: 002DFC85 , o
    name address [1]: ollie , o

    name address [2]: 002DFC86 , l
    name address [2]: llie , l

    name address [3]: 002DFC87 , l
    name address [3]: lie , l

    name address [4]: 002DFC88 , i
    name address [4]: ie , i

    name address [5]: 002DFC89 , e
    name address [5]: e , e

    P.S. I checked arrayLength and it's 7 , because it's Mollie + \0, shouldn't it be arrayLength -1 in the for?
    from 0 to 5 it checked all the letters, to 6 it checks \0, is this step necessary?  

    I tried with arrayLength-1 and it works so I'm a little confused in this part.

    Thank you in advance

    • nascardriver

      You don't need to print the zero-terminator. You can ignore it or use it as your loop's condition.

      You're printing

      The type of `name` is `char*` (A string).
      The type of `name[i]` is `char`.
      The type of `&name[i]` is `char*` (A string).

      If you don't want `std::cout` to print a string, cast `&name[i]` to a `void*` first. We cover `void*` later in this chapter.

  • Dudz

    My answer for question 2.

  • Abhinandan goya;

    what's the problem in this code

    • nascardriver

      - You're `using namespace`
      - You're not using list initialization
      - Your variables aren't named descriptively
      - You're using postfix++ instead of ++prefix
      - You're using `std::endl` when you don't need it
      - Your warning level is too low or you're ignoring compiler warnings (Lesson 0.10, 0.11)
      - Your code is inconsistently formatted. Use an auto-formatter to save time

  • salah

    Hi, I did not understand this statament well could you illustrate it please

    When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.

    • Alex

      Basically, if you have a 32-bit integer (4 bytes), when you advance an int pointer by 2, it will multiply 2 by 4 to get 8. That 8 is the number of memory addresses it needs to move forward.

    • nascardriver

      is

      This makes more sense than just adding 2 to the address, because you'd end up somewhere inside the object and you'd read junk. The automatic multiplication by the object's size makes it so that you get to the next object instead.

  • Sam

    Here is my solution for Problem #2 in the quiz.

  • hausevult

    Hello nascardriver, I am trying to make sure I fully understand how to go about Quiz #2 in a very basic way, but I ran into a problem which quite frankly should not be happening. Am I missing something here?

    Unfortunately, this crashes the program and results in the following error:
    Process returned -1073741819 (0xC0000005)

    Going over the lesson, I do not understand what I could have done wrong here. I hope you can help me :)

    • hausevult

      I now see and have resolved my own error. As I was accessing elements in array "arr[8]", I did not take into account the issue caused by accessing and copying out-of-bounds addresses, such as "arr + arrLength", and that it will yield undefined behavior, causing a segmentation fault error... sometimes. Initially I had thought to assign pointer "found" the address that immediately follows the last address in the array, and to compare as to whether "found" held this address later on. But I realized that this was completely unnecessary, and that in line 24 I should instead have used the boolean comparison functionality of null pointers, rather than arbitrarily choosing to set "found" to an out-of-bounds address and test for that address....

      My fixed code, with no SIGSEGV Segmentation Fault error:

      • nascardriver

        hi!

        Setting `found` to the end (ie. `arr + arrLength`) is not wrong. That's what the standard library does too to indicate that whatever it searched for was not found. The issue in your code was the wrong comparison in line 15, which you already fixed.

        Judging by your code, you know C or you started learning C++ elsewhere. You're writing C++98, doing so will give you a harder time later on. Here are some topics you should catch up on:
        - Brace initialization
        - `std::size`
        - `nullptr`

  • Andrei

    Do we have any reasons to stick asterisk to int instead of begin and end in Solution to Quiz #2?

    Shouldn't it follow your previous advice "When declaring a pointer variable, put the asterisk next to the variable name"?
    Thank you!

  • fxr

    Hello, Nascardriver, I don't really understand how std::end works, because on my machine, when I tried to use this code, end is given some nonsense value. Why is that, and what does std::end do? Shouldn't it just take the last element of the array, like std::begin does with the first element, or is something in my code messing it up?

    • nascardriver

      `std::end` returns a pointer to the element that would be at arr[length(arr)], but that's not a real element, so dereferencing the pointer causes undefined behavior.

      Your `while`-loop is wrong. If `begin == end` and you increase `begin`, you'll never terminate. It should be

      Note also the change of order. If `begin == end` and you access `*begin`, you're invoking undefined behavior.

      You don't need the `if`-statement after that, it doesn't do anything.

  • kavin

    For quiz 2 i did this.

    I saw your solution. Why are you initializing begin to pointer p? we already know begin is a pointer right?can't we use begin as a address directly? and can i use begin < end , since we know end is " arr + std::size(arr) " and compare 2 addresses to loop through them?

    • nascardriver

      > Why are you initializing begin to pointer p?
      Modifying arguments makes code harder to understand. Once you modify `begin`, its name is no longer correct: it no longer points to the beginning of the data.

      > can i use begin < end
      Yes, there's nothing wrong with it. Using != is more portable, you'll learn more about it in the lesson about iterators.

      • kavin

        Oh ok, thank you. Now i get it. Since we were using variable values in the called function in previous chapters without initializing them to other variable , i thought i could apply it here. So for pointers the best practice is not modify the parameter and initialize them to another pointer and modify that  ?

        And i have another common doubt for quite sometime.I forgot to ask you in previous chapters. Why do u use "return end; " outside of loop? Can't you put an else and use like this,

        • nascardriver

          Modifying parameters is never good if it changes the meaning of the variable.
          Your suggested update doesn't work. The function always returns in the first iteration.

          • kavin

            >The function always returns in the first iteration<

            Oh yes ! I couldn't figure out this issue small issue till now :( Thank you @nascardriver.

  • Suyash

    Here's the code to my solution...

  • chai

    [code]
    std::count_if(name, name + nameLength, isVowel)
    [/code]
    What type of functions are compatible with std::count_if? Something similar to bool isVowel(char)? I am guessing that it has to accept the type being counted and return a bool, and that count_if counts the bools returned. It is also odd, and my first encounter with a function being passed as an argument without ().

    • nascardriver

      The function has to return a `bool` and take an argument of the array element type. We have an array of `char`s, so the function has to accept a char. It could also accept types that can be created from a char, eg. an `int`.
      If we had an array of `std::string`s, the function would have to accept a `std::string`.

      Algorithms will get more prominent in the tutorials. There will be a short introduction about them once it's clear in which lesson they're first used. It's not too difficult to understand how to use algorithms without knowing how they work, so we're adding them to the tutorials before we explain how they work.

  • chai

    It would be great to have an exercise here.

  • elvis

    in the example where it sorts through mollie to find how many vowels there are. the for loop has this condition in it ptr < (name + arrayLength) how does it add name and int? I tried printing "name + arrayLength" and i get random garbage. does name and array length convert to pointers when they are being compared to ptr?

  • Ged

    Code is missing the <algorithm> library. And why are we using std::size_t, because when I try to run it I get an error and if I change it to an int, it works. Only if I use the static_cast<std::size_t>( ) it works, but why do the extra work? As I understand size_t is an unsigned int which you told to avoid if you can.

    • nascardriver

      > Code is missing the library
      Added.

      > why are we using std::size_t
      I thought `std::count_if` returned an `std::size_t`. It returns a `std::ptrdiff_t` in this case. Code updated to use `auto`.

      > As I understand size_t is an unsigned int which you told to avoid if you can.
      It's an unsigned integer, but not necessarily an unsigned int. If you don't modify an unsigned integer and don't use it for arithmetic, there's nothing that can go wrong. If `std::count_if` returned a `std::size_t`, we would've needed an extra cast when all we want to do is print the result.

      Thanks!

  • Ged

    It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and dereference! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parentheses).

    Isn't [] used to save a changed value which *() can't do?

  • Jack Overby

    I'm trying to loop through and perform a regex check on each character, rather than switch-10 different cases:

    However, I keep getting the following compiler error message:

    no instance of overloaded function "std::regex_match" matches the argument list

    Any suggestions?

    • nascardriver

      Regex shouldn't be used for this. It has a huge overhead compared to manually solving the task.
      You'll get the best results from using a `switch`. I understand you don't want to do that.
      You can use an `std::set` and `std::count_if` instead.

      If you really wanted to use regex, which you shouldn't, you could use an `std::sregex_iterator` and `std::distance`. `std::sregex_iterator` iterates over all found matches.

  • alfonso

    Here the char array name does not decay or std::cout treats it in a special way.

    And maybe for the same reason, the following code gives me strange results:
