Pointer arithmetic
The C++ language allows you to perform integer addition or subtraction operations on pointers. If ptr points to an integer, ptr + 1 is the address of the next integer in memory after ptr. ptr - 1 is the address of the previous integer before ptr.
Note that ptr + 1 does not return the memory address after ptr, but the memory address of the next object of the type that ptr points to. If ptr points to an integer (assuming 4 bytes), ptr + 3 means 3 integers (12 bytes) after ptr. If ptr points to a char, which is always 1 byte, ptr + 3 means 3 chars (3 bytes) after ptr.
When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.
Consider the following program:
#include <iostream>

int main()
{
    int value{ 7 };
    int *ptr{ &value };

    std::cout << ptr << '\n';
    std::cout << ptr+1 << '\n';
    std::cout << ptr+2 << '\n';
    std::cout << ptr+3 << '\n';

    return 0;
}
On the author’s machine, this output:
0012FF7C
0012FF80
0012FF84
0012FF88
As you can see, each of these addresses differs by 4 (7C + 4 = 80 in hexadecimal). This is because an integer is 4 bytes on the author’s machine.
The same program using short instead of int:
#include <iostream>

int main()
{
    short value{ 7 };
    short *ptr{ &value };

    std::cout << ptr << '\n';
    std::cout << ptr+1 << '\n';
    std::cout << ptr+2 << '\n';
    std::cout << ptr+3 << '\n';

    return 0;
}
On the author’s machine, this output:
0012FF7C
0012FF7E
0012FF80
0012FF82
Because a short is 2 bytes, each address differs by 2.
Arrays are laid out sequentially in memory
By using the address-of operator (&), we can determine that arrays are laid out sequentially in memory. That is, elements 0, 1, 2, … are all adjacent to each other, in order.
#include <iostream>

int main()
{
    int array[]{ 9, 7, 5, 3, 1 };

    std::cout << "Element 0 is at address: " << &array[0] << '\n';
    std::cout << "Element 1 is at address: " << &array[1] << '\n';
    std::cout << "Element 2 is at address: " << &array[2] << '\n';
    std::cout << "Element 3 is at address: " << &array[3] << '\n';

    return 0;
}
On the author’s machine, this printed:
Element 0 is at address: 0041FE9C
Element 1 is at address: 0041FEA0
Element 2 is at address: 0041FEA4
Element 3 is at address: 0041FEA8
Note that each of these memory addresses is 4 bytes apart, which is the size of an integer on the author’s machine.
Pointer arithmetic, arrays, and the magic behind indexing
In the section above, you learned that arrays are laid out in memory sequentially.
In the previous lesson, you learned that a fixed array can decay into a pointer that points to the first element (element 0) of the array.
Also in a section above, you learned that adding 1 to a pointer returns the memory address of the next object of that type in memory.
Therefore, we might conclude that adding 1 to an array should point to the second element (element 1) of the array. We can verify experimentally that this is true:
#include <iostream>

int main()
{
    int array[]{ 9, 7, 5, 3, 1 };

    std::cout << &array[1] << '\n'; // print memory address of array element 1
    std::cout << array+1 << '\n'; // print memory address of array pointer + 1

    std::cout << array[1] << '\n'; // prints 7
    std::cout << *(array+1) << '\n'; // prints 7 (note the parentheses required here)

    return 0;
}
Note that when performing indirection through the result of pointer arithmetic, parentheses are necessary to ensure the operator precedence is correct, since operator * has higher precedence than operator +.
On the author’s machine, this printed:
0017FB80
0017FB80
7
7
It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and indirection! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parentheses).
Using a pointer to iterate through an array
We can use a pointer and pointer arithmetic to loop through an array. Although not commonly done this way (using subscripts is generally easier to read and less error prone), the following example goes to show it is possible:
#include <iostream>
#include <iterator> // for std::size

bool isVowel(char ch)
{
    switch (ch)
    {
    case 'A':
    case 'a':
    case 'E':
    case 'e':
    case 'I':
    case 'i':
    case 'O':
    case 'o':
    case 'U':
    case 'u':
        return true;
    default:
        return false;
    }
}

int main()
{
    char name[]{ "Mollie" };
    int arrayLength{ static_cast<int>(std::size(name)) };
    int numVowels{ 0 };

    for (char *ptr{ name }; ptr < (name + arrayLength); ++ptr)
    {
        if (isVowel(*ptr))
        {
            ++numVowels;
        }
    }

    std::cout << name << " has " << numVowels << " vowels.\n";

    return 0;
}
How does it work? This program uses a pointer to step through each of the elements in an array. Remember that arrays decay to pointers to the first element of the array. So by initializing ptr with name, ptr will also point to the first element of the array. Indirection through ptr is performed for each element when we call isVowel(*ptr), and if the element is a vowel, numVowels is incremented. Then the for loop uses the ++ operator to advance the pointer to the next character in the array. The for loop terminates when all characters have been examined.
The above program produces the result:
Mollie has 3 vowels.
Because counting elements is common, the algorithms library offers std::count_if, which counts elements that fulfill a condition. We can replace the for-loop with a call to std::count_if.
#include <algorithm>
#include <iostream>
#include <iterator> // for std::begin and std::end

bool isVowel(char ch)
{
    switch (ch)
    {
    case 'A':
    case 'a':
    case 'E':
    case 'e':
    case 'I':
    case 'i':
    case 'O':
    case 'o':
    case 'U':
    case 'u':
        return true;
    default:
        return false;
    }
}

int main()
{
    char name[]{ "Mollie" };
    auto numVowels{ std::count_if(std::begin(name), std::end(name), isVowel) };

    std::cout << name << " has " << numVowels << " vowels.\n";

    return 0;
}
std::begin returns an iterator (pointer) to the first element, while std::end returns an iterator to the element that would be one after the last. The iterator returned by std::end is only used as a marker. Accessing it causes undefined behavior, because it doesn’t point to a real element.
std::begin and std::end only work on arrays with a known size. If the array decayed to a pointer, we can calculate begin and end manually.
// nameLength is the number of elements in the array.
std::count_if(name, name + nameLength, isVowel)

// Don't do this. Accessing invalid indexes causes undefined behavior.
// std::count_if(name, &name[nameLength], isVowel)
Note that we’re calculating name + nameLength, not name + nameLength - 1, because we don’t want the last element, but the pseudo-element one past the last.
Calculating begin and end of an array like this works for all algorithms that need a begin and end argument.
Quiz time
Question #1
Why does the following code work?
#include <iostream>

int main()
{
    int arr[]{ 1, 2, 3 };

    std::cout << 2[arr] << '\n';

    return 0;
}
Question #2
Write a function named find that takes a pointer to the beginning and a pointer to the end (1 element past the last) of an array, as well as a value. The function should search for the given value and return a pointer to the first element with that value, or the end pointer if no element was found. The following program should run:
#include <iostream>
#include <iterator>

// ...

int main()
{
    int arr[]{ 2, 5, 4, 10, 8, 20, 16, 40 };

    // Search for the first element with value 20.
    int *found{ find(std::begin(arr), std::end(arr), 20) };

    // If an element with value 20 was found, print it.
    if (found != std::end(arr))
    {
        std::cout << *found << '\n';
    }

    return 0;
}
Tip
std::begin and std::end return an int*. The call to find is equivalent to
int *found{ find(arr, arr + std::size(arr), 20) };
I threw up some sample code to help me understand indexing and sorting better:
My code doesn't sort the last element of the array in this instance. Why is it the case that we would use:
std::sort(Array , (Array + ArraySize));
instead of:
std::sort(Array , (Array + (ArraySize - 1)));
isn't the first instance more than the number of elements in the array? I am not understanding why it works.
thanks!
Algorithms want to have the one-past-the-end element. The algorithms will never access this element, it's only used as a marker.
The reason for this is that not all containers can quickly access their last element (eg. a linked list). Finding the last element could take a long time depending on the data, but creating an end marker is a constant time operation.
Hello, I'm really having a hard time understanding this part of your code
The part I actually don't understand is this:
How does it work actually?
`name + 0` returns the first character.
`name + 1` returns the second character.
...
`name + arrayLength` returns the one-past-the-end character.
Don't you mean 'name+0' returns the first character address?
Yes, thanks! `name + 0` returns a pointer to the first character. Same for my other examples.
What can go wrong if i return back begin pointer? I mean if the number is not found, begin == end right?
There's no difference, other than that it might take a little longer to understand why you're returning `begin`.
Hi @nascardriver. I've completed Question 2 like this:
"begin" in this case would be the same as "int *ptr[begin]", or am I wrong? Since begin is an array, and passed into a function, it decays into a pointer that points at the first element of the array. Have I gotten anything wrong?
Thanks!
You can use `begin` like that. I don't like doing it, because once you increment `begin`, the name "begin" is no longer accurate.
`begin;` in line 3 doesn't do anything. If you don't want to declare anything, leave the init-statement empty.
If the body of something exceeds 1 line (and even if it doesn't, I suggest doing so), wrap it in curly braces. Together with your missing indentation, your loop body is misleading.
I dont understand the mechanism of the below code:
if:
needs 1 argument,
but the below statement showing no parameter for the isVowel function call:
`isVowel` is not a call. We're passing the `isVowel` function itself to `std::count_if` and `std::count_if` later calls it with an argument.
Thank you, actually I just found the explanation about this at chapter 6.18. it is function pointer or lambda.
Oh my! The joy I felt when my code for Q2 matched up EXACTLY to what you'd written. Thanks! I'm having a great time with this!
This is the solution that you presented for quiz 2!
why you are using p != end instead of p < end ?
As long as we're using pointers, we could use `p < end`. However, the solution shown in the quiz works for all iterators (Iterators are covered later. Pointers into arrays are iterators). Not all iterators support `p < end`, but almost all iterators support `p != end`.
Great thank you for the clarification! I am looking forward to read the topic on iterators.
To summarize this lesson in my own words. Please let me know if i'm wrong.
When performing operations on non-dereferenced pointers, literals are implicitly converted to their size in bytes and multiplied by the literal itself. Basically the same as using the sizeof() function and multiplying that result by the value passed in as an argument. These values are also treated as hex values.
You got the first part right.
Hexadecimal is just a representation, it doesn't change the value of anything and "a hex value" has no meaning. It doesn't matter if you call your friend Joseph or Joe, it's still the same person.
Why do you recommend using *ptr on an earlier lesson but then use int* ptr in solution to question #2?
If I remember right, the recommendation was to use *ptr function_name, but with variables to use int *variable_name.
Hi!
I wrote the examples with `type* name` syntax, you'll see more of them. I'm starting to update these examples to use `type *name` as well.
hey!,I got a question about this part :
isVowel is a function yet you are kind of calling it here like :isVowel ? no paranthesis"()" it expects a parameter and we're not giving it any how does it know to pass each elemnts of begin to it ?I struggled too much but it doesn't seem I cant get,also I checked cppreference which had scarrier exp xD here it is :
can you tell me what's happening in the count_if part ? the third parameter isn't vivid for me "[](int i)"? what is this,never seen such a syntax.
can you pls tell me the difference between "count"& "count_if" ? count seems like to work like the count_if you utilized above.
thanks in advance!
When a function is used without parentheses() it's a function pointer. We cover function pointers later. `std::count_if` is calling `isVowel` with each element (character) of `name`. The example on cppreference uses a lambda (anonymous function). We show lambdas in chapter 7.
In the std::count_if example I get an error using 'auto' for the numVowels. Instead, the type has to be 'long'. Compiler: Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
What's the error message? If you're still not using C++17, try using direct initialization
`auto` didn't work well with list initialization before C++17.
Quiz 2
I recommend stepping through with break points with this one. It's interesting to see
how it works at a low level! :)
Question 2:
Have I misunderstood? "....The function should search for the given value and
return a pointer to the first element with that value,
or the end pointer if no element was found....."
The solution as written returns the value 20 not a pointer to the element hence I finished up with:-
The solution returns a pointer to the element with value 20, because that element existed. If there was no element with value 20, it'd return `end`. Have another look at the solution, we're returning an `int*`, not an `int`.
Being pedantic the question asks for '..return a pointer to the first element with that value.' not the value of the element which was the reason for my confusion. I understood the solution as written. Thanks.
Can I also add here what a godsend this site has been over the foregoing few weeks. Being basically housebound because of this blasted virus it is a great help to have some 'mind fodder'. Thank you.
Exactly, you wrote in your original comment that "The solution as written returns the value 20". But that's not true. The quiz didn't ask to return a value and it doesn't return a value. It always talks about a pointer. The pointer points to an element with the value 20.
Hi Nascar and Alex, I have a problem with the cout. I can't extract the address for each element of the array.
I tried with the printf and as you can see it works, where am I doing wrong?
Output:
name address [0]: 002DFC84 , M
name address [0]: Mollie , M
name address [1]: 002DFC85 , o
name address [1]: ollie , o
name address [2]: 002DFC86 , l
name address [2]: llie , l
name address [3]: 002DFC87 , l
name address [3]: lie , l
name address [4]: 002DFC88 , i
name address [4]: ie , i
name address [5]: 002DFC89 , e
name address [5]: e , e
P.S. I checked arrayLength and it's 7 , because it's Mollie + \0, shouldn't it be arrayLength -1 in the for?
from 0 to 5 it checked all the letters, to 6 it checks \0, is this step necessary?
I tried with arrayLength-1 and it works so I'm a little confused in this part.
Thank you in advance
You don't need to print the zero-terminator. You can ignore it or use it as your loop's condition.
You're printing
The type of `name` is `char[7]`, which decays to `char*` (A string) when it is passed to `std::cout`.
The type of `name[i]` is `char`.
The type of `&name[i]` is `char*` (A string).
If you don't want `std::cout` to print a string, cast `&name[i]` to a `void*` first. We cover `void*` later in this chapter.
My answer for question 2.
what's the problem in this code
- You're `using namespace`
- You're not using list initialization
- Your variables aren't named descriptively
- You're using postfix++ instead of ++prefix
- You're using `std::endl` when you don't need it
- Your warning level is too low or you're ignoring compiler warnings (Lesson 0.10, 0.11)
- Your code is inconsistently formatted. Use an auto-formatter to save time
Hi, I did not understand this statament well could you illustrate it please
When calculating the result of a pointer arithmetic expression, the compiler always multiplies the integer operand by the size of the object being pointed to. This is called scaling.
Basically, if you have a 32-bit integer (4 bytes), when you advance an int pointer by 2, it will multiply 2 by 4 to get 8. That 8 is the number of memory addresses it needs to move forward.
is
This makes more sense than just adding 2 to the address, because you'd end up somewhere inside the object and you'd read junk. The automatic multiplication by the object's size makes it so that you get to the next object instead.
Here is my solution for Problem #2 in the quiz.
Hi, good job!
When you know how often your loop is going to run, a `for`-loop is usually the better choice.
Hello nascardriver, I am trying to make sure I fully understand how to go about Quiz #2 in a very basic way, but I ran into a problem which quite frankly should not be happening. Am I missing something here?
Unfortunately, this crashes the program and results in the following error:
Process returned -1073741819 (0xC0000005)
Going over the lesson, I do not understand what I could have done wrong here. I hope you can help me :)
I now see and have resolved my own error. As I was accessing elements in array "arr[8]", I did not take into account the issue caused by accessing and copying out-of-bounds addresses, such as "arr + arrLength", and that it will yield undefined behavior, causing a segmentation fault error... sometimes. Initially I had thought to assign pointer "found" the address that immediately follows the last address in the array, and to compare as to whether "found" held this address later on. But I realized that this was completely unnecessary, and that in line 24 I should instead have used the boolean comparison functionality of null pointers, rather than arbitrarily choosing to set "found" to an out-of-bounds address, and test for that address....
My fixed code, with no SIGSEGV Segmentation Fault error:
hi!
Setting `found` to the end (ie. `arr + arrLength`) is not wrong. That's what the standard library does too to indicate that whatever it searched for was not found. The issue in your code was the wrong comparison in line 15, which you already fixed.
Judging by your code, you know C or you started learning C++ elsewhere. You're writing C++98, doing so will give you a harder time later on. Here are some topics you should catch up on:
- Brace initialization
- `std::size`
- `nullptr`
Do we have any reasons to stick asterisk to int instead of begin and end in Solution to Quiz #2?
Shouldn't it follow your previous advice "When declaring a pointer variable, put the asterisk next to the variable name"?
Thank you!
Hello, Nascardriver driver I don't really understand how std::end works cause on my machine when I tried to use this code end is given some nonsense value why is that and what does std::end do shouldn't it just take the last element of the array-like std::begin dose with the first element or is something in my code messing it up.
`std::end` returns a pointer to the element that would be at arr[length(arr)], but it's not a real element, accessing it causes undefined behavior.
Your `while`-loop is wrong. If `begin == end` and you increase `begin`, you'll never terminate. It should be
Note also the change of order. If `begin == end` and you access `*begin`, you're invoking undefined behavior.
You don't need the `if`-statement after that, it doesn't do anything.
thank you so much nascardriver I did not understand what std::end did thank you very much for clearing it up.
For quiz 2 i did this.
I saw your solution. Why are you initializing begin to pointer p? we already know begin is a pointer right?can't we use begin as a address directly? and can i use begin < end , since we know end is " arr + std::size(arr) " and compare 2 addresses to loop through them?
> Why are you initializing begin to pointer p?
Modifying arguments makes code harder to understand. Once you modify `begin`, its name is no longer correct, it's not pointing to the beginning of the data.
> can i use begin < end Yes, there's nothing wrong with it. Using != is more portable, you'll learn more about it in the lesson about iterators.
Oh ok, thank you. Now i get it. Since we were using variable values in the called function in previous chapters without initializing them to other variable , i thought i could apply it here. So for pointers the best practice is not modify the parameter and initialize them to another pointer and modify that ?
And i have another common doubt for quite sometime.I forgot to ask you in previous chapters. Why do u use "return end; " outside of loop? Can't you put an else and use like this,
Modifying parameters is never good if it changes the meaning of the variable.
Your suggested update doesn't work. The function always returns in the first iteration.
>The function always returns in the first iteration<
Oh yes ! I couldn't figure out this issue small issue till now :( Thank you @nascardriver.
Here's the code to my solution...
Pretty much exactly the same as the solution, good job!
std::count_if(name, name + nameLength, isVowel)
What type of functions are compatible with std::count_if? Something similar to bool isVowel(char ), I am guessing that it has to accept a type to be counted and returns a bool and count_if counts the bool returned? .It is also odd and first encounter of a function being passed as argument without ().
The function has to return a `bool` and take an argument of the array element type. We have an array of `char`s, so the function has to accept a char. It could also accept types that can be created from a char, eg. an `int`.
If we had an array of `std::string`s, the function would have to accept a `std::string`.
Algorithms will get more prominent in the tutorials. There will be a short introduction about them once it's clear in which lesson they're first used. It's not too difficult to understand how to use algorithms without knowing how they work, so we're adding them to the tutorials before we explain how they work.
It would be great to have an exercise here.
Thank you for your feedback!
I added two questions to the lesson to help future readers understand pointer arithmetic.
in the example where it sorts through mollie to find how many vowels there are. the for loop has this condition in it ptr < (name + arrayLength) how does it add name and int? I tried printing "name + arrayLength" and i get random garbage. does name and array length convert to pointers when they are being compared to ptr?
Hi Elvis!
Your question is the main topic of this lesson, I suggest you re-read it.
`name + arrayLength` returns a pointer that points `arrayLength` elements after `name`. Since the elements are `char`s (1 byte each), that is also `arrayLength` bytes.
Had a bit of a brain fart.
Thank you nascar driver!
Code is missing the <algorithm> library. And why are we using std::size_t, because when I try to run it I get an error and if I change it to an int, it works. Only if I use the static_cast<std::size_t>( ) it works, but why do the extra work? As I understand size_t is an unsigned int which you told to avoid if you can.
> Code is missing the library
Added.
> why are we using std::size_t
I thought `std::count_if` returned an `std::size_t`. It returns a `std::ptrdiff_t` in this case. Code updated to use `auto`.
> As I understand size_t is an unsigned int which you told to avoid if you can.
It's an unsigned integer, but not necessarily an unsigned int. If you don't modify an unsigned integer and don't use it for arithmetic, there's nothing that can go wrong. If `std::count_if` returned a `std::size_t`, we would've needed an extra cast when all we want to do is print the result.
Thanks!
It turns out that when the compiler sees the subscript operator ([]), it actually translates that into a pointer addition and dereference! Generalizing, array[n] is the same as *(array + n), where n is an integer. The subscript operator [] is there both to look nice and for ease of use (so you don’t have to remember the parenthesis).
Isn't [] used to save a changed value which *() can't do?
Indirection (*ptr) returns a reference (covered later). References can be used to modify the value.
I'm trying to loop through and perform a regex check on each character, rather than switch-10 different cases:
However, I keep getting the following compiler error message:
no instance of overloaded function "std::regex_match" matches the argument list
Any suggestions?
Regex shouldn't be used for this. It has a huge overhead compared to manually solving the task.
You'll get the best results from using a `switch`. I understand you don't want to do that.
You can use an `std::set` and `std::count_if` instead.
If you really wanted to use regex, which you shouldn't, you could use an `std::sregex_iterator` and `std::distance`. `std::sregex_iterator` iterates over all found matches.
Here the char array name does not decay or std::cout treats it in a special way.
And maybe for the same reason, the following code gives me strange results:
std::cout treats it in a special way