Search

6.17 — Introduction to iterators

Iterating through an array (or other structure) of data is quite a common thing to do in programming. And so far, we’ve covered many different ways to do so: with loops and an index (for-loops and while loops), with pointers and pointer arithmetic, and with range-based for-loops:

Looping using indexes is more typing than needed if we only use the index to access elements. It also only works if the container (e.g. the array) provides direct access to elements (which arrays do, but some other types of containers, such as lists, do not).

Looping with pointers and pointer arithmetic is verbose, and can be confusing to readers who don’t know the rules of pointer arithmetic. Pointer arithmetic also only works if elements are consecutive in memory (which is true for arrays, but not true for other types of containers, such as lists, trees, and maps).

For advanced readers

Pointers (without pointer arithmetic) can also be used to iterate through some non-sequential structures. In a linked list, each element is connected to the prior element by a pointer. We can iterate through the list by following the chain of pointers.

Range-based for-loops are a little more interesting, as the mechanism for iterating through our container is hidden -- and yet, they still work for all kinds of different structures (arrays, lists, trees, maps, etc…). How do these work? They use iterators.

Iterators

An iterator is an object designed to traverse through a container (e.g. the values in an array, or the characters in a string), providing access to each element along the way.

A container may provide different kinds of iterators. For example, an array container might offer a forwards iterator that walks through the array in forward order, and a reverse iterator that walks through the array in reverse order.

Once the appropriate type of iterator is created, the programmer can then use the interface provided by the iterator to traverse and access elements without having to worry about what kind of traversal is being done or how the data is being stored in the container. And because C++ iterators typically use the same interface for traversal (operator++ to move to the next element) and access (operator* to access the current element), we can iterate through a wide variety of different container types using a consistent method.

Pointers as an iterator

The simplest kind of iterator is a pointer, which (using pointer arithmetic) works for data stored sequentially in memory. Let’s revisit a simple array traversal using a pointer and pointer arithmetic:

Output:

0 1 2 3 4 5 6

In the above, we defined two variables: begin (which points to the beginning of our container), and end (which marks an end point). For arrays, the end marker is typically the place in memory where the last element would be if the container contained one more element.

The pointer then iterates between begin and end, and the current element can be accessed by dereferencing the pointer.

Warning

You might be tempted to calculate the end marker using the address-of operator and array syntax like so:

But this causes undefined behavior, because array[std::size(array)] accesses an element that is off the end of the array.

Instead, use:

Standard library iterators

Iterating is such a common operation that all standard library containers offer direct support for iteration. Instead of calculating our own begin and end points, we can simply ask the container for the begin and end points via functions conveniently named begin() and end():

This prints:

1 2 3

The iterator header also contains two generic functions (std::begin and std::end) that can be used:

This also prints:

1 2 3

Don’t worry about the types of the iterators for now, we’ll re-visit iterators in a later chapter. The important thing is that the iterator takes care of the details of iterating through the container. All we need are four things: the begin point, the end point, operator++ to move the iterator to the next element (or the end), and operator* to get the value of the current element.

Back to range-based for loops

All types that have begin and end member functions or can be used with std::begin and std::end are usable in range-based for-loops.

Behind the scenes, the range-based for-loop calls begin() and end() of the type to iterate over. std::array has begin and end member functions, so we can use it in a range-based loop. C-style fixed arrays can be used with std::begin and std::end functions, so we can loop through them with a range-based loop as well. Dynamic arrays don’t work though, because there is no std::end function for them (because the type information doesn’t contain the array’s length).

You’ll learn how to add functions to your types later, so that they can be used with range-based for-loops too.

Range-based for-loops aren’t the only thing that makes use of iterators. They’re also used in std::sort and other algorithms. Now that you know what they are, you’ll notice they’re used quite a bit in the standard library.

Iterator invalidation (dangling iterators)

Much like pointers and references, iterators can be left “dangling” if the elements being iterated over change address or are destroyed. When this happens, we say the iterator has been invalidated. Accessing an invalidated iterator produces undefined behavior.

Some operations that modify containers (such as adding an element to a std::vector) can have the side effect of causing the elements in the container to change addresses. When this happens, existing iterators to those elements will be invalidated. Good C++ reference documentation should note which container operations may or will invalidate iterators. As an example, see the “Iterator invalidation” section of std::vector on cppreference.

Here’s an example of this:


6.18 -- Introduction to standard library algorithms
Index
6.16 -- An introduction to std::vector

(h/t to nascardriver for significant contributions to this lesson)

65 comments to 6.17 — Introduction to iterators

  • ammawn

    The first code example uses

    but does not include

    (which we've seen before). Or any other header where it is defined.

    Defined in header

            
    Defined in header

            
    Defined in header

            
    Defined in header

            
    Defined in header

            
    Defined in header

            
    (since C++17)
    Defined in header

    • nascardriver

      `std::size_t` also comes from <array> -> <initializer_list>, that's why the code was working. Anyway, <cstddef> should be included, I updated the lesson. Thanks!

  • Question

    >>Range-based for-loops aren’t the only thing that makes use of iterators. THEY ARE also used in std::sort and other algorithms.

    What does 'they' refer to? 'Ranged-based for-loops'? or 'iterators'?

    >> First, when we resized the vector, the existing element values were preserved!

    Does't it mean that they could be in the same memory address, does it?

    • nascardriver

      iterators

      • Question

        Thank you do much.  

        >>First, when we resized the vector, the existing element values were preserved!

        does it mean that the elements could have the same memory address after resizing? In that case, Iterator invalidation  won't happen. right?

        • nascardriver

          Where's that quote from? I'm missing the context to answer your question.

          • Question

            It is quoted from the previous section, std::vector.

            I asked this question because in this chapter you said, "Some operations that modify containers (such as adding an element to a std::vector) can have the side effect of causing the elements in the container to change addresses."

            Now, I am asking if one element is added to the vector as long as the address of the previous elements weren't changed, does still we have "iterator invalidated"?

            • nascardriver

              If the addresses didn't change, that means the vector didn't have to perform a reallocation. If no reallocation happened and you're growing the vector, all iterators (except the end iterator) are still valid. cppreference lists the reasons for iterator invalidation.

  • saMIra

    >>You might be tempted to calculate the end marker using the address-of operator and array syntax like so:

    Can we use this?

    Also, I realized that unlike fixed-size arrays, in std::array, the name of the array doesn't have the address of the first element of the array, why is that in std::array?

    • nascardriver

      > Can we use this?
      No, there are containers (Pretty much everything except built-in arrays) which can have a size of 0.

      `std::array` is a struct.

  • J

    Awesome section on iterators. Iterators confused me for a long time before I figured them out wish this lesson was here a few years ago. By the way really enjoying the new updates I used this site exclusively in the past to learn not only C++ but programing in general thanks Alex and Nascardriver for the lesson and updates. PS section on C++ reference is so true why I appreciate your site so much.

  • Alek

    hey,why can't I omit  the type while I wanna define this vector:

    any vector more than 3 objects I cannot omit it why is that ?

    • nascardriver

      The type can be omitted since C++17. Make sure you set up your compiler properly (Lesson 0.12).

      • Alek

        the point is that my language standard is exactly c++17, yet it doesn't work kind of strange....

        • nascardriver

          See this table -> C++17 core language features -> Class template argument deduction
          It shows the minimum required version of the major compilers. If you compiler is older than the one listed, you should update your compiler.
          If you are using at least the compiler listed, please post your full code and error message.

          • Alek

            How come there is no visual studio in the table ? I'm using VS 2017,
            the error is :no instance of constructor std::vector matches the argument list.
            this is my code :

            • nascardriver

              Visual Studio's compiler is msvc. I don't know how to figure out which version of msvc your VS installation is using, you'll have to look that up. Telling from the versions alone, I'd guess msvc 19 ships with VS 2019, not with VS 2017.

  • Mohammad

    Hello, NascarDriver. I write my codes,using std::size but there is a compile error. Would you please help me in this case? I instead used sizeof operator but that resulted in undefined behavior.Would you kindly write a program on using std::size and help me out? Actually, I wrote the following Program and tried to run:

    But my another question was What is the point of writing name+length as condition for ptr in the for loop?
    Sorry to disturb you, but advanced thanks.

    • nascardriver

      Your code works (But is missing an include to <iterator>). Without an error message, I can't help you.

      See lesson P.6.8a for an explanation for `name + length`.

  • Thomas Kennings

    "The simplest kind of iterator is a pointer, which (using pointer arithmetic) works for data store sequentially in memory."

    Store should be stored! Thanks for all the awesome lessons!

  • Yuri Khramov

    #include <array>
    #include <iostream>

    int main()
    {
        std::array data{ 0, 1, 2, 3, 4, 5, 6 };

        auto begin{ &data[0] };
        // note that this points to one spot beyond the last element
        auto end{ begin + std::size(data) };

        // for-loop with pointer
        for (auto ptr{ begin }; ptr != end; ++ptr) // ++ to move to next element
        {
            std::cout << *ptr << ' '; // dereference to get value of current element
        }
        std::cout << '\n';

        return 0;
    }
    Error:(6)argument list for class template is missing

  • mesut

    Is there a difference between lines A and B?

  • Jimmy Hunter

    Typo = side of effect
    Correction = side effect

  • ruchika malhotra

    Hi , I have one doubt
    You might be tempted to calculate the end marker using the address-of operator and array syntax like so:

    But this causes undefined behavior, because array[std::size(array)] accesses an element that is off the end of the array.
    Instead, use:

    when we print the values both of them prints 0 i.e. element that is off the end of the array.

    • nascardriver

      Accessing `end` causes undefined behavior. It's not a real element.

      The problem with the first version is that you're already accessing the element in the calculation of `end`

      The second version doesn't access the element, it only calculates where it would be.

  • marius

    "All types that have begin and end member functions or can be used with std::begin and std::end are usable in range-based for-loops.
    Behind the scenes, the range-based for-loop calls begin() and end() of the type to iterate over. std::array has begin and end member functions, so we can use it in a range-based loop."

    Like that?

    • nascardriver

      Yes, but you're invoking undefined behavior by using pointer arithmetic on your class members. This is not allowed:

      If you update `Arr` to use an array internally, everything is fine. That would make the implementation trivial:

      If you want to iterate over distinct members, you'll have to create a custom iterator. We can use a vector and store pointers to the elements we want to iterator over. This requires features we haven't covered up to this point, but I'll leave it here in case you want to come back after chapter 9.

  • salah

    Dynamic arrays don’t work though, because there is no std::end function for them (because the type information doesn’t contain the array’s length.

    Could you illustrate this point more please .

    • nascardriver

      When you use a dynamically allocated array, in almost every situation it will decay to a pointer. A pointer doesn't know the length of the array it's pointing to, it doesn't even know that it's pointing to an array. Just like you can't use `sizeof` on a dynamically allocated array, you can't use `std::size`.

  • Oscar

    Hi, I have some questions:

    1. I copied the following code from the second program (line 8) in the tutorial:

    


    The tutorial says the simplest kind of iterator is a pointer. I wonder why we didn’t use `auto*`, but `auto` instead? Does `auto` handle `auto*` already? FYI, the quiz solution from 6.9a uses `auto*` when allocating an array.


    2. The tutorial says “For arrays, the end marker is typically the place in memory where the last element would be if the container contained one more element.” 

I thought the end marker is the place in memory where the “element after the last element” would be?


    3. The cppreference link about “Iterator invalidation” is really helpful. But I didn’t seem to find the following operation documented in it:

    

cppreference says `operator=` and `assign`, will always cause iterator invalidation:

    



    But I think these two are different from the operation I’m talking about. Could you help shed some light on whether `myVector[2] = 5;` causes iterator invalidation?

    • nascardriver

      1. `auto` handles `auto*`. Even though `auto` handles `auto*`, I recommend using `auto*` for clarity if the object is a pointer. This lesson uses `auto` rather than `auto*`, because it's more important that `begin` is an iterator than that it's a pointer. `&data[0]` could be replaced with another iterator and the code would still work. This can't be done if we used `auto*`.

      2. The two sentences say the same. "The element after the last element" and "The last element if the array had one more element".

      3. The assignment in this example doesn't affect the vector. The only vector operation in this example is `operator[]`. The vector isn't affected by what you do with the individual elements. Think of it like this

      • Oscar

        Thanks for the detailed reply!

        1. That's interesting. Could you elaborate a bit more why using `auto*` will disallow us to replace `&data[0]` with another iterator? Does it mean:

        2. Oh indeed! Very sorry for my misinterpretation!

        3. Thanks for the detailed explanation! It really helped a lot :D.

        • nascardriver

          1. Not replace as in assign or override, but update our code to use a different iterator type. Not all iterators are pointers. `auto*` only works for pointers. For example, using a linked list, which can't be iterated with a plain pointer:

          We don't care if `begin` is a pointer or not. We only need it to be an iterator. With `auto*`, we wouldn't be able to use just "some iterator".

          I like your questions, they go into details of the language, keep them up :)

          • Oscar

            Thanks for your detailed reply and encouragement!! I just happened to edit my original question. I removed the edited part and pasted it here:

            I think I misunderstood iterators. Like what you've said, cplusplus.com says not all iterators have the same functionality of pointers. Also, I realized iterators have their own type `std::iterator`, which is different from a pointer. So it's okay to use `auto*` if we intend to use pointers as iterators. But if we use `data1.begin()` (which returns std::iterator), we should use `auto`. Am I correct?

            I think if possible, it'd be helpful if some of the `auto` keywords in the examples are replaced with explicit types (especially when they first appeared in the code), just to make everything clear! Or perhaps adding some comments about the variable type.

            • nascardriver

              There is not iterator type per se. The only requirements for an iterator are that you can use `++it` and `*it`. A pointer fulfills these requirements, so we can use a pointer as an iterator. `std::iterator` was deprecated in C++17 and will likely be removed in the future. I can't see this being mentioned on cplusplus, so I'll recommend cppreference, where this is obvious https://en.cppreference.com/w/cpp/iterator/iterator .
              The good thing about iterators is that we don't have to care about their type or implementation. As long as they support `++it` and `*it`, which they must, we can iterate through a container. That's why the lesson uses `auto` and not `auto*`. We don't care that it's a pointer, it's some iterator, and `auto` can hold every iterator, not just pointers. Rather than using `auto*` because it's a pointer, we use `auto` to make the code work with all iterators.
              If you read through documentations, you'll see that the iterators aren't given by type, but categorized as named requirements (eg. "LegacyIterator", "LegacyOutputIterator", ...). Again, this is because the type doesn't matter. It's only important which operations are supported by the iterators. There are different kinds of iterators, some with more features than others (eg. iterating backwards or begin comparable).

              • Oscar

                Wow... thanks for correcting me again and again! The detailed explanation is superb and I really appreciate it!!

                I'll keep these in mind:
                Iterators aren't given by type, and type doesn't matter (this is why we use `auto`). What matters are the operations supported by the iterators. The only requirements for an iterator are that we can use `++it` and `*it`. `std::iterator` is deprecated in C++17 (gotta be careful when reading the docs and refer to cppreference next time!).

  • Omran

    hello!! , i know this questions out of this tutorial's context but Nascardriver are you a real nascar driver , sorry if this question was bothering anyone !! , i'm just curious , :)

    • nascardriver

      Nope, I drive a normal car, also not in circles :)

      • Omran

        hahahahahahahahahahahahaha cuz i was just reading something about centripetal acceleration so they covered some examples about nascar race or something :D , love you bro , you're helping a lot of c++ beginners , you and Alex :)

  • kavin

    Hi, why am i not able to do this?

    what does array.begin() actually give? If it's address cout'ing begin should print address of 1st element right ?

    • nascardriver

      It's an iterator, you can't print it. You can access the element the iterator is referring to via `*begin` or `*array.begin()`.
      We haven't covered user-defined types or operator overloading yet, you'll understand how this works in chapter 9.

  • hellmet

    I think it's important to note that since we cannot use 'const auto&' in the range-based-for loop [not the 'for (int i : array)'; since we need to do ++iter], any modification to the underlying datastructure invalidates the iterator (I hope I got that right, from what I read online).

    • nascardriver

      I don't understand what you mean, can you provide an example of a range-based for-loop in which we can't use `const auto&`?

      Whether or not a modification to the data invalidates iterators depends on the container. In an `std::array`, you can modify all you want without invalidating iterators. `std::vector` on the other hand might perform reallocation, which invalidates iterators.

      • hellmet

        I meant to say, one can't do this.
        Also, mistake on my part! This is not a 'range-based for-loop'.

        Oh right, the invalidation depends on the container! I meant perhaps adding the point here is invaluable information!

        • nascardriver

          I added a note about the `const` iterator variable. Container invalidation doesn't fit this lesson, I'll try to remember it and add it to a lesson about algorithms or containers if it's not there already.

          • hellmet

            I'm glad I could be of help!
            Thanks!

          • Alex

            I do think iterator invalidation is a valid topic to be mentioned here. I added a section to the bottom of the lesson, including an example of invalidation that uses only topics that have been covered so far. Let me know if you think it doesn't fit in for some reason.

            • Matt

              Alex,

              As a matter of fact, I just spent the better part of 3 hours debugging code (not related to these tutorials) only to realize that I was calling std::map<>::erase() inside of a range-based for loop which was causing my program to throw exceptions I'd never seen before.

              Funny thing too, it was a hard one for me to debug because the erase() is only called on a key or mouse button release (the map tracks key/button presses for a game I'm writing) - so triggering that code from within the debugger was difficult (had to put breakpoints all throughout the various portions of the map handling code and run the game again to finally find where it was diving off into the deep end).

              Writing this to let you know that that section definitely does fit, and moreover, I think an example of a proper way to handle the erase should accompany your example of how not to handle the erase.

              Thanks for your amazing site as always,
              Matt

  • Eddie Ball

    Loving the new content! But I'm a little confused about what makes an iterator. In the section "Iterators" you say:

    "An iterator is an object designed to iterate through a container without having to care about the internal structure of the data."

    and then give an example of an iterator that uses pointers. How is is this iterator indifferent to the internal structure of the data? Is it because it uses pointers and the auto data type?

    Thanks!

    • Alex

      That sentence was poorly written. To clarify: iterators themselves are tightly coupled to the data they're iterating over (the iterator has to care about the internal data structure to know how to move through it). However, once an iterator is created, the user of the iterator only has to use the iterator's interface to navigate and access the elements, and is thus abstracted from how the data is actually stored.

      I've amended the lesson wording a bit to try and clarify.

  • siva

    Hi,

    Can you please explain the difference between std::size and array::size ?

    Regards,
    Siva.

  • Jim

    std::size(data) works but why not data.size()?

    • nascardriver

      `std::size` works for C-style arrays too. There are none in this lesson, I think this is a leftover from when the lesson used C-style arrays. `std::size` is fine too, so I'll leave it there.

  • Louis Cloete

    Great explanation! Just one suggestion: won't it make more sense / be clearer to initialize end like this:

    ?

  • swap145

    Thank you for this new section.

Leave a Comment

Put all code inside code tags: [code]your code here[/code]