9.8 — Overloading the subscript operator

When working with arrays, we typically use the subscript operator ([]) to index specific elements of an array:
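A minimal sketch of this, assuming a five-element array (the function name `sumFirstAndLast` and the values are made up for illustration):

```cpp
// Hypothetical helper, used only to illustrate built-in subscripting.
int sumFirstAndLast()
{
    int list[5]{ 10, 20, 30, 40, 50 }; // a plain fixed-size array
    list[0] = 11;                       // write element 0 via operator[]
    return list[0] + list[4];           // read elements 0 and 4: 11 + 50
}
```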

However, consider the following IntList class, which has a member variable that is an array:
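A minimal sketch of such a class, assuming a fixed capacity of 10 elements (the size and the helper `useList` are assumptions; the lesson only requires that the array member be private):

```cpp
// Sketch of IntList: the array is private, so code outside the class
// cannot reach it directly.
class IntList
{
private:
    int m_list[10]{}; // assumed capacity of 10, value-initialized to zeros
};

// Hypothetical helper showing what outside code can and cannot do.
void useList()
{
    IntList list{};
    // list.m_list[2] = 3; // would not compile: m_list is private
    (void)list;            // silences the unused-variable warning
}
```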

Because the m_list member variable is private, we cannot access it directly from the variable list. This means we have no way to directly get or set values in the m_list array. So how do we get or put elements into our list?

Without operator overloading, the typical method would be to create access functions:

While this works, it’s not particularly user-friendly. Consider the following example:

Are we setting element 2 to the value 3, or element 3 to the value 2? Without seeing the definition of setItem(), it’s simply not clear.
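A sketch of the access-function approach and of the ambiguous call in question. The parameter order `setItem(index, value)` is an assumption made for this sketch; nothing at the call site reveals it.

```cpp
class IntList
{
private:
    int m_list[10]{}; // assumed capacity of 10

public:
    // Assumed order: index first, then value.
    void setItem(int index, int value) { m_list[index] = value; }
    int getItem(int index) const { return m_list[index]; }
};

// Hypothetical helper: stores a value and reads it back.
int storeAndRead()
{
    IntList list{};
    list.setItem(2, 3);     // element 2 becomes 3 (given the assumed order)
    return list.getItem(2); // reads element 2 back
}
```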

You could also just return the entire list and use operator[] to access the element:

While this also works, it’s syntactically odd:
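A sketch of what this might look like, assuming a hypothetical `getList()` member that returns a pointer to the first element of the array:

```cpp
class IntList
{
private:
    int m_list[10]{}; // assumed capacity of 10

public:
    // Hands out a pointer to the first element of the private array.
    int* getList() { return m_list; }
};

// Hypothetical helper: writes and reads element 2 through getList().
int storeViaGetList()
{
    IntList list{};
    list.getList()[2] = 3;    // call getList(), then subscript the result
    return list.getList()[2];
}
```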

Overloading operator[]

However, a better solution in this case is to overload the subscript operator ([]) to allow access to the elements of m_list. The subscript operator is one of the operators that must be overloaded as a member function. An overloaded operator[] function will always take one parameter: the subscript that the user places between the hard braces. In our IntList case, we expect the user to pass in an integer index, and we’ll return an integer value back as a result.

Now, whenever we use the subscript operator ([]) on an object of our class, the compiler will return the corresponding element from the m_list member variable! This allows us to both get and set values of m_list directly:
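A minimal sketch of the overload described above, together with the get and set usage it enables (the capacity of 10 and the helper `setThenGet` are assumptions for illustration):

```cpp
class IntList
{
private:
    int m_list[10]{}; // assumed capacity of 10

public:
    // Returns a reference so the caller can both read and assign the element.
    int& operator[](int index) { return m_list[index]; }
};

// Hypothetical helper: sets element 2, then reads it back.
int setThenGet()
{
    IntList list{};
    list[2] = 3;    // set: calls list.operator[](2), then assigns through the reference
    return list[2]; // get: calls list.operator[](2) again
}
```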

This is easy both syntactically and from a comprehension standpoint. When list[2] evaluates, the compiler first checks to see if there’s an overloaded operator[] function. If so, it passes the value inside the hard braces (in this case, 2) as an argument to the function.

Note that although you can provide a default value for the function parameter, using operator[] without a subscript inside is not valid syntax, so there’s no point.

Why operator[] returns a reference

Let’s take a closer look at how list[2] = 3 evaluates. Because the subscript operator has a higher precedence than the assignment operator, list[2] evaluates first. list[2] calls operator[], which we’ve defined to return a reference to list.m_list[2]. Because operator[] is returning a reference, it returns the actual list.m_list[2] array element. Our partially evaluated expression becomes list.m_list[2] = 3, which is a straightforward integer assignment.

In the lesson "A first look at variables", you learned that any value on the left-hand side of an assignment statement must be an l-value (which is a variable that has an actual memory address). Because the result of operator[] can be used on the left-hand side of an assignment (e.g. list[2] = 3), the return value of operator[] must be an l-value. As it turns out, a call to a function that returns an l-value reference (such as int&) is always an l-value, because such a reference can only be bound to an object that has a memory address. So by returning a reference, we satisfy the compiler that we are returning an l-value.

Consider what would happen if operator[] returned an integer by value instead of by reference. list[2] would call operator[], which would return the value of list.m_list[2]. For example, if m_list[2] had the value of 6, operator[] would return the value 6. list[2] = 3 would partially evaluate to 6 = 3, which makes no sense! If you try to do this, the C++ compiler will complain:

C:\VCProjects\Test.cpp(386) : error C2106: '=' : left operand must be l-value

Dealing with const objects

In the above IntList example, operator[] is non-const, and we can use it as an l-value to change the state of non-const objects. However, what if our IntList object was const? In this case, we wouldn’t be able to call the non-const version of operator[] because that would allow us to potentially change the state of a const object.

The good news is that we can define a non-const and a const version of operator[] separately. The non-const version will be used with non-const objects, and the const version with const objects.
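A sketch of the two overloads side by side. The array size, the sample values, and the helper `readFromConst` are assumptions chosen for illustration; the offending assignment to the const object is left commented out so the sketch compiles.

```cpp
class IntList
{
private:
    int m_list[5]{ 0, 1, 2, 3, 4 }; // sample values assumed for this sketch

public:
    int& operator[](int index) { return m_list[index]; }             // used for non-const objects
    const int& operator[](int index) const { return m_list[index]; } // used for const objects
};

// Hypothetical helper: writes to a non-const list, reads from a const one.
int readFromConst()
{
    IntList list{};
    list[2] = 3;               // okay: calls the non-const overload

    const IntList clist{};
    // clist[2] = 3;           // compile error: the const overload returns a const int&
    return list[2] + clist[2]; // 3 + 2
}
```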

If we comment out the line clist[2] = 3, the above program compiles and executes as expected.

Error checking

One other advantage of overloading the subscript operator is that we can make it safer than accessing arrays directly. Normally, when accessing arrays, the subscript operator does not check whether the index is valid. For example, the compiler will not complain about the following code:

However, if we know the size of our array, we can make our overloaded subscript operator check to ensure the index is within bounds:
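A minimal sketch of a bounds-checked overload, assuming the array holds 10 elements (the helper `storeAtLastIndex` is hypothetical):

```cpp
#include <cassert> // for assert()

class IntList
{
private:
    int m_list[10]{}; // assumed capacity of 10

public:
    int& operator[](int index)
    {
        // Valid indices are 0 through 9, hence "< 10" rather than "<= 10".
        // If the condition is false, the program stops with a diagnostic
        // (unless NDEBUG is defined, which disables assert).
        assert(index >= 0 && index < 10);
        return m_list[index];
    }
};

// Hypothetical helper: a valid index passes the check.
int storeAtLastIndex()
{
    IntList list{};
    list[9] = 7;     // last valid element
    // list[10] = 7; // would trip the assert instead of corrupting memory
    return list[9];
}
```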

In the above example, we have used the assert() macro (defined in the cassert header) to make sure our index is valid. If the expression inside the assert evaluates to false (which means the user passed in an invalid index), the program will terminate with an error message, which is much better than the alternative (corrupting memory). Note that assert() is compiled out when NDEBUG is defined, as is typical for release builds. This is probably the most common method of doing error checking of this sort.

Pointers to objects and overloaded operator[] don’t mix

If you try to call operator[] on a pointer to an object, C++ will assume you’re trying to index an array of objects of that type.

Consider the following example:

Because we can’t assign an integer to an IntList, this won’t compile. However, if assigning an integer were valid, this would compile and run, with undefined results.

Rule: Make sure you’re not trying to call an overloaded operator[] on a pointer to an object.

The proper syntax would be to dereference the pointer first (making sure to use parentheses, since operator[] has higher precedence than operator*), then call operator[]:
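A sketch of the difference, assuming a dynamically allocated IntList (the helper `storeThroughPointer` is hypothetical):

```cpp
class IntList
{
private:
    int m_list[10]{}; // assumed capacity of 10

public:
    int& operator[](int index) { return m_list[index]; }
};

// Hypothetical helper: uses the overloaded operator[] through a pointer.
int storeThroughPointer()
{
    IntList* list{ new IntList{} };

    // list[2] = 3;  // wrong: indexes the pointer as if it pointed to an array of IntList
    (*list)[2] = 3;  // right: dereference first, then call IntList::operator[]

    int result{ (*list)[2] };
    delete list;
    return result;
}
```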

This is ugly and error-prone. Better yet, don’t set pointers to your objects if you don’t have to.

The function parameter does not need to be an integer

As mentioned above, C++ passes what the user types between the hard braces as an argument to the overloaded function. In most cases, this will be an integer value. However, this is not required -- and in fact, you can define that your overloaded operator[] take a value of any type you desire. You could define your overloaded operator[] to take a double, a std::string, or whatever else you like.

As a ridiculous example, just so you can see that it works:
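A sketch along these lines, assuming a class named `Stupid` whose operator[] simply prints its std::string subscript. Returning the string's length is an addition made for this sketch so that the call also yields a value; the helper `sayHello` is likewise hypothetical.

```cpp
#include <cstddef>  // for std::size_t
#include <iostream>
#include <string>

class Stupid
{
public:
    // Prints whatever string is used as the subscript and returns its length.
    std::size_t operator[](const std::string& index) const
    {
        std::cout << index;
        return index.length();
    }
};

// Hypothetical helper: subscripts a Stupid object with a string.
std::size_t sayHello()
{
    Stupid stupid{};
    return stupid["Hello, world!"]; // the string literal converts to std::string
}
```

Calling `sayHello()` from `main()` runs the example.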

As you would expect, this prints:

Hello, world!

Overloading operator[] to take a std::string parameter can be useful when writing certain kinds of classes, such as those that use words as indices.

Conclusion

The subscript operator is typically overloaded to provide direct access to individual elements from an array (or other similar structure) contained within a class. Because strings are often implemented as arrays of characters, operator[] is often implemented in string classes to allow the user to access a single character of the string.

Quiz time

1) A map is a class that stores elements as key-value pairs. The key must be unique, and is used to access the associated value. In this quiz, we’re going to write an application that lets us assign grades to students by name, using a simple map class. The student’s name will be the key, and the grade (as a char) will be the value.

1a) First, write a struct named StudentGrade that contains the student’s name (as a std::string) and grade (as a char).

1b) Add a class named GradeMap that contains a std::vector of StudentGrade named m_map. Add a default constructor that does nothing.

1c) Write an overloaded operator[] for this class. This function should take a std::string parameter, and return a reference to a char. In the body of the function, first iterate through the vector to see if the student’s name already exists (you can use a for-each loop for this). If the student exists, return a reference to the grade and you’re done. Otherwise, use the std::vector::push_back() function to add a StudentGrade for this new student. When you do this, std::vector will add a copy of your StudentGrade to itself (resizing if needed). Finally, we need to return a reference to the grade for the student we just added to the std::vector. We can access the student we just added using the std::vector::back() function.

The following program should run:
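One possible sketch that follows the steps in 1a through 1c, together with a small driver of the kind described. The member names `name` and `grade`, the blank placeholder grade, and the helper `printGrades` are assumptions; this is one way to satisfy the requirements, not necessarily the reference solution.

```cpp
#include <iostream>
#include <string>
#include <vector>

struct StudentGrade
{
    std::string name{};
    char grade{};
};

class GradeMap
{
private:
    std::vector<StudentGrade> m_map{};

public:
    GradeMap() = default; // default constructor that does nothing extra

    char& operator[](const std::string& name)
    {
        // If the student already exists, return a reference to their grade.
        for (auto& student : m_map)
        {
            if (student.name == name)
                return student.grade;
        }

        // Otherwise append a new student with a placeholder grade...
        m_map.push_back(StudentGrade{ name, ' ' });

        // ...and return a reference to the grade of the element just added.
        return m_map.back().grade;
    }
};

// Hypothetical driver: assigns two grades, then prints them.
void printGrades()
{
    GradeMap grades{};
    grades["Joe"] = 'A';
    grades["Frank"] = 'B';
    std::cout << "Joe has a grade of " << grades["Joe"] << '\n';
    std::cout << "Frank has a grade of " << grades["Frank"] << '\n';
}
```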

2) Extra credit #1: The GradeMap class and sample program we wrote is inefficient for many reasons. Describe one way that the GradeMap class could be improved.

3) Extra credit #2: Why doesn’t this program work as expected?

150 comments to 9.8 — Overloading the subscript operator

  • Gaurav Arya

    What would be the equivalent of the below shown loop, with iterators?

  • Andi

    "The subscript operator is one of the operators that must be overloaded as a member function. An overloaded operator[] function will always take one parameter: the subscript that the user places between the hard braces."
    The second sentence implies that operator[] is a unary operator which is not true. It is a binary operator taking the two parameters: reference to object and value.

    Because *this is always implicitly passed (and added as parameter) to member functions, this whole thing works.
    In general, I find the insights into how the compiler converts my code extremely helpful and very important to understand the functionality of OOP. I believe adding the information of why the subscript operator is overloaded as a member function and that it is a binary operator would be beneficial for this chapter.

    Andi

    • Hi Andi!

      As long as you're not working in assembly, the this-pointer is not counted as a parameter. The subscript operator takes one parameter. "operator[] shall be a non-static member function with exactly one parameter." N4762 § 11.5.5.
      I like your interest in low-level concepts, but including it in this lesson would be more confusing than helpful imo.

      • Andi

        Hi nascardriver,
        Thank you for your reply! I understand your point that you want to keep it simple.
        I agree that assembly would go a bit too far. I was not aware of N4762 § 11.5.5 and I thought this overload would be analogous to what we did with operator+ in Chapter 9.4. Nevertheless, I am glad to be aware of the *this pointer and try to understand as much as possible of how it works. 🙂

  • 6ix9ine

    The following is about quiz time 1c).
    Why isn't it working with a const auto in the for each loop?
    My code:

    I thought when we're not changing m_map, we can make auto &student const

    • Hi!

      @student.studentGrade is not allowed to be modified, because @student is const. @student.studentGrade is returned by reference, but not by const reference, which means that the caller of @operator[] could modify @student.studentGrade, but that's illegal.
      Return by const reference or by value.

  • Vamshi malreddy

    what's wrong with this for overloading operator[] in Q 1.C of the quiz ??

    char& operator[](string str)
        {
            for(StudentGrade &student: m_map)
            {
                if(student.m_name == str )
                    return student.m_grade;
            }

            m_map.push_back(StudentGrade(str));
            return (*this)[str];
        }

    And also in its solution shouldn't the GradeMap be the friend of StudentGrade for accessing private member grade while returning in the for each loop as ref.grade?

    Thank you!

    • Hi Vamshi!

      > what's wrong with this
      * You're using "using namespace", this can cause name collisions
      * You're passing a string by value, this is slow
      * You're using recursion, this is slow
      * You're dereferencing @this rather than using operator->

      > shouldn't the GradeMap be the friend of StudentGrade for accessing private member grade
      @StudentGrade is a struct, all members are public by default. It looks like you're using a class, if that's the case it either needs to be friend or you need a getName function.

  • In the section Pointers to Objects and Overloaded[] don't mix, in the code

    Correct me if I'm wrong, but we have a dynamically allocated IntList with a pointer to it called list. Then below that, we attempt to update index 2, but you have to dereference the pointer in order to update its value. So why does the error suggest we are indexing an array of IntLists, rather than that we failed to dereference the pointer? Does dereferencing the pointer yield an IntList object and thus allow us to update it, fixing both errors?

    • Alex

      list is a pointer to a single IntList object. When we do list[2], we're saying, "get the object at index 2 (the third object) from the array this pointer points to", which doesn't exist.

      However, (*list)[2] will work as expected, as *list returns the object that list is pointing to (the single IntList object), and the [2] will invoke the overloaded operator[] to return the proper array element.

  • Olivier

    Hello, I think that a super small optimization could be used in the quiz. Instead of creating a temporary variable like this:

    We could use an anonymous variable:

    • nascardriver

      Hi Olivier!

      You could take it a step further by using @std::vector::emplace_back instead of @std::vector::push_back. @std::vector::push_back creates a copy of the element whereas @std::vector::emplace_back constructs the object inside the vector. This might be easier to understand after Chapter 15.

      References
      * std::vector::emplace_back - http://en.cppreference.com/w/cpp/container/vector/emplace_back

  • I saw that one user here, like me, wasn't clear at first why this is undefined behavior (since that main seems to print out ok things!). Nevertheless, this main makes it clearer why it's not working as expected, I think; feel free to improve it further though:

    Extra question:
    Is there a way to prevent this? nascardriver pointed out that you can fix this by obviously not doing this, so not returning a reference to an external reference. That makes sense, since std::vector has its underlying implementation that is totally out of our control, so we can't do anything about it.
    But even if I created a class that wrapped a dynamically allocated vector (I'd create my own std::vector; internally reallocating like new vector[size * 2] to double the size, copy over all elements to new array and delete[] old_vector) I'd still not be able to return a reference to the elements, since I can't guarantee for how long someone can hold on to the reference. So here you shouldn't return a reference either, since that could go dangling too, even if it's not an external reference anymore, I'm reallocating the memory.

    • nascardriver

      Hi Andrei!

      Having references/pointers to possibly temporary objects is always problematic and I don't think there's a nice way of preventing problems. If you know that you're never going to store more than, let's say, 16 students, you could use an std::array and won't have to worry about dangling references/pointers, because an std::array doesn't resize.
      Another solution is not storing the students themselves, but pointers to dynamically allocated students and returning the pointers instead of references to the pointers (Or do the same with std::reference_wrapper), that should be faster anyway.

  • nascardriver

    Hi Alex!

    Quiz 3
    I'd change

    to

    because it's the operator[] doing the push_back, not the assignment to the char.
    (Same with Frank)

  • Saumitra Kulkarni

    So to fix the dangling reference issue in Q.3) Extra credit #2: we can simply modify (char&) type to just (char) type.

    Am I correct ?

  • nascardriver

    Hi Alex!

    Solution to 1c is missing

    PS: I think it'd be better if the "Leave a Comment" div was above the comments

    • Alex

      Thanks, example fixed.

      PS: I don't necessarily disagree, but there are a few good reasons to leave it as it is, including having more users maybe read the existing comments before asking a question that's already been answered, and not modifying a template that will be overwritten the next time a template update is available...

  • Max

    Hello again, Alex. I have a new question about this code:

    I tried to use the keyword const to restrict ref, making it read-only. I thought I only needed to know whether the name is in m_map, but the compiler tells me it's wrong: error: binding 'const char' to reference of type 'char&' discards qualifiers.
    But why?

    • nascardriver

      Hi Max!

      If @StudentGrade::grade is const you need to change every usage of it to be const too.

  • Pointer

    nascardriver
    I'm not sure you got the point of my original post. It is not about when we should return a const reference or a reference. It is about what is wrong with the last example in this tutorial. We cannot solve the problem by changing the operator overloading function to const. We can solve it by not doing what was done in that program - returning a reference to an external reference - because this is what caused the dangling reference after the vector was resized.

    Also here is an example where an external reference can receive a whole class object as a reference (non-standard user-defined object) and make changes to private data:

    Output:5

    • Pointer

      m_x should be private

  • Pointer

    Hi Alex,
    Is it correct to say that "class encapsulation can be by-passed by assignment of returned reference from member function to pointer or reference"?

    It took me some time to figure out what exactly is happening in the last example.
    References can be initialized in two ways:
    [code]
    int x = 0;
    int& ref1 = x; //the first way-reference initialized to variable x
    int& ref2 = ref1; //the second way-reference initialized to another reference
    [code]

    When I applied this to classes I had interesting results.

    The first way:

    [code]
    #include <iostream>

    class Test
    {
    private:

        int m_x;

    public:

        Test (int x = 0) : m_x {x} {}

        int getX()
        {
            return m_x;
        }
    };

    int main()
    {
        Test a;
        int &ref = a.m_x; //error: int Test::m_x is private
        std::cout << a.getX() << "\n";
    }
    [code]

    And now the second way:

    [code]
    #include <iostream>

    class Test
    {
    private:

        int m_x;

    public:

        Test (int x = 0) : m_x {x} {}

        int& getX()  //important difference - getX() now returns by reference
        {
            return m_x;
        }
    };
    int main()
    {
        Test a;              //m_x is initialized to 0 in the constructor
        int &ref = a.getX(); //getX() returns reference and int &ref is initialized with it

    //int &ref is local variable and not part of the class but now has access to the class private member m_x
    //because it "received" it as a returned reference from member function getX()

        ref = 5;             //ref accessed and modified private member m_x to have value 5
        std::cout << a.getX() << "\n";
    }
    [code]

    Output:
    5

    • nascardriver

      Hi Pointer!

      "Is it correct to say that " class encapsulation can be by-passed by assignmet of returned reference from member function to pointer or reference"."
      Yes, this is the reason why, when returning a reference to a private variable, you should usually return a const reference instead.

      So instead of

      You'd write

      This way you won't be able to assign a value to the reference returned by @getX.
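A sketch of the const-reference version of the getter from the example above (the helper `readOnly` and the initial value 5 are made up for illustration):

```cpp
class Test
{
private:
    int m_x{};

public:
    Test(int x = 0) : m_x{ x } {}

    // Returns a reference to const, so callers can read m_x but not modify it.
    const int& getX() const { return m_x; }
};

// Hypothetical helper: binds a read-only reference to the private member.
int readOnly()
{
    Test a{ 5 };
    const int& ref{ a.getX() }; // okay: read-only access
    // int& bad{ a.getX() };    // compile error: cannot bind int& to a const int
    // a.getX() = 7;            // compile error: cannot assign through a const reference
    return ref;
}
```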

      PS: You need to close every code tag with a [/code] tag

      • Pointer

        Thanks for the reply nascardriver.
        I will use

        in the future.

        PS: Also, int &ref is a local reference, not a local variable, but I cannot edit the post anymore...

      • Pointer

        doesn't work because the return value of the function is used for assignment to modify private data.

        We cannot put const on every function because some functions just have to return non-const to serve their purpose.

        We better just avoid returning references from member functions to pointers or references.

        • nascardriver

          "We better just avoid returning references from member functions to pointers or references."
          No! Return a reference whenever the returned data type is of non-standard size (anything except the numeric types). Return a const reference unless you need a regular reference.
          Same goes for pointers.

    • Alex

      Just a quick note here, as nascardriver's answer is largely sufficient.

      What you're really getting at here is that the returning of a reference by getX() allows external parties direct access to m_x, which is a violation of encapsulation.

      Note that we don't have to assign the return value of getX() to a reference or pointer -- it can be used directly!

  • Serge B

    Hi Alex,

    I wonder why there are two const, one at the beginning and one at the end of this:

    I can't find it in my notes and don't know where to look for. Could you explain it to me please?

    Thank you!

    • Alex

      There are three const. 🙂

      The const int& at the beginning means the function returns a reference to a const int.
      The const int as a function parameter means the function takes a const int parameter (by value)
      The trailing const means the member function is const -- that is, it won't change the value of any of the class members, or call any other non-const member functions.

  • Topherno

    Hi Alex,

    I'm really confused by what you mean in the very last paragraph:

    "When Frank is added, the std::vector must grow to hold it. This requires dynamically allocating a new block of memory, copying the elements in the array to that new block, and deleting the old block."

    In 7.10, specifically section "Stack behavior with std::vector", you described how push_back() adds another element to the vector object it's called on. And each push_back() doesn't delete any previously added elements. The example in that section clearly demonstrates that.
    So why in the example here would push_back("Frank") delete the Joe struct element?
    EDIT: For me the code runs as expected btw...I'm using clang with C++14.

    P.s. thanks so much to you and your team for creating and maintaining this tutorial, for me it's second to none on the web! I've been using it for many years now, for learning and as a reference 🙂

    • Alex

      When we call push_back(), we're appending an element to the end of the vector. If the vector has a larger capacity than the vector's size, there is room for the element, so no reallocation is necessary. We just add the element, and bump up the size by 1.

      However, if the vector's capacity is the same as its size, then the vector is full, and we can't immediately add the new element. In that case, the vector will increase the capacity of the array so that there is room to add the new element. The process of changing capacity involves reallocation of dynamic memory, which results in the old values being copied to the new memory, and then the old memory (which is no longer needed) being deleted. Externally, you don't see this happen -- from the outside, it looks like you've just added an element to the existing array. But in reality, all of the elements are now living at different addresses (but they have the same values).

      To summarize, push_back() won't impact the values of the other elements, but it may cause a reallocation, which involves deleting memory that is no longer needed after the elements are moved to a larger chunk of memory.
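A small sketch that makes the move observable on typical implementations (the helper `pushBackMovedElements` is hypothetical). It fills a vector to capacity, records the address of the first element as an integer, and compares it with the address after one more push_back():

```cpp
#include <cstdint> // for std::uintptr_t
#include <vector>

// Hypothetical helper: returns true if a push_back() on a full vector
// moved the elements to a different block of memory.
bool pushBackMovedElements()
{
    std::vector<int> v{ 1, 2 };

    // Fill the vector until size equals capacity, so no spare room is left.
    while (v.size() < v.capacity())
        v.push_back(0);

    // Record the address of the first element as an integer, so we never
    // touch a pointer that may have become invalid.
    const auto before{ reinterpret_cast<std::uintptr_t>(v.data()) };

    v.push_back(3); // the vector is full, so this forces a reallocation

    const auto after{ reinterpret_cast<std::uintptr_t>(v.data()) };

    return before != after && v[0] == 1; // new address, same values
}
```

Any pointer or reference taken before that last push_back() would now refer to freed memory, which is why indices are the safer thing to hold on to.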

      • Topherno

        Thanks for the reply.

        So does this mean that in general, if a reallocation occurs, it will always cause pointers/references to the old values to become dangling? If so, resizing a std::vector (or doing anything else that will cause a reallocation) sounds kinda dangerous, even bug-prone. Is the lesson here that we should always allocate enough capacity up front via reserve(), or rather that we shouldn't be working with pointers/references to elements like we do in 3) above?

        Thanks again Alex 🙂

        • Alex

          > So does this mean that in general, if a reallocation occurs, it will always cause pointers/references to the old values to become dangling?

          Yes. And you do have to be careful, because that reallocation will invalidate any external pointers or references. Quiz question #3 was created precisely to share some seemingly innocent code that runs afoul of this issue.

          There are many ways to work around this: allocate capacity up front if you can (but if you're going to do this, use a std::array instead), and don't keep pointers or references around (instead, keep track of indices, as they are reallocation proof).

          • Topherno

            Ok, so one should definitely avoid pointers/references to std::vector elements if possible, unless for some reason there's enough capacity upfront!

            Thanks again Alex 🙂

  • John Halfyard

    If the length of m_list is 10, shouldn't the index assert section of the class read "index <= 10"

    • Alex

      No, and this is one of the biggest causes of programming errors.

      An array with length 10 has elements with indices 0 through 9. Therefore, if the user passes in index 10, this should be outside of the valid set of indices. Therefore, we should use less than, not less-than-equals.

  • hey Alex.
    can we initialize a variable of type "struct" (any struct) using a constructor of a class?

    Consider this code

    #include<iostream>
    using namespace std;
    struct A
    {
        int a;
        char ch;
    };

    class B
    {
         A m_i;
    public:
        //B(int x, char ch):(how do we initialize ,say m_i.a?)
        int getVal()
        {
            return m_i.a;
        }
        char getChar()
        {
            return m_i.ch;
        }
    };
    int main()
    {
        B b;
    }

    Is there any way we can initialize ,say m_i.a, using a constructor.

  • yayafuture

    There is a typo, in the second program,
    return m_list[nIndex];
    should be
    return m_list[index];

  • AMG

    Alex,
    I agree with you on "Extra credit #2", but on my Xcode it works as expected. I added another person to your example, and it still works. It seems a dangling reference may or may not fail and, hence, may not be easy to catch.
    Thanks.

    • Alex

      Yes, dangling references can be a pain to debug. They may work fine depending on how the compiler lays things out in memory, and then you'll do something seemingly unrelated that causes the compiler to change how it does memory layout and suddenly something that appeared to be working no longer is. That can be quite difficult to find.

  • David

    Why does this work?

    Using Visual Studio v15.3.3

  • Dani

    Hi Alex,
    int main()
    {
        GradeMap grades;
        char& gradeJoe = grades["Joe"];
        gradeJoe = 'A';
        char& gradeFrank = grades["Frank"];
        gradeFrank = 'B';
        std::cout << "Joe has a grade of " << gradeJoe << '\n';
        std::cout << "Frank has a grade of " << gradeFrank << '\n';

        return 0;
    }

    Why do you say this doesn't work as expected?
    I think it prints as expected. I compiled it and it prints:
    Joe has a grade of A
    Frank has a grade of B

    ....Thanks for your answer

  • Martin

    ...(in this case, 2) an argument to the function.

    should be

    ...(in this case, 2) as an argument to the function.

  • Martin

    There is a spelling mistake:

    This allows us to both get and set values of m_anList directly

  • Rohit

    Hi Alex!

    StudentGrade temp { name, ' ' };

        // otherwise add this to the end of our vector
        m_map.push_back(temp);

        // and return the element
        return m_map.back().grade;
    }

    In the above code according to me first an empty grade is stored in the temp obj and then through returning by reference this ' ' empty value(char) is replaced by the original grade. Is this correct or there is something else happening here?

    • Alex

      First an empty grade is created. Then it's added to the std::vector (if necessary). Then the proper element is returned by reference, so that the caller can set the grade itself.

  • DenisKa

    It was bugging me why does operator[] return reference and you cleared it once again 🙂

    -Thanks Alex

  • Ayush Goel

    Hey Alex, Actually I'm having a problem understanding the output generated by my code. I used << operator overloading and ++ -- overloading, could you please help me out?

    The output was supposed to be (as I was expecting it to be)
    11,20
    12,21
    12,21
    13,22

    instead, it shows
    13,22
    13,22
    11,20
    13,22

    • Alex

      As I have said many times (and will say again): A variable with side effects applied should not be used more than once in a given statement, otherwise the results may be indeterminate. You use p1 four times in a single statement, and have side effect applied twice (once with the pre-increment and once with the post-increment).

  • sebastian

    Hello! i have one question, when i do this:

    class Class obj; i know that i created an object from the class Class called 'obj'. But when i do this:

    Class *obj = new Class;    what is the name of the object in the heap?? because this is not an anonymous object, is it obj?? and the object itself lives in the heap and the pointer lives on the stack?? i'm a little confused ...  I asked this because i try to understand the code by making little diagrams of the objects and so on.

    Best regards.

  • Chris

    Is there a reason why this works:

    But this does not?:

    It throws the following error:
    'return': cannot convert from 'const char' to 'char &'

    • Alex

      I can't speak to what situations "for each" works/doesn't work in, as it's not part of the C++ standard. It's something that Microsoft made up, and you should not use it.

  • Ryan

    In one of your first code examples on this page, you have the following:

    Is there any reason you did not make index a reference parameter, rather than having the function copy it in? Would it not be better for performance to always make any constant parameters referenced?

    Thanks!

  • Matt

    In the quiz section(1C), you wrote:
    "This class should take a std::string parameter, and return a reference to a char."

    Did you mean to write "function" instead of "class"?
