15.1 — Intro to smart pointers and move semantics

Consider a function in which we dynamically allocate a value:
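A minimal sketch of such a function follows. The Resource class and the g_liveResources counter are illustrative assumptions; the counter exists only so that a leak can be observed.

```cpp
#include <iostream>

int g_liveResources = 0; // hypothetical counter: how many Resources exist right now

// Hypothetical resource type that reports its own creation and destruction.
class Resource
{
public:
    Resource() { ++g_liveResources; std::cout << "Resource acquired\n"; }
    ~Resource() { --g_liveResources; std::cout << "Resource destroyed\n"; }
};

void someFunction()
{
    Resource* ptr = new Resource(); // ptr points at a dynamically allocated Resource

    // do stuff with ptr here

    delete ptr; // manual cleanup: nothing forces this line to be reached
}
```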

Although the above code seems fairly straightforward, it’s fairly easy to forget to deallocate ptr. Even if you do remember to delete ptr at the end of the function, there are a myriad of ways that ptr may not be deleted if the function exits early. This can happen via an early return:
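As a sketch of the early-return case (assuming the same illustrative Resource class and counter, with a parameter x standing in for a value the user might enter):

```cpp
#include <iostream>

int g_liveResources = 0; // hypothetical counter: how many Resources exist right now

class Resource
{
public:
    Resource() { ++g_liveResources; std::cout << "Resource acquired\n"; }
    ~Resource() { --g_liveResources; std::cout << "Resource destroyed\n"; }
};

void someFunction(int x)
{
    Resource* ptr = new Resource();

    if (x == 0)
        return; // the function exits here, so the delete below never runs

    // do stuff with ptr here

    delete ptr;
}
```

After someFunction(0) returns, g_liveResources is still 1: the Resource was never destroyed.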

or via a thrown exception:
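A comparable sketch for the exception case (the Resource class, the counter, and the choice of std::runtime_error are all assumptions made for illustration):

```cpp
#include <iostream>
#include <stdexcept>

int g_liveResources = 0; // hypothetical counter: how many Resources exist right now

class Resource
{
public:
    Resource() { ++g_liveResources; std::cout << "Resource acquired\n"; }
    ~Resource() { --g_liveResources; std::cout << "Resource destroyed\n"; }
};

void someFunction(int x)
{
    Resource* ptr = new Resource();

    if (x == 0)
        throw std::runtime_error("x was zero"); // exits before the delete below

    // do stuff with ptr here

    delete ptr;
}
```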

In the above two programs, the early return or throw statement executes, causing the function to terminate without variable ptr being deleted. Consequently, the memory allocated for variable ptr is now leaked (and will be leaked again every time this function is called and returns early).

At heart, these kinds of issues occur because pointer variables have no inherent mechanism to clean up after themselves.

Smart pointer classes to the rescue?

One of the best things about classes is that they contain destructors that automatically get executed when an object of the class goes out of scope. So if you allocate (or acquire) memory in your constructor, you can deallocate it in your destructor, and be guaranteed that the memory will be deallocated when the class object is destroyed (regardless of whether it goes out of scope, gets explicitly deleted, etc…). This is at the heart of the RAII programming paradigm that we talked about in lesson 8.7 -- Destructors.

So can we use a class to help us manage and clean up our pointers? We can!

Consider a class whose sole job was to hold and “own” a pointer passed to it, and then deallocate that pointer when the class object went out of scope. As long as objects of that class were only created as local variables, we could guarantee that the class would properly go out of scope (regardless of when or how our functions terminate) and the owned pointer would get destroyed.

Here’s a first draft of the idea:
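A minimal sketch of that draft, assuming an illustrative Resource class and a hypothetical demo() function that would be called from main(). The g_liveResources counter is an addition used only to confirm the cleanup.

```cpp
#include <iostream>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    // Take ownership of the pointer passed in (nullptr by default).
    Auto_ptr1(T* ptr = nullptr) : m_ptr(ptr) {}

    // The destructor deallocates whatever we own.
    ~Auto_ptr1() { delete m_ptr; }

    // Overloads so an Auto_ptr1 can be used like the pointer it wraps.
    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

int g_liveResources = 0; // hypothetical counter: how many Resources exist right now

class Resource
{
public:
    Resource() { ++g_liveResources; std::cout << "Resource acquired\n"; }
    ~Resource() { --g_liveResources; std::cout << "Resource destroyed\n"; }
};

void demo()
{
    Auto_ptr1<Resource> res(new Resource()); // res now owns the Resource

    // no explicit delete is needed
} // res goes out of scope here, and its destructor deletes the Resource
```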

This program prints:

Resource acquired
Resource destroyed

Consider how this program and class work. First, we dynamically create a Resource, and pass it as a parameter to our templated Auto_ptr1 class. From that point forward, our Auto_ptr1 variable res owns that Resource object (Auto_ptr1 has a composition relationship with m_ptr). Because res is declared as a local variable and has block scope, it will go out of scope when the block ends, and be destroyed (no worries about forgetting to deallocate it). And because it is a class, when it is destroyed, the Auto_ptr1 destructor will be called. That destructor will ensure that the Resource pointer it is holding gets deleted!

As long as Auto_ptr1 is defined as a local variable (with automatic duration, hence the “Auto” part of the class name), the Resource will be guaranteed to be destroyed at the end of the block it is declared in, regardless of how the function terminates (even if it terminates early).

Such a class is called a smart pointer. A smart pointer is a composition class that is designed to manage dynamically allocated memory and ensure that memory gets deleted when the smart pointer object goes out of scope. (Relatedly, built-in pointers are sometimes called “dumb pointers” because they can’t clean up after themselves.)

Now let’s go back to our someFunction() example above, and show how a smart pointer class can solve our challenge:
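A sketch of that function under the same assumptions as before. Here the parameter x stands in for the integer read from the user, and sayHi() is an illustrative member of Resource.

```cpp
#include <iostream>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    Auto_ptr1(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr1() { delete m_ptr; }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

int g_liveResources = 0; // hypothetical counter: how many Resources exist right now

class Resource
{
public:
    Resource() { ++g_liveResources; std::cout << "Resource acquired\n"; }
    ~Resource() { --g_liveResources; std::cout << "Resource destroyed\n"; }
    void sayHi() const { std::cout << "Hi!\n"; }
};

void someFunction(int x)
{
    Auto_ptr1<Resource> ptr(new Resource()); // ptr now owns the Resource

    if (x == 0)
        return; // early return: ptr's destructor still runs

    // do stuff with ptr here
    ptr->sayHi();
} // ptr is destroyed here on the normal path
```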

If the user enters a non-zero integer, the above program will print:

Resource acquired
Hi!
Resource destroyed

If the user enters zero, the above program will terminate early, printing:

Resource acquired
Resource destroyed

Note that even in the case where the user enters zero and the function terminates early, the Resource is still properly deallocated.

Because the ptr variable is a local variable, ptr will be destroyed when the function terminates (regardless of how it terminates). And because the Auto_ptr1 destructor will clean up the Resource, we are assured that the Resource will be properly cleaned up.

A critical flaw

The Auto_ptr1 class has a critical flaw lurking behind some auto-generated code. Before reading further, see if you can identify what it is. We’ll wait…

(Hint: consider what parts of a class get auto-generated if you don’t supply them)

(Jeopardy music)

Okay, time’s up.

Rather than tell you, we’ll show you. Consider the following program:
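A sketch of such a program, with Resource and the function name brokenCopyDemo() as illustrative assumptions. Calling brokenCopyDemo() deliberately triggers a double delete, which is undefined behavior, so treat it as a demonstration of the bug rather than code to reuse.

```cpp
#include <iostream>
#include <type_traits>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    Auto_ptr1(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr1() { delete m_ptr; }

    // No copy constructor or copy assignment operator is written here,
    // so the compiler generates member-wise (shallow) versions.

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

class Resource
{
public:
    Resource() { std::cout << "Resource acquired\n"; }
    ~Resource() { std::cout << "Resource destroyed\n"; }
};

// The implicitly generated copy constructor exists and is usable.
static_assert(std::is_copy_constructible<Auto_ptr1<Resource>>::value,
              "Auto_ptr1 is implicitly copyable");

// WARNING: calling this function causes undefined behavior (a double delete).
void brokenCopyDemo()
{
    Auto_ptr1<Resource> res1(new Resource());
    Auto_ptr1<Resource> res2(res1); // shallow copy: both hold the same Resource*
} // res2 deletes the Resource first, then res1 deletes it again
```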

This program prints:

Resource acquired
Resource destroyed
Resource destroyed

Very likely (but not necessarily) your program will crash at this point. See the problem now? Because we haven’t supplied a copy constructor or an assignment operator, C++ provides them for us. And the functions it provides do shallow copies. So when we initialize res2 with res1, both Auto_ptr1 variables point at the same Resource. When res2 goes out of scope, it deletes the Resource, leaving res1 with a dangling pointer. When res1 goes to delete its (already deleted) Resource, the behavior is undefined, and a crash is likely!

You’d run into a similar problem with a function like this:

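In this program, res1 will be copied by value into passByValue’s parameter res, so both res1 and res hold the same Resource pointer. When res is destroyed at the end of the function, the Resource is deleted, leaving res1 with a dangling pointer. When res1 is later destroyed, it deletes the Resource a second time. Crash!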

So clearly this isn’t good. How can we address this?

Well, one thing we could do would be to explicitly define and delete the copy constructor and assignment operator, thereby preventing any copies from being made in the first place. That would prevent the pass by value case (which is good, we probably shouldn’t be passing these by value anyway).
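A sketch of what deleting the copy operations might look like. The static_assert lines and the readThroughOwner() helper are illustrative additions that confirm copying is rejected at compile time.

```cpp
#include <type_traits>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    Auto_ptr1(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr1() { delete m_ptr; }

    // Explicitly deleted: any attempt to copy is now a compile error.
    Auto_ptr1(const Auto_ptr1&) = delete;
    Auto_ptr1& operator=(const Auto_ptr1&) = delete;

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

static_assert(!std::is_copy_constructible<Auto_ptr1<int>>::value,
              "copy construction is disabled");
static_assert(!std::is_copy_assignable<Auto_ptr1<int>>::value,
              "copy assignment is disabled");

int readThroughOwner()
{
    Auto_ptr1<int> res1(new int(5));

    // Auto_ptr1<int> res2(res1); // would not compile: copy constructor is deleted
    // Auto_ptr1<int> res3;
    // res3 = res1;               // would not compile: copy assignment is deleted

    return *res1; // the single owner can still be used normally
}
```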

But then how would we return an Auto_ptr1 from a function back to the caller?

We can’t return our Auto_ptr1 by reference, because the local Auto_ptr1 will be destroyed at the end of the function, and the caller will be left with a dangling reference. Returning the Auto_ptr1 by address has the same problem. We could return the raw pointer r instead, but then we might forget to delete r later, and preventing exactly that mistake is the whole point of using smart pointers in the first place. So that’s out. Returning the Auto_ptr1 by value is the only option that makes sense -- but then we end up with shallow copies, duplicated pointers, and crashes.

Another option would be to write our own copy constructor and assignment operator that make deep copies. In this way, we’d at least guarantee to avoid duplicate pointers to the same object. But copying can be expensive (and may not be desirable or even possible), and we don’t want to make needless copies of objects just to return an Auto_ptr1 from a function. Plus, assigning or initializing a dumb pointer doesn’t copy the object being pointed to, so why would we expect smart pointers to behave differently?

What do we do?

Move semantics

What if, instead of having our copy constructor and assignment operator copy the pointer (“copy semantics”), we instead transfer/move ownership of the pointer from the source to the destination object? This is the core idea behind move semantics. Move semantics means the class will transfer ownership of the object rather than making a copy.

Let’s update our Auto_ptr1 class (as a new class, Auto_ptr2) to show how this can be done:
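A sketch of Auto_ptr2, in which the copy constructor and assignment operator transfer ownership instead of copying. The isNull() helper, the Resource class, and the demo() function (to be called from main()) are illustrative assumptions.

```cpp
#include <iostream>

template <typename T>
class Auto_ptr2
{
    T* m_ptr {};
public:
    Auto_ptr2(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr2() { delete m_ptr; }

    // A "copy" constructor that implements move semantics.
    Auto_ptr2(Auto_ptr2& a) // note: not const, because we modify the source
    {
        m_ptr = a.m_ptr;   // take the pointer from the source object
        a.m_ptr = nullptr; // the source no longer owns anything
    }

    // An assignment operator that implements move semantics.
    Auto_ptr2& operator=(Auto_ptr2& a) // note: not const
    {
        if (&a == this) // self-assignment guard
            return *this;

        delete m_ptr;      // release anything we already own
        m_ptr = a.m_ptr;   // take the pointer from the source object
        a.m_ptr = nullptr; // the source no longer owns anything
        return *this;
    }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
    bool isNull() const { return m_ptr == nullptr; }
};

class Resource
{
public:
    Resource() { std::cout << "Resource acquired\n"; }
    ~Resource() { std::cout << "Resource destroyed\n"; }
};

void demo()
{
    Auto_ptr2<Resource> res1(new Resource());
    Auto_ptr2<Resource> res2; // starts out null

    std::cout << "res1 is " << (res1.isNull() ? "null\n" : "not null\n");
    std::cout << "res2 is " << (res2.isNull() ? "null\n" : "not null\n");

    res2 = res1; // res2 assumes ownership, res1 is set to null

    std::cout << "Ownership transferred\n";

    std::cout << "res1 is " << (res1.isNull() ? "null\n" : "not null\n");
    std::cout << "res2 is " << (res2.isNull() ? "null\n" : "not null\n");
}
```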

This program prints:

Resource acquired
res1 is not null
res2 is null
Ownership transferred
res1 is null
res2 is not null
Resource destroyed

Note that our overloaded operator= transferred ownership of m_ptr from res1 to res2! Consequently, we don’t end up with duplicate copies of the pointer, and everything gets tidily cleaned up.

std::auto_ptr, and why to avoid it

Now would be an appropriate time to talk about std::auto_ptr. std::auto_ptr, introduced in C++98, was C++’s first attempt at a standardized smart pointer. std::auto_ptr opted to implement move semantics just like the Auto_ptr2 class does.

However, std::auto_ptr (and our Auto_ptr2 class) has a number of problems that make using it dangerous.

First, because std::auto_ptr implements move semantics through the copy constructor and assignment operator, passing a std::auto_ptr by value to a function will cause your resource to get moved to the function parameter (and be destroyed at the end of the function when the function parameter goes out of scope). Then when you go to access your auto_ptr argument from the caller (not realizing it was transferred and deleted), you’re suddenly dereferencing a null pointer. Crash!
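The same hazard can be sketched with an Auto_ptr2-style class standing in for std::auto_ptr (the class below is a condensed illustration, and the function names are hypothetical):

```cpp
#include <iostream>

template <typename T>
class Auto_ptr2
{
    T* m_ptr {};
public:
    Auto_ptr2(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr2() { delete m_ptr; }

    // "Copying" transfers ownership away from the source.
    Auto_ptr2(Auto_ptr2& a) : m_ptr(a.m_ptr) { a.m_ptr = nullptr; }

    Auto_ptr2& operator=(Auto_ptr2& a)
    {
        if (&a == this)
            return *this;
        delete m_ptr;
        m_ptr = a.m_ptr;
        a.m_ptr = nullptr;
        return *this;
    }

    T& operator*() const { return *m_ptr; }
    bool isNull() const { return m_ptr == nullptr; }
};

// Takes its parameter by value, so the "copy" silently steals ownership.
void passByValue(Auto_ptr2<int> res)
{
    std::cout << "Inside the function: " << *res << '\n';
} // res is destroyed here, and the int along with it

bool callerLosesResource()
{
    Auto_ptr2<int> res1(new int(7));

    passByValue(res1); // ownership moves into the parameter

    // res1 is now null; dereferencing it here would be undefined behavior.
    return res1.isNull();
}
```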

Second, std::auto_ptr always deletes its contents using non-array delete. This means auto_ptr won’t work correctly with dynamically allocated arrays, because it uses the wrong kind of deallocation. Worse, it won’t prevent you from passing it a dynamic array, which it will then mismanage, leading to undefined behavior (in practice, often memory leaks).

Finally, auto_ptr doesn’t play nice with a lot of the other classes in the standard library, including most of the containers and algorithms. This occurs because those standard library classes assume that when they copy an item, they actually get a copy rather than performing a move.

Because of the above-mentioned shortcomings, std::auto_ptr was deprecated in C++11 and then removed from the standard library entirely in C++17. It should not be used.

Rule: std::auto_ptr is deprecated and should not be used. (Use std::unique_ptr or std::shared_ptr instead.)

Moving forward

The core problem with the design of std::auto_ptr is that prior to C++11, the C++ language simply had no mechanism to differentiate “copy semantics” from “move semantics”. Overriding the copy semantics to implement move semantics leads to weird edge cases and inadvertent bugs. For example, you can write res1 = res2 and have no idea whether res2 will be changed or not!

Because of this, in C++11, the concept of “move” was formally defined, and “move semantics” were added to the language to properly differentiate copying from moving. Now that we’ve set the stage for why move semantics can be useful, we’ll explore the topic of move semantics throughout the rest of this chapter. We’ll also fix our Auto_ptr2 class using move semantics.

In C++11, std::auto_ptr was replaced by several “move-aware” smart pointer types: std::unique_ptr, std::shared_ptr, and std::weak_ptr. We’ll also explore the two most popular of these: unique_ptr (which is a direct replacement for auto_ptr) and shared_ptr.
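As a brief preview, here is a sketch of how std::unique_ptr makes the transfer explicit. The function name transferWithUniquePtr() is hypothetical, and the details are left to the later lessons.

```cpp
#include <memory>
#include <utility>

int transferWithUniquePtr()
{
    std::unique_ptr<int> res1(new int(42)); // res1 owns the int

    // std::unique_ptr<int> res2(res1); // would not compile: unique_ptr cannot be copied

    std::unique_ptr<int> res2(std::move(res1)); // ownership is moved explicitly

    // After the move, res1 is null and res2 owns the int.
    return (res1 == nullptr && res2 != nullptr) ? *res2 : -1;
}
```

Unlike with std::auto_ptr, the std::move at the call site makes it visible that res1 is giving up its resource.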


72 comments to 15.1 — Intro to smart pointers and move semantics

  • probably totally out of scope now but Visual Studio tells me that a managed class cannot contain a friend functions?  I was trying to use a friend function to override the << operator so how would I do this from a managed class?

  • Azad

    Hello Alex,

    You say //Auto_ptr1 has a composition relationship with m_ptr//
    Won’t it be aggregation as m_ptr is created outside class?

    • Alex

      No. It's okay for a composition to acquire its part from elsewhere. The important part of the composition relationship is that the class manages the existence of the part (once acquired), and that's definitely the case here.

  • Naga Pushkal

    Hi Alex,

    When I am trying to use std::move() as shown below in your example to transfer ownership of res1 to res2, program is crashed. Can you please explain the reason? I am unable to figure out the reason myself.

    Auto_ptr1<Resource> res2(std::move(res1));

    • @Auto_ptr1 declares its own destructor, so the compiler does not generate a move constructor for it. The initialization above therefore falls back to the implicitly generated copy constructor, which copies each member.
      This means that after @res2 has been initialized as you did, @res1.m_ptr still points to the same object as before.
      When @res1 and @res2 go out of scope, they are both pointing to the same object, resulting in a double delete. Behavior is undefined. Chapter 15 will teach you how to write custom move constructors, allowing you to prevent this from happening.

  • Joe

    Hi Alex, I have a question:

    "A Smart pointer is a composition class." I don't understand what a composition class is. Do you even mean something in this sentence?

  • Trevor29

    Two typos in:
    "As long as Auto_ptr1 is defined as a local variable (with automatic duration, hence the “Auto” part of the class name), the Resource will be guaranteed to be destroyed at the end of block its declared in, regardless of how the function terminates (even if it terminates early)."
    should be "...at the end of the block it's declared in".
    (Missing "the" and a missing apostrophe, although writing "it is" might be better.)
    Trevor

  • boltzmann

    You say "Well, one thing we could do would be to delete the copy constructor and assignment operator, thereby preventing any copies from being made in the first place."
    But the code this comment applies to has no user defined copy constructor or assignment operator.  I can't find anything for the user to delete.

  • Topherno

    Hi Alex! Quick question regarding the last of these two lines:

    Why does passing res1 as a constructor parameter work? res1 is an object of type Auto_ptr1<Resource>, but the res2 constructor is expecting a pointer of type Resource (since res2's template parameter is Resource), not an object of type Auto_ptr1<Resource>.

    So is the res2 constructor doing some kind of conversion when res1 is passed to it and assigned to ptr, the latter of which is of type Resource*?

  • warchiefbinar

    Shouldn't we check in destructor if the pointer is not nullptr?
        ~Auto_ptr1()
        {
                    if (m_ptr)
            delete m_ptr;
        }

  • Rex Lucas

    Small edit:
    "Now let’s go back to our doSomething() example above,..."
    should be
    "Now let’s go back to our "someFunction()" example above,..."

  • Luhan

    At this part of the code

    1a)The overload of operator -> is to the ptr variable be able to get the member pointer m_ptr (which is declared as T* m_ptr)?

    1b)The other thing I'm confused is when you overload the operator*, is it for to be able to dereference the member pointer, and return the value at that address(e.g., to be able, for example, to access the sayHi() function below)?

    The code is to accomplish this:

    Ps:Sorry for asking this as the other people did already, just trying to make sure my train of thought is correctly.

    • Alex

      Remember that smart pointers are supposed to mimic dumb pointers in terms of usage. With a dumb pointer to a class, you can do *ptr to get the class object, and ptr->someFcn() to call some member function on the object that ptr is pointing to. These two overloads allow you to do this with our smart pointer class by providing access to the underlying object (via m_ptr).

  • Hashem Omar

    hi Alex..
    this passage is confusing me :
    We can’t return by reference, because the local Auto_ptr1 we’ve created will be destroyed and the caller will be left with a dangling reference. And we don’t want to return by address, otherwise we might forget to delete it,".
    i dont understand here how can we even return by address where we can't return by reference? either ways Auto_ptr(r) will be destroyed and it'll be a dangling pointer/reference, is there something am not paying attention to in this example?

  • cngzhnp

    Hello Alex,

    Is it much more safer than check that this class member pointer is allocated or not?

    • Alex

      No, there's no reason to check your pointers for null before you delete them.

      Here's why: A pointer can essentially point to three different things
      * null
      * allocated dynamic memory
      * allocated non-dynamic memory (e.g. the address of a variable on the stack) or invalid memory (memory that has not been allocated for your app to use)

      Deleting a null pointer is fine (it doesn't do anything), so you don't need a conditional to protect against that.
      Deleting a pointer to allocated dynamic memory is fine (it's what you intended), so you don't need a conditional to protect against that either.
      Deleting a pointer to either of the last cases is problematic no matter what. It has a valid memory address, so the if statement will let it pass, but deleting it will cause issues. Because the if statement won't protect you from this case, you don't need it.

  • Mohsen

    Hi Alex.
    i have two question.
    1. if we pass Auto_ptr2 by reference, why we should give a, a reference?

    why can't we just write like this: a==this  ?

    2. in the code above why you don't give '*' to this and few line after return *this instead?

    • Alex

      1) This is a standard self-assignment guard. We discuss why it's needed in lesson 9.14.
      2) (&a == this) compares the address of a and this to see if they are the same object in memory. If we did (a == *this), then we'd be comparing the value of a to see if it's the same as the value of *this. That would likely require an overloaded operator== for Auto_ptr2. That would probably work, but it's more work than necessary.
      We return *this later because the function is returning a reference. this is a pointer -- we have to dereference it to get the object it points at.

  • Jujinko

    Hey Alex,

    "A Smart pointer is a composition class that is designed to manage dynamically allocated memory (“dumb” pointers) and ensure that memory gets deleted when the smart pointer object goes out of scope."

    I don't get the "dumb pointer" part, or did you mean to write "dynamically allocated memory ( for “dumb” pointers)"?

    • Alex

      I was merely indicating that built-in pointers are sometimes called "dumb" pointers because they can't clean up after themselves. I've clarified the wording to make this clearer.

  • Curiosity

    Alex?
    I don't understand this thing ;
    "We can’t do it by reference, because the local Auto_ptr1 we’ve created will be destroyed and the caller will be left with a dangling reference. And we don’t want to do it by address, otherwise we might forget to delete it, which is the whole point of smart pointers in the first place! Pass by value is the only option that makes sense, but then we end up with shallow copies, duplicated pointers, and crashes."
    Here, in the first 2 paras, you are talking about returning by reference/addresses, But, Then you are saying PASS by value is the only option. So, were you trying to say return by value, or, is it something else?

  • Hardik

    Alex?
    Here, when first res2 goes out of scope(Local vars are destroyed in the opposite order of defintion), it deletes the memory address it was pointing to, and the, when res1 gets destroyed,
    it tries to delete the memory address it points to, but that memory address is already deleted by res2, SO, WHY DOESN'T MY PROGRAM CRASH?

    • Alex

      Deleting already deleted memory results in undefined behavior. It won't necessarily crash, but it very well might. I've updated the lesson text accordingly.

      • Hardik

        Alex? In the lesson text, you have said that when res1 goes out of scope, it deletes the resource and then, res2 tries to delete a deleted ptr ! But, I do think that as local vars are destroyed in the opp. Order of creation, So, shouldn't it be that when res2 goes out of scope it deletes res1 and when res1 goes out scope it tries to delete a deleted ptr, due to which u defined behaviour is shown ! ( I HAVE OBSERVED THE SAME IN THE DEBUGGER TOO ! )
        Waiting For Your Reply..... 🙂

  • KnowMore

    Alex? I have a Doubt !
    We all know that,
    (*ptr).member is equivalent to ptr->member.
    But in this statement :-

    We are returning address of m_ptr, which is actually equivalent to (ptr).member (Which is Wrong !).
    I need Help.
    Thanks in Advance 🙂

    • KnowMore

      Thanks to the Comments section 🙂
      I saw your comment,
      "Operator-> is a weird operator -- the semantics are pretty counterintuitive. Remember that a->b is the same as (*a).b.

      Simplifying a bit, the easiest way to think about it is that it should return a pointer that can be dereferenced. C++ handles doing the dereference and dot access part of the equation.

      So for a->b, you return a pointer to a, and the compiler does the rest of (*a).b for you."
      Understood :). Thanks Alex !

    • Alex

      Yes, this overloaded operator is a little weird syntactically. It seems like it should return a reference instead of a pointer -- but the language specifies that it should return a pointer. The dereference is then done implicitly.

      I suspect this was done so that arrows can be chained together, with the dereference happening implicitly at the end, so we can do a->b->c. If a->b returned a non-pointer, then b->c would fail.

  • C++ Learner

    this code is for copying the pointer, am I right? 🙂

  • C++ Learner

    what is the difference between

    ?

    • Alex

      delete deletes a dynamically allocated single value. delete[] deletes a dynamically allocated array.

      • C++ Learner

        thanks, what is the difference between reference and address?  

        We can’t do it by reference, because the local Auto_ptr1 we’ve created will be destroyed and the caller will be left with a dangling reference. And we don’t want to do it by address, otherwise we might forget to delete it, which is the whole point of smart pointers in the first place! Pass by value is the only option that makes sense, but then we end up with shallow copies, duplicated pointers, and crashes.

        • Alex

          We can't do _what_ by reference? If you're talking about returning a locally created Auto_ptr1 from a function, you're right -- we can't use return by reference or return by address because we'll return a dangling reference or pointer. Return by value is the only way. But return by value can be expensive, so that's part of the reason move semantics were created.

          • C++ Learner

            You wrote that big text that I mentioned 🙂 my question was what is the difference between address and reference? I think that they are the same in this case.
            Sorry for not so good English

            • HammerIsComing

              https://stackoverflow.com/questions/57483/what-are-the-differences-between-a-pointer-variable-and-a-reference-variable-in

              Here is the link for the difference between pointer and reference.

  • C++ Learner

    Hi, please can you say what this two lines mean I know that they are for returning m_data pointer, but why do you use two statements to return the pointer and also why the operators are parentheses?
    Thanks in advance!!!

    • Alex

      These are overloads of operator* (dereference) and operator-> (member selection), so that we can use an Auto_ptr1 the same way we would use m_ptr.

      • Jujinko

        I'm confused, why does "T* operator->() const { return m_ptr; }" return the pointer, shouldn't "->" return a dereferenced pointer with a dot after it, like such: "x_ptr->something" equals "(*x_prt).something"

        Thanks in advance,
        Jujinko

        • Alex

          Because that's how the language defines it. 🙂 Not a very satisfying answer, I know. The dereference happens implicitly.

          As I noted in a comment below:
          "I suspect this was done so that arrows can be chained together, with the dereference happening implicitly at the end, so we can do a->b->c. If a->b returned a non-pointer, then b->c would fail."

          • Sihoo

            thank you so much for this explanation. I had same question and now it's clear! >.<

          • Fanchen

            In fact, if chaining is needed for operator->, only the last chain's operator-> should return a raw pointer, while the other chains' operator-> should return the next chain's object itself or its reference. For example:

            This snippet allows for chaining d->c->b->a->foo(). What happens under the hood is:

            *( d.operator->().operator->().operator->() ).foo()

            in which the dereference and dot operation on the last raw pointer is automatically performed by the compiler.

            If the operator-> for intermediate chain, say D, returns raw pointer (i.e. D's operator-> returns a pointer to C), the compiler would do the dereference and dot operation on C to find foo(). This would of course fail.
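
            A minimal sketch of this drill-down behavior, using hypothetical types A through D and a hypothetical helper callThroughChain(). Each intermediate operator-> returns the next link by value, and only the last one returns a raw pointer, so a single d->foo() reaches A::foo().

```cpp
struct A
{
    int foo() const { return 42; }
};

// Last link in the chain: its operator-> returns a raw pointer.
struct B
{
    A* target;
    A* operator->() const { return target; }
};

// Intermediate links: operator-> returns the next link by value.
struct C
{
    B next;
    B operator->() const { return next; }
};

struct D
{
    C next;
    C operator->() const { return next; }
};

int callThroughChain(A& a)
{
    D d{ C{ B{ &a } } };

    // The compiler keeps applying operator-> until it gets a raw pointer,
    // then performs the built-in dereference and member access.
    return d->foo();
}
```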

  • Mirza Safaraz

    hi Alex, what is different between

    • Alex

      This is just overloading the smart pointer's dereference and member selection operators so we can use our smart pointer as if it were the actual object.

  • skywalker007

    How does your smart pointer implementation work in case of pass by value? You have mentioned that a demerit of auto_ptr is that it transfers the ownership of the pointer to the function argument which gets destroyed at the end of function call. I think this issue also applies to your version of smart pointer implementation. Can you let me know how your version of smart pointer implementation avoid this?

    • Alex

      Yes, Auto_ptr2 has the same flaw as std::auto_ptr. In future lessons in this chapter, we show how to address this issue using move semantics. So... keep reading. 🙂

  • Jiaan Qi

    Hello Alex:
    I'm having trouble understanding the overload of member access operator "->" in your example.
    If the return type of the operator function is just a pointer, then shouldn't we use, for example, (ptr->)->sayHi() ?

    • Alex

      Operator-> is a weird operator -- the semantics are pretty counterintuitive. Remember that a->b is the same as (*a).b.

      Simplifying a bit, the easiest way to think about it is that it should return a pointer that can be dereferenced. C++ handles doing the dereference and dot access part of the equation.

      So for a->b, you return a pointer to a, and the compiler does the rest of (*a).b for you.

  • JoePerkins

    "(Auto_ptr1 has an composition relationship with m_ptr)" should be "(Auto_ptr1 has a composition relationship with m_ptr)".

    Looking forward to the unique_ptr lesson!

  • Daniel

    Typo? First "so" in "Plus assigning or copying a dumb pointer doesn’t copy the object being pointed so, so" should be "to"?

  • Mauricio Mirabetti

    Dear Alex, a few typos:

    Consequently, the memory for allocated for -> one "for" too many
    A Smart pointer is an composition class -> is a...
    leaving res2 with a hanging pointer. -> dangling pointer
    res1 not null -> res1 is not null
    passing a std::auto ptr by value  -> std::auto_ptr
    leaking to memory leaks. -> leading to

    Nice work! Best regards.

    Mauricio

  • aca4life

    Hi Alex, three minor issues caught my attention:

    1) In the beginning you showed how using dump pointers can lead to memory leakage. You therefore defined two someFuntion() functions. Within the bodies of these functions there is an if-statement:

    I assume that this function is meant to be a different function and should therefore be called someOtherFunction() or anything like that. [someFuntion is void, and it would be an infinite recursion even if it wasn't]

    2) In the Auto_ptr1 constructor the default parameters have to be assigned by operator '=' not '==' (Auto_ptr2 is correct though)

    3) Is there any specific reason you mix NULL (Auto_ptr2 move semantics) with nullptr? As I understood using nullptr all the time should be preferred.

    keep up the good work! 😉

    • Alex

      1) Yes, my mistake. I've removed the inadvertently-recursive call to someFunction() and replaced it with a user-input variable.
      2) Fixed, thanks for pointing this out.
      3) No reason other than bad habits. 🙂 I've replaced NULL with nullptr.

      Thanks for the feedback!

  • Bobix Louis

    Hi Alex,
       I would say this is the best tutorial I have ever came across.The chapters are short,simple and very informative.Thanks for these nice tutorials.

       I need a small clarification regarding the statement ' A Smart pointer is an aggregation class that is designed to.... '. Do you mean a smart pointer class has the properties of an aggregation(ie. Has-a relationship with the object pointer it owns)? I read in one of the earlier chapter about aggregation that an aggregation will have the following properties, but i am not able to relate those properties with the smart pointer class. Could you please help me out here a bit?

    The part (member) is part of the object (class)  
    The part (member) can belong to more than one object (class) at a time   --> I believe with move semantics(ie moving the ownership from one smart pointer to another) at a time only one smart pointer can own the object pointer.
    The part (member) does not have its existence managed by the object (class) --> I believe even though smart pointer is not responsible for creating the object pointer, it is responsible for deleting the object.
    The part (member) does not know about the existence of the object (class)

    • Alex

      Yes, you are correct. Although the smart pointer owns the part, it fails the test that the part can belong to more than one object at a time. Therefore, it cannot be an aggregation.

      I think a composition is a better fit. The part cannot belong to more than one object at a time (but can be moved), and the part's existence is managed by the object.

      I've updated the lesson accordingly.
