
9.15 — Shallow vs. deep copying

Shallow copying

Because C++ does not know much about your class, the default copy constructor and default assignment operators it provides use a copying method known as a memberwise copy (also known as a shallow copy). This means that C++ copies each member of the class individually (using the assignment operator for overloaded operator=, and direct initialization for the copy constructor). When classes are simple (e.g. do not contain any dynamically allocated memory), this works very well.

For example, let’s take a look at our Fraction class:
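A minimal sketch of such a Fraction class, assuming two integer members and a pair of getters (getNumerator and getDenominator are illustrative names added so the members can be inspected), might look like this:

```cpp
#include <cassert>
#include <iostream>

class Fraction
{
private:
    int m_numerator{ 0 };
    int m_denominator{ 1 };

public:
    // Default constructor; the denominator must not be zero
    Fraction(int numerator = 0, int denominator = 1)
        : m_numerator{ numerator }
        , m_denominator{ denominator }
    {
        assert(denominator != 0);
    }

    // Illustrative getters (assumed names, not necessarily part of the original class)
    int getNumerator() const { return m_numerator; }
    int getDenominator() const { return m_denominator; }

    // Prints the fraction as numerator/denominator
    friend std::ostream& operator<<(std::ostream& out, const Fraction& f)
    {
        out << f.m_numerator << '/' << f.m_denominator;
        return out;
    }
};
```

Note that the class declares no copy constructor and no assignment operator, so the compiler supplies both.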

The default copy constructor and assignment operator provided by the compiler for this class look something like this:
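As a rough sketch (the compiler does not literally generate source code, and the getters are assumed names), hand-written equivalents of those two functions could be written as follows:

```cpp
#include <cassert>

class Fraction
{
private:
    int m_numerator{ 0 };
    int m_denominator{ 1 };

public:
    Fraction(int numerator = 0, int denominator = 1)
        : m_numerator{ numerator }
        , m_denominator{ denominator }
    {
        assert(denominator != 0);
    }

    // Hand-written equivalent of the implicit copy constructor:
    // each member is initialized from the matching member of the source
    Fraction(const Fraction& f)
        : m_numerator{ f.m_numerator }
        , m_denominator{ f.m_denominator }
    {
    }

    // Hand-written equivalent of the implicit copy assignment operator:
    // each member is assigned from the matching member of the source
    Fraction& operator=(const Fraction& f)
    {
        m_numerator = f.m_numerator;
        m_denominator = f.m_denominator;
        return *this; // returning *this allows chaining, e.g. a = b = c
    }

    // Illustrative getters (assumed names)
    int getNumerator() const { return m_numerator; }
    int getDenominator() const { return m_denominator; }
};
```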

Note that because these default versions work just fine for copying this class, there’s really no reason to write our own version of these functions in this case.

However, when designing classes that handle dynamically allocated memory, memberwise (shallow) copying can get us in a lot of trouble! This is because a shallow copy of a pointer just copies the address held by the pointer -- it does not allocate any memory or copy the contents being pointed to!

Let’s take a look at an example of this:
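The following is a minimal sketch of such a class, assuming a char* member named m_data, an int member named m_length, and two illustrative getters (getString and getLength):

```cpp
#include <cassert>
#include <cstring> // for std::strlen

class MyString
{
private:
    char* m_data{ nullptr };
    int m_length{ 0 };

public:
    MyString(const char* source = "")
    {
        assert(source); // make sure source isn't a null pointer

        // Length of the string, plus one character for the terminator
        m_length = static_cast<int>(std::strlen(source)) + 1;

        // Allocate a buffer of exactly that length
        m_data = new char[m_length];

        // Copy the characters, including the '\0' terminator
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    ~MyString() // destructor: release the buffer we allocated
    {
        delete[] m_data;
    }

    // Deliberately no copy constructor and no copy assignment operator here

    const char* getString() const { return m_data; }
    int getLength() const { return m_length; }
};
```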

The above is a simple string class that allocates memory to hold a string that we pass in. Note that we have not defined a copy constructor or overloaded assignment operator. Consequently, C++ will provide a default copy constructor and default assignment operator that do a shallow copy. The copy constructor will look something like this:
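As a sketch only (the class body is an assumed reconstruction, and the compiler does not literally emit this source), a hand-written equivalent of that implicit copy constructor could look like this:

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{ nullptr };
    int m_length{ 0 };

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // Hand-written equivalent of the implicit copy constructor.
    // Only the ADDRESS stored in source.m_data is copied: no new buffer
    // is allocated and no characters are duplicated.
    // WARNING: if both objects are later destroyed, the buffer is deleted twice.
    MyString(const MyString& source)
        : m_data{ source.m_data }
        , m_length{ source.m_length }
    {
    }

    ~MyString()
    {
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
};
```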

Note that m_data is just a shallow pointer copy of source.m_data, meaning they now both point to the same thing.

Now, consider the following snippet of code:
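A sketch of that snippet, together with an assumed reconstruction of the class it uses, is shown below. It is wrapped in a function with the illustrative name shallowCopyProblem, and it is meant to be read rather than run, since calling it has undefined behavior.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>

class MyString
{
private:
    char* m_data{ nullptr };
    int m_length{ 0 };

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    ~MyString()
    {
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
};

// WARNING: calling this function has undefined behavior
void shallowCopyProblem()
{
    MyString hello{ "Hello, world!" };
    {
        MyString copy{ hello }; // use the default copy constructor
    } // copy gets destroyed here

    std::cout << hello.getString() << '\n'; // undefined behavior
}
```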

While this code looks harmless enough, it contains an insidious problem that results in undefined behavior (in practice, often a crash)! Can you spot it? Don’t worry if you can’t; it’s rather subtle.

Let’s break down this example line by line:

The first line, which constructs hello from the string “Hello, world!”, is harmless enough. It calls the MyString constructor, which allocates some memory, sets hello.m_data to point to it, and then copies the string “Hello, world!” into it.

The line inside the nested block, which initializes copy from hello, seems harmless enough as well, but it’s actually the source of our problem! When this line is evaluated, C++ will use the default copy constructor (because we haven’t provided our own). This copy constructor will do a shallow copy, initializing copy.m_data to the same address as hello.m_data. As a result, copy.m_data and hello.m_data are now both pointing to the same piece of memory!

When copy goes out of scope at the end of the nested block, the MyString destructor is called on copy. The destructor deletes the dynamically allocated memory that both copy.m_data and hello.m_data are pointing to! Consequently, by destroying copy, we’ve also (inadvertently) affected hello. Variable copy then gets destroyed, but hello.m_data is left pointing to the deleted (invalid) memory!

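Now you can see why this program has undefined behavior. We deleted the string that hello was pointing to, and the final line tries to print the contents of memory that is no longer allocated. On top of that, when hello itself is destroyed, its destructor will delete the same memory a second time.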

The root of this problem is the shallow copy done by the copy constructor -- doing a shallow copy on pointer values in a copy constructor or overloaded assignment operator is almost always asking for trouble.
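The aliasing itself can be observed safely with a type that does not own its buffer. The sketch below uses a hypothetical struct named View and a hypothetical function named copySharesBuffer. Because View has no destructor, nothing is deleted twice.

```cpp
#include <cassert>

// A deliberately simple struct with no destructor: it does not own what it points to
struct View
{
    char* m_data;
    int m_length;
};

// Returns true if a memberwise copy shares its buffer with the original
bool copySharesBuffer()
{
    char storage[] = "Hello";      // this local array owns the characters
    View original{ storage, 6 };
    View copy = original;          // implicit memberwise (shallow) copy

    copy.m_data[0] = 'J';          // write through the copy...

    // ...and the change is visible through the original, because both
    // pointers hold the same address
    return copy.m_data == original.m_data && original.m_data[0] == 'J';
}
```

Here sharing is harmless because neither View frees the array. The trouble in MyString comes from combining the same sharing with a destructor that deletes the buffer.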

Deep copying

One answer to this problem is to do a deep copy on any non-null pointers being copied. A deep copy allocates memory for the copy and then copies the actual value, so that the copy lives in distinct memory from the source. This way, the copy and source are distinct and will not affect each other in any way. Doing deep copies requires that we write our own copy constructors and overloaded assignment operators.

Let’s go ahead and show how this is done for our MyString class:
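What follows is a minimal sketch, assuming the same m_data and m_length members as before. Deleting the assignment operator is an assumption made for this sketch only, so that the class cannot be shallow-assigned before a deep-copying assignment operator exists.

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{ nullptr };
    int m_length{ 0 };

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // Deep-copying copy constructor
    MyString(const MyString& source)
        : m_length{ source.m_length } // m_length is not a pointer, so a plain copy is fine
    {
        // m_data is a pointer, so deep copy it if the source holds a string
        if (source.m_data)
        {
            // Allocate our own buffer...
            m_data = new char[m_length];

            // ...and copy the characters into it
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
        else
        {
            m_data = nullptr;
        }
    }

    // Assumption for this sketch: no assignment until a deep-copying version is written
    MyString& operator=(const MyString&) = delete;

    ~MyString()
    {
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
    int getLength() const { return m_length; }
};
```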

As you can see, this is quite a bit more involved than a simple shallow copy! First, we have to check to make sure source even has a string. If it does, then we allocate enough memory to hold a copy of that string. Finally, we have to manually copy the string, one character at a time.

Now let’s do the overloaded assignment operator. The overloaded assignment operator is slightly trickier:
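Here is a minimal sketch, again assuming the m_data and m_length members used so far. Resetting m_data to nullptr right after the delete is an extra precaution added in this sketch.

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{ nullptr };
    int m_length{ 0 };

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // Deep-copying copy constructor
    MyString(const MyString& source)
        : m_length{ source.m_length }
    {
        if (source.m_data)
        {
            m_data = new char[m_length];
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
    }

    // Deep-copying assignment operator
    MyString& operator=(const MyString& source)
    {
        // Check for self-assignment
        if (this == &source)
            return *this;

        // First deallocate any value that this string is already holding
        delete[] m_data;
        m_data = nullptr;

        // m_length is not a pointer, so a plain copy is fine
        m_length = source.m_length;

        // m_data is a pointer, so deep copy it if the source holds a string
        if (source.m_data)
        {
            m_data = new char[m_length];
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }

        return *this;
    }

    ~MyString()
    {
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
};
```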

Note that our assignment operator is very similar to our copy constructor, but there are three major differences:

  • We added a self-assignment check.
  • We return *this so we can chain the assignment operator.
  • We need to explicitly deallocate any value that the string is already holding (so we don’t have a memory leak when m_data is reallocated later).

When the overloaded assignment operator is called, the item being assigned to may already contain a previous value, which we need to make sure we clean up before we assign memory for new values. For non-dynamically allocated variables (which are a fixed size), we don’t have to bother because the new value just overwrites the old one. However, for dynamically allocated variables, we need to explicitly deallocate any old memory before we allocate any new memory. If we don’t, the code will not crash, but we will have a memory leak that will eat away our free memory every time we do an assignment!

A better solution

Classes in the standard library that deal with dynamic memory, such as std::string and std::vector, handle all of their own memory management, and have copy constructors and assignment operators that do proper deep copying. So instead of doing your own memory management, you can just initialize or assign them like normal fundamental variables! That makes these classes simpler to use and less error-prone, and you don’t have to spend time writing your own overloaded functions!
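As an illustration, here is a sketch of a hypothetical variant of MyString that stores its text in a std::string instead of a raw char* (the append function is an assumed helper, added only to show that copies are independent):

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of MyString built on std::string
class MyString
{
private:
    std::string m_data{};

public:
    MyString(const std::string& source = "")
        : m_data{ source }
    {
    }

    // No copy constructor, assignment operator, or destructor is written here.
    // The implicit memberwise versions call std::string's own copy operations,
    // and those already do a deep copy.

    const std::string& getString() const { return m_data; }

    // Assumed helper, used to show that a copy can change without affecting the source
    void append(const std::string& text) { m_data += text; }
};
```

With this version, copying a MyString and then modifying the copy leaves the original untouched, and no memory has to be released by hand.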

Summary

  • The default copy constructor and default assignment operators do shallow copies, which is fine for classes that contain no dynamically allocated variables.
  • Classes with dynamically allocated variables need to have a copy constructor and assignment operator that do a deep copy.
  • Favor using classes in the standard library over doing your own memory management.

75 comments to 9.15 — Shallow vs. deep copying

  • Winston Lai

    Hi, I was wondering if you delete m_data right before you wanna copy source.m_data to the new m_data, wouldn't the content of source.m_data disappear since you deallocate it before you try to copy the content to the newly created m_data?

    • Hi Winston!

      > source.m_data disappear since you deallocate it
      @source.m_data is not being deleted, only @this->m_data is being deleted.

      • Winston Lai

        Hi nascardriver,

        Are you saying that the pointer m_data itself is not being deleted, and only the memory that m_data points to is deallocated? Then, since the memory that "m_data" points to is deallocated, when you call if (source.m_data), isn't the pointer itself a dangling pointer, because you have deallocated the memory it used to point to? So if (source.m_data) would be false and the loop would never execute. Or should I put it this way, is the "m_data" in line 9 referring to the member variable of "copy" or "hello" that passed in as a source for assignment "operator="?

        • > is the "m_data" in line 9 referring to the member variable of "copy" or "hello" that passed in as a source for assignment "operator="?
          @copy and @hello aren't member variables, they are regular variables.
          Whenever you use member variables inside a class without specifying an object, you're accessing the local member variables, @this->m_data is being accessed.

  • From what it sounds like, you're creating the texture in the asteroid constructor. Not knowing your game I'd say it's unnecessary, because all asteroids (at least when they're the same type (1, 2, 3, 4)) look the same, so you shouldn't need more than 4 asteroid textures in memory. Anyway, if you're creating the texture in the constructor and you don't have a copy constructor you'll try to delete the same memory multiple times, which causes a crash. You need a copy constructor to clone the texture and not just copy the pointer, I don't know if d3d supports texture cloning.
    Please share the code of your copy constructor and the exact error message.
    Try fixing the current problem before changing to a 4 texture system so you can solve the problem yourself if it should ever come up again.

    • FelixPhill

      I've managed to create 2 copy constructors:

      That one makes the asteroids teleport and behave erratically, and:

      That one crashes the game after the first asteroid appears. When I don't provide a constructor it works normally.

      So what I should do is create the 4 textures at the start and use them for the asteroid constructors?

      • 1.

        You're copying pointers, that's the problem with default generated copy constructors, that's what you're trying to avoid.

        2.
        Line 21: You're using m_path, but you're not copying m_path

        > So what I should do is create the 4 textures at the start and use them for the asteroid constructors?
        Only after you fixed your problem.

  • FelixPhill

    Hello Alex and Nascardriver!
    Thank you for the great tutorial, I've learned a lot in the past month.
    I have a question regarding a game I wrote. It's a game in which you have a spaceship and have to avoid and shoot down asteroids.
    A new asteroid appears every 200ms, and they are stored in a vector called asteroids. When a new asteroid appears, the Sprite constructor is called and the asteroid is pushed back.

    And when an asteroid is destroyed it is erased:

    The problem is that every asteroid creates IDirect3DTexture9* p_Texture which needs to be Released() when it is erased from the vector. But when I write the destructor the program crashes. I can see the memory ramping up in task manager too the longer the game goes. I've tried writing a copy constructor but the program says that there is no copy constructor available.

    Here is the constructor:

  • jamal

    Hi Alex! In the 2nd code example (class Fraction), line 24, using the namespace of class Fraction for the definition of operator= is not necessary since it's all inside the class, or did I miss something?
    P.S.: Thanks a lot for these incredibly great tutorials!

  • João Gueifão

    Hi again Alex, I have a suggestion:

    Only after having presented the implementation for the assignment operator
    MyString& MyString::operator=(const MyString & source) {...}

    perhaps you could suggest readers to use a common private method named
    void deepCopy(const MyString & source)

    which would contain the common code shared among the assignment operator and the copy constructor, in order to avoid duplicate code?

    What do you think? Regards.
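A minimal sketch of this suggestion, assuming the m_data and m_length members from the lesson (the constructor and getter are reconstructed here for illustration), might look like this:

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{ nullptr };
    int m_length{ 0 };

    // Shared helper: replaces our contents with a deep copy of source.
    // Callers must make sure source is not *this.
    void deepCopy(const MyString& source)
    {
        delete[] m_data; // safe even when m_data is nullptr
        m_data = nullptr;

        m_length = source.m_length;

        if (source.m_data)
        {
            m_data = new char[m_length];
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
    }

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // m_data starts out as nullptr here, so deepCopy has nothing to release
    MyString(const MyString& source)
    {
        deepCopy(source);
    }

    MyString& operator=(const MyString& source)
    {
        if (this != &source) // self-assignment check
            deepCopy(source);

        return *this;
    }

    ~MyString()
    {
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
};
```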

  • Devanshu

    Shallow copy is done by bitwise copying, not by member copying.
    Member copying is used by deep copy.

    But you have mentioned that member copying is done by shallow copy -- 2nd line.

  • LoWesT

    Hey, just to be sure: with the line "Favor using classes in the standard library over doing your own memory management." you meant using std::string and/or std::vector instead of a char array/pointer, right?

    btw thx for this awesome "guide/tutorial"

    LoWesT

  • Omri

    Consider replacing:
    // copy gets destroyed here
    with something of the sort:
    // exiting code block here, causes copy to be destroyed here and with it hello.m_data to be deallocated damaging hello.

    It is very well explained further on but at its location the original remark is not easy to understand by the "unseasoned".

  • S

    I think in C++ we don't have shallow copies by default; it is all deep copy, and because of that
    we now have the move constructor.

    • Alex

      Incorrect.

      1) You still have shallow copies by default -- the MyString class above proves it.
      2) The move constructor has nothing to do with deep copies -- the move constructor avoids making a copy in the first place!
      3) Move semantics can only be invoked if the source object is an r-value. When we're defining a copy constructor or a copy assignment operator, the source is an l-value. So we can't use move semantics here.

      • S

        we have misunderstanding here,
        move constructor comes to stopping deep copy
        in other word move constructor does shallow copy
        lets have an  ex
        about how shallow copy works (in python and cpp)
        note =every data struct by default in cpp is on stack

        It is pretty obvious that the addresses are not the same,
        so we don't have shallow copies by default in C++.
        There is a solution: use pointers to get this concept in your own code without dealing with the heap.

  • Satwant

    Deep copying second paragraph.

    Let’s go ahead and show how this is done for our MyString class: ❌
            
    Let’s go ahead and see how this is done for our MyString class:   ✔

  • john

    I think cHello.m_data should be hello.m_data

  • Himanshu

    Person *p = new Person(20);
    Person *q = new Person(10);

    p = q;

    delete q;
    q = NULL;

    p->Display();

    where I am displaying the data member value inside the Display() function.

    In this scenario, Deep copy is also required.
    Alex, can you please clarify ?

    • Alex

      I'm not sure I understand what you're asking. Can you clarify?

      • Himanshu

        I am printing data member value (age) in display function (p->Display(); ) and it should print 10.

        But getting 0 which is default in constructor (seems data corrupted).

        I believe we need to write our own assignment operator in this case.
        If yes what would be functionality inside assignment operator since data member (age) is variable not pointer.

        Alex, plz clarify?

        • Alex

          I'm still confused. You have two Person objects, p and q. You then set pointer p to point at q (which means the original Person(20) is now leaked memory). You then delete q, which means both p and q are dangling pointers. You set q to NULL, but you do not set p to NULL, and then you try and call member function Display() using p, which is still a dangling pointer.

          Instead of "p = q", did you mean "*p = *q"?

  • Oeleo

    A round bracket is missing at the second line : "as a shallow copy)". And thanks for your lessons !

  • hughwang

    using default copy constructor should be as follows,

  • Alex

    Variable copy is a local variable, so it goes out of scope at the end of the block it's declared in. When copy gets deallocated, the ~MyString destructor will be run on copy, deleting copy.m_data.

    This means hello.m_data (which was pointing to the same memory address) is now a dangling pointer.

  • The MyString code example that is supposed to produce bad output will not compile because the
    source.m_data[i]; should actually read
    source[i];

    • Alex

      Fixed! Thanks for pointing that out.

      • michael

        MyString copy = hello; // use default copy constructor
            } // copy gets destroyed here

        Alex, can you elaborate on how copy.m_data goes out of scope and is destroyed? Isn't it just a copy of hello that points to the same memory address?

  • umair

    Can you explain these lines(13-15) in your program under deep copying

    shouldn't this be source.m_data[i]?

  • sitaram chhimpa

    There is an error in the first copy constructor example.

    How can you assign m_numerator directly? I think this is a typo.

  • Kattencrack Kledge

    Typo:

    In this part of the constructor of the MyString class:

    You accidentally wrote "\n" instead of "\0", which leads to garbage chars when printed out!

  • Mato

    In your examples we see copying of member variables. What about copying member functions?

  • Gopal

    Hi Alex,

    "However, for dynamically allocated variables, we need to explicitly deallocate any old memory before we allocate any new memory. If we don’t, the code will not crash, but we will have a memory leak that will eat away our free memory every time we do an assignment!"

    Can you explain how memory leak happens? Actually it should not as we are overwriting it.

    • Alex

      When we dynamically allocate memory, we assign the address to a pointer so we can access that memory, right? We can also use that pointer to deallocate the memory. If the pointer holding the address of the dynamically allocated memory goes out of scope or gets assigned a different value, then we no longer have any way to deallocate that memory. The memory is thus lost (leaked) until the application closes and the operating system cleans up.

      Thus, we need to deallocate the memory before we assign a new value to the pointer.

  • Pramod Gaikwad

    Thank you very much Alex !!!! :)

    It is the nicest C++ tutorial and learning tool I have ever seen. Many OOP concepts have become clear so far. Much more to come ..........

  • Quang

    MyString cHello("Hello, world!");

    {
        MyString cCopy = cHello; // use default copy constructor
    } // cCopy goes out of scope here

    std::cout << cHello.GetString() << std::endl; // this will crash

    I don't understand why there is a block here and what it is used for.
    Thank you for reading Alex!

  • gulfam

    Very good help in learning OOP

  • KM

    Very nice explanation. Made many things clear. Thank you!

  • enoquick

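    An interesting technique for operator=() is:

    T& T::operator=(const T& x) {
        T t(x);   // copy constructor; if it throws, *this is left untouched
        swap(t);  // exchange contents with the temporary; cannot throw
        return *this;
    } // t is destroyed here, taking the old contents with it

    void T::swap(T&) noexcept; // exchanges the individual members -- must not throw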

  • really very good one and nicely descriptive Thanks

  • prc

    My question is, assuming we didn't use C-style strings, and instead just used std::string (or for that matter, assumed we just use any object on the heap that wasn't an array), would the proper way to deep copy it be this:

    or is there a better way that we should go about this?

    • Alex

      If you're using std::string you can just do a direct assignment:
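For instance, in a hypothetical class that holds a std::string member (the names Label, m_text and getText are assumptions made for illustration), the assignment operator could be sketched as:

```cpp
#include <cassert>
#include <string>

// Hypothetical class with a std::string member
class Label
{
private:
    std::string m_text{};

public:
    Label(const std::string& text = "")
        : m_text{ text }
    {
    }

    Label(const Label& source) = default;

    Label& operator=(const Label& source)
    {
        m_text = source.m_text; // direct assignment; std::string copies its own buffer
        return *this;
    }

    const std::string& getText() const { return m_text; }
};
```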

      The overloaded assignment operator that std::string comes with will handle all of the memory management for you.

  • Phil Braun

    In the following example pulled from the article, there seems to be two problems that can occur.

    The first problem is the line "delete[] m_pchString;". If "m_pchString" is not allocated, could this cause an exception? Would it not be better to check if "m_pchString" is valid before attempting to delete the memory?

    The second problem occurs if "cSource.m_pchString" is a zero length string. Would it not be better to assign NULL to m_pchString if "cSource.m_pchString" has zero length? Oh, and what happens when an attempt is made to create a zero length string in the "new" statement and when "strncpy" attempts to copy a zero length string?

    Otherwise this code is a good learning tool.

    Phil

    • Alex

      if m_pchString is pointing to null, deleting it won't do anything (good or bad). If it's pointing to deallocated memory, then you've got bigger problems already.

      I added a null string check to the constructor, so it's no longer possible to even create a 0-length string. The smallest string you could create would be length 1 (just a null terminator).

  • sergk

    I'd like to note that, to avoid problems with inherited classes, one has to make the destructor and assignment operators virtual.

    -- serg.
