13.16 — Shallow vs. deep copying

Shallow copying

Because C++ does not know much about your class, the default copy constructor and default assignment operators it provides use a copying method known as a memberwise copy (also known as a shallow copy). This means that C++ copies each member of the class individually (using the assignment operator for overloaded operator=, and direct initialization for the copy constructor). When classes are simple (e.g. do not contain any dynamically allocated memory), this works very well.

For example, let’s take a look at our Fraction class:
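A minimal sketch of such a Fraction class follows. It assumes two int members named m_numerator and m_denominator; the getNumerator() and getDenominator() accessors are included only for illustration.

```cpp
#include <cassert>
#include <iostream>

class Fraction
{
private:
    int m_numerator{ 0 };
    int m_denominator{ 1 };

public:
    // Default constructor (also accepts a numerator and an optional denominator)
    Fraction(int numerator = 0, int denominator = 1)
        : m_numerator{ numerator }
        , m_denominator{ denominator }
    {
        assert(denominator != 0);
    }

    // Accessors added for illustration only
    int getNumerator() const { return m_numerator; }
    int getDenominator() const { return m_denominator; }

    friend std::ostream& operator<<(std::ostream& out, const Fraction& f)
    {
        out << f.m_numerator << '/' << f.m_denominator;
        return out;
    }
};
```

Note that the class declares no copy constructor or assignment operator, so the compiler supplies both.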

The default copy constructor and assignment operator provided by the compiler for this class look something like this:
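As a rough sketch, here is the same class with the memberwise versions written out by hand. It reuses the member names assumed above; the self-assignment guard is not something the compiler-generated operator includes, but it is harmless and is shown as good habit.

```cpp
#include <cassert>

class Fraction
{
private:
    int m_numerator{ 0 };
    int m_denominator{ 1 };

public:
    Fraction(int numerator = 0, int denominator = 1)
        : m_numerator{ numerator }
        , m_denominator{ denominator }
    {
        assert(denominator != 0);
    }

    // Roughly what the implicit copy constructor does:
    // each member is initialized from the matching member of the source
    Fraction(const Fraction& f)
        : m_numerator{ f.m_numerator }
        , m_denominator{ f.m_denominator }
    {
    }

    // Roughly what the implicit copy assignment operator does:
    // each member is assigned from the matching member of the source
    Fraction& operator=(const Fraction& fraction)
    {
        // self-assignment guard (an addition; not needed for plain ints)
        if (this == &fraction)
            return *this;

        m_numerator = fraction.m_numerator;
        m_denominator = fraction.m_denominator;

        // return the existing object so we can chain this operator
        return *this;
    }

    // Accessors added for illustration only
    int getNumerator() const { return m_numerator; }
    int getDenominator() const { return m_denominator; }
};
```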

Note that because these default versions work just fine for copying this class, there’s really no reason to write our own version of these functions in this case.

However, when designing classes that handle dynamically allocated memory, memberwise (shallow) copying can get us in a lot of trouble! This is because a shallow copy of a pointer just copies the address held by the pointer; it does not allocate any memory or copy the contents being pointed to!

Let’s take a look at an example of this:
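A minimal sketch of such a class, reconstructed from the description in this lesson (a char pointer m_data, a length m_length, a constructor that allocates, and a destructor that deallocates). The exact details, including the getString() and getLength() accessors, are assumptions.

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{};
    int m_length{};

public:
    MyString(const char* source = "")
    {
        assert(source); // make sure source isn't a null pointer

        // Find the length of the string, plus one character for the terminator
        m_length = static_cast<int>(std::strlen(source)) + 1;

        // Allocate a buffer equal to this length
        m_data = new char[m_length];

        // Copy the parameter string (terminator included) into our buffer
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    ~MyString() // destructor
    {
        // We need to deallocate our string
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
    int getLength() const { return m_length; }
};
```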

The above is a simple string class that allocates memory to hold a string that we pass in. Note that we have not defined a copy constructor or overloaded assignment operator. Consequently, C++ will provide a default copy constructor and default assignment operator that do a shallow copy. The copy constructor will look something like this:
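The sketch below writes that shallow copy constructor out by hand inside a cut-down version of the class (member names as assumed above). It is meant to show what the compiler does for us, not something to imitate: copying an object of this class leads to the same buffer being deleted twice.

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{};
    int m_length{};

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // Equivalent of the compiler-provided copy constructor:
    // every member is copied as-is, including the raw address in m_data
    MyString(const MyString& source)
        : m_data{ source.m_data }     // copies the address only, not the characters
        , m_length{ source.m_length }
    {
    }

    ~MyString()
    {
        delete[] m_data;
    }

    const char* getString() const { return m_data; }
};
```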

Note that m_data is just a shallow pointer copy of source.m_data, meaning they now both point to the same thing.

Now, consider the following snippet of code:
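The snippet can be sketched as follows, using the MyString class assumed above. It is wrapped in a function named shallowCopyProblem() (a hypothetical name) so that it is clear the code should not actually be run: calling it has undefined behavior.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>

class MyString
{
private:
    char* m_data{};
    int m_length{};

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // No copy constructor or assignment operator: the compiler provides shallow ones

    ~MyString() { delete[] m_data; }

    const char* getString() const { return m_data; }
};

// WARNING: calling this function has undefined behavior. It is shown to illustrate the bug.
void shallowCopyProblem()
{
    MyString hello{ "Hello, world!" };
    {
        MyString copy{ hello }; // uses the default (shallow) copy constructor
    } // copy is destroyed here; its destructor deletes the buffer hello still points to

    std::cout << hello.getString() << '\n'; // reads memory that has already been deleted
}   // hello's destructor then deletes the same buffer a second time
```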

While this code looks harmless enough, it contains an insidious problem that will cause the program to exhibit undefined behavior (often a crash)! Can you spot it? Don’t worry if you can’t; it’s rather subtle.

Let’s break down this example line by line:

The first line, which constructs hello, is harmless enough. It calls the MyString constructor, which allocates some memory, sets hello.m_data to point to it, and then copies the string “Hello, world!” into it.

The next line, which initializes copy from hello, seems harmless enough as well, but it’s actually the source of our problem! When this line is evaluated, C++ will use the default copy constructor (because we haven’t provided our own). This copy constructor will do a shallow copy, initializing copy.m_data to the same address as hello.m_data. As a result, copy.m_data and hello.m_data are now both pointing to the same piece of memory!

When copy goes out of scope, the MyString destructor is called on copy. The destructor deletes the dynamically allocated memory that both copy.m_data and hello.m_data are pointing to! Consequently, by deleting copy, we’ve also (inadvertently) affected hello. Variable copy then gets destroyed, but hello.m_data is left pointing to the deleted (invalid) memory!

Now you can see why this program has undefined behavior. We deleted the string that hello was pointing to, and now we are trying to print the value of memory that is no longer allocated.

The root of this problem is the shallow copy done by the copy constructor -- doing a shallow copy on pointer values in a copy constructor or overloaded assignment operator is almost always asking for trouble.

Deep copying

One answer to this problem is to do a deep copy on any non-null pointers being copied. A deep copy allocates memory for the copy and then copies the actual value, so that the copy lives in distinct memory from the source. This way, the copy and source are distinct and will not affect each other in any way. Doing deep copies requires that we write our own copy constructors and overloaded assignment operators.

Let’s go ahead and show how this is done for our MyString class:
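One way the deep-copying copy constructor might look, as a sketch built on the member names assumed earlier. The copy assignment operator is deleted in this sketch only so that the class cannot be shallow-assigned by accident; that is an illustrative choice, not part of the lesson.

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{};
    int m_length{};

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    // Deep-copying copy constructor
    MyString(const MyString& source)
        : m_length{ source.m_length } // m_length is not a pointer, so a plain copy is fine
    {
        // m_data is a pointer, so we need to deep copy it if it is non-null
        if (source.m_data)
        {
            // allocate memory for our copy
            m_data = new char[m_length];

            // do the copy, one character at a time
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
        // otherwise m_data keeps its default member initializer (nullptr)
    }

    // Assumption for this sketch: forbid assignment until a deep-copying version exists
    MyString& operator=(const MyString&) = delete;

    ~MyString() { delete[] m_data; }

    const char* getString() const { return m_data; }
};
```

With this constructor in place, the copy owns its own buffer, so destroying the copy no longer affects the source.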

As you can see, this is quite a bit more involved than a simple shallow copy! First, we have to check to make sure source even has a string. If it does, then we allocate enough memory to hold a copy of that string. Finally, we have to manually copy the string.

Now let’s do the overloaded assignment operator. The overloaded assignment operator is slightly trickier:
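A sketch of how this could look, again using the assumed member names. Here the copying logic lives in a private deepCopy() helper shared by the copy constructor and the assignment operator; resetting m_data and m_length before allocating is an extra precaution added in this sketch.

```cpp
#include <cassert>
#include <cstring>

class MyString
{
private:
    char* m_data{};
    int m_length{};

    // Assumes m_data is initialized (either nullptr or a buffer we own)
    void deepCopy(const MyString& source)
    {
        // first deallocate any value that this string is holding
        delete[] m_data;
        m_data = nullptr;
        m_length = 0;

        // m_data is a pointer, so deep copy it if the source has a string
        if (source.m_data)
        {
            m_data = new char[source.m_length];
            m_length = source.m_length;

            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
    }

public:
    MyString(const char* source = "")
    {
        assert(source);
        m_length = static_cast<int>(std::strlen(source)) + 1;
        m_data = new char[m_length];
        for (int i{ 0 }; i < m_length; ++i)
            m_data[i] = source[i];
    }

    MyString(const MyString& source)
    {
        deepCopy(source); // m_data is nullptr here, so the delete[] inside is a no-op
    }

    MyString& operator=(const MyString& source)
    {
        // check for self-assignment, otherwise we would delete our own data
        if (this != &source)
            deepCopy(source);

        // return the existing object so the operator can be chained
        return *this;
    }

    ~MyString() { delete[] m_data; }

    const char* getString() const { return m_data; }
};
```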

Note that our assignment operator is very similar to our copy constructor, but there are three major differences:

  • We added a self-assignment check.
  • We return *this so we can chain the assignment operator.
  • We need to explicitly deallocate any value that the string is already holding (so we don’t have a memory leak when m_data is reallocated later).

When the overloaded assignment operator is called, the item being assigned to may already contain a previous value, which we need to make sure we clean up before we allocate memory for new values. For non-dynamically allocated variables (which are a fixed size), we don’t have to bother because the new value just overwrites the old one. However, for dynamically allocated variables, we need to explicitly deallocate any old memory before we allocate any new memory. If we don’t, the code will not crash, but we will have a memory leak that will eat away our free memory every time we do an assignment!

A better solution

Classes in the standard library that deal with dynamic memory, such as std::string and std::vector, handle all of their own memory management, and have copy constructors and assignment operators that do proper deep copying. So instead of doing your own memory management, you can just initialize or assign them like normal fundamental variables! That makes these classes simpler to use and less error-prone, and you don’t have to spend time writing your own overloaded functions!
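To illustrate, here is a small sketch of a class that stores its data in std::string and std::vector members. The class name Message and all of its members are hypothetical. Because the members know how to deep copy themselves, the compiler-provided memberwise copy is already correct and we write no copy functions at all.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class for illustration
class Message
{
private:
    std::string m_text{};
    std::vector<int> m_codes{};

public:
    Message(const std::string& text, const std::vector<int>& codes)
        : m_text{ text }
        , m_codes{ codes }
    {
    }

    // No copy constructor, assignment operator, or destructor needed:
    // memberwise copying invokes the deep-copying operations of the members

    void setText(const std::string& text) { m_text = text; }
    void addCode(int code) { m_codes.push_back(code); }

    const std::string& getText() const { return m_text; }
    std::size_t getCodeCount() const { return m_codes.size(); }
};
```

Copies of a Message are fully independent: changing the text or codes of one leaves the other untouched.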


Summary

  • The default copy constructor and default assignment operators do shallow copies, which is fine for classes that contain no dynamically allocated variables.
  • Classes with dynamically allocated variables need to have a copy constructor and assignment operator that do a deep copy.
  • Favor using classes in the standard library over doing your own memory management.


124 comments to 13.16 — Shallow vs. deep copying

  • Waldo Lemmer

    - `:` on next line:

    - type <&>/<*>identifier:

  • Rishi

    Just thought this could help you understand "copy and swap" idiom. Have a nice day ^-^

  • abel

    Please remove my first comment.
    NOTE: it makes sense to null the pointer immediately after deleting it.

    // assumes m_data is initialized
    void MyString::deepCopy(const MyString& source)
    {
        // first we need to deallocate any value that this string is holding!
        delete[] m_data;
        m_data = nullptr;

        // because m_length is not a pointer, we can shallow copy it
        m_length = source.m_length;

        // m_data is a pointer, so we need to deep copy it if it is non-null
        if (source.m_data)
        {
            // allocate memory for our copy
            m_data = new char[m_length];

            // do the copy
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
    }

    • nascardriver

      Your code writes to `m_data` twice (Once to set it to `nullptr`, once again to set it to the new array). When you move `m_data = nullptr` to the `else` part, you're only writing to it once.

      • abel

        "Your code writes to `m_data` twice" -- It is normal practice. Please check it.

        1) You should null a pointer after freeing its memory (delete[] m_buf).
        2) You can forget to null the pointer in the "else" branch.

        Be careful with code.

      • suman

        Here, shallow copying is done using the default copy constructor when we write String str2 = str1;
        Now the pointer members of both objects point to the same memory location. When we do str2.change('geeksforgeeks'), we free the location and then point str2's pointer at a new location. But how does the pointer of str1 also end up pointing to that same new location? Shouldn't the pointer of str1 still be pointing to the earlier location?

  • abel

    // assumes m_data is initialized
    void MyString::deepCopy(const MyString& source)
    {
        // first we need to deallocate any value that this string is holding!
        delete[] m_data;

        // because m_length is not a pointer, we can shallow copy it
        m_length = source.m_length;
        m_data = nullptr;

        // m_data is a pointer, so we need to deep copy it if it is non-null
        if (source.m_data)
        {
            // allocate memory for our copy
            m_data = new char[m_length];

            // do the copy
            for (int i{ 0 }; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
    }

  • Robin

    I think the deepCopy method will cause unexpected behavior if you call MyString myString{"Something"}; myString.deepCopy(myString); because there is no self assignment-like brace! Unless the method is private so it can't be called (which is still not extremely safe) or perhaps the const parameter input is smart enough to not allow deleting itself indirectly, I am still learning so I don't know.

  • CC

    I think you can simplify the code of your assignment operator overload (DRY) from this

    to this

  • CC

    In your copy constructor, you initialise `m_data` with `nullptr`:

    Is there any particular reason why you do this? Is it because you want `m_data` to point to something when you call `delete`?

    • nascardriver

      This lesson didn't follow the always-initialize-everything recommendation. If the constructor didn't initialize `m_data`, `deepCopy` would try to `delete` an invalid pointer, which causes undefined behavior. (Deleting a `nullptr` is a no-op).

      I've updated the lesson to initialize all members at their declaration. Now the constructor doesn't have to do it manually anymore.

  • Yolo

    Hello guys. I tried to write the full program of this lesson and I have a question about it. Here is the code.

    1) Error C6386: Buffer overrun while writing to m_data: the writable size is m_length*1 bytes, but two bytes might be written. How do I fix this error, and what does it really mean?

    If I did something else wrong too, I would be glad to hear it.

    • nascardriver

      - `getLength()` doesn't return the length of the string
      - Functions that don't modify a member should be `const`
      - Line 50 is always `true`

      1) C6386 is a warning. It means that `m_data` is an array with 1 element but you're accessing 2 elements. You're not doing that though, this warning is wrong.

      • Yolo

        I corrected those mistakes but I still get this error:
        "HEAP CORRUPTION DETECTED. CRT detected that the application wrote to memory after end of heap buffer."

        What should I do about it?

        • nascardriver

          You've got a problem in your code that causes this. I'm not going to tell you what it is, because your compiler should have told you. Either you didn't enable warnings, or you're ignoring them. Enable warnings, read them, and your mistake is obvious.

  • Tim

    Do we need the condition "if (source.m_data)" in the code below as we have assert(source) defined in the constructor? So, source can't be NULL.
    We also already set m_data=nullptr in deepCopy, why do we do that again in copy constructor?

  • Gamer_to_be

    Why can't we have the copy constructor format and used different member function like 'deepCopy'?
    I meant, can't we just have the following? When we have our own copy constructor, then the default one won't be called to do shallow copies.

  • ruchika malhotra

    In the above code if I use delete p to prevent memory leak I am getting

    *** Error in `./a.out': double free or corruption (out): 0x0000000000400890 ***

    Can you please explain why ?


  • Victor Keilhack

    Could we not just put the deepCopy()-contents into the operator= function and then use it instead of deepCopy() in the copy constructor?

    Interestingly, the compiler didn't complain about using the this-pointer in a constructor. So that means even constructors have the this pointer?

  • choofe

    Hi. I just want to be sure:
    In @deepCopy line 5 when you delete m_data, what is your intention?
    Is it because it contains a garbage address and we want to initialize it first?
    Or do we want it to be nullptr?
    It shouldn't be to avoid self-assignment troubles; we've already done the check in the = overload when calling @deepCopy!
    Or may it cause some other trouble if we don't do so in some other cases?
    I've checked:
    without that line the code runs fine. (I assume that even if it is garbage we will give it a proper address value.)
    And for the sake of the matter, I've replaced line 5 with

    and it also runs right.
    Is it safe to say that both assigning nullptr and deleting m_data are the same (in this context of course)? And also the nullptr assignment here won't cause a memory leak, will it?

    • nascardriver

      `deepCopy` assumes that `m_data` is initialized, ie. either it's a `nullptr`, in which case `delete[]` has no effect, or it's pointing to dynamically allocated memory, so we need to `delete[]` it to avoid a leak.

  • kavin

    Under the deep copy example, line 5, why do you delete m_data, which is a null pointer? (Being a null pointer, it already contains nothing, right?)

  • sito

    In this lesson, for the MyString class, why do you have the char as const in the constructor? I tried without const and I got a conversion error.

  • inspectorPlatypus

    // self-assignment guard
        if (this == &fraction)
            return *this;

    Hey Alex, can you clarify what these lines are doing? I have a bit of trouble understanding them.

    • nascardriver

      It checks if `fraction` is the same object as `this`. Not comparing the contents of the fractions, but comparing their addresses. This is covered in lesson 9.14.

  • mmp52


    in the below snippet of yours, you delete the m_data at 5th line of the MyString::deepCopy(..) function, but still when you define copy constructor you assign m_data(nullptr) at 26th line knowing that it will call deepCopy and be deleted , why is that?


    • All members should be initialized during construction.
      `delete[]` doesn't modify the pointer, it deletes the pointed-to object. Since `nullptr` doesn't point anywhere, `delete[]` has no effect.
      If `m_data` wasn't initialized, `delete[]` would try to delete an invalid pointer, causing undefined behavior.

  • Arthur

    Hi. Some typos after first code snippet showing deep copying.
    As you can see, this is quite a bit more involved than a simple shallow copy! First, we have to check to make sure source even has a string (line 8(must be 11)). If it does, then we allocate enough memory to hold a copy of that string (line 11(must be 14)). Finally, we have to manually copy the string (lines 14 and 15(must be 17 and 18)).

    • Ejamesr

      Alex, the typos Arthur pointed out still remain to be fixed (i.e., the line numbers referenced in your description need to be adjusted).

      And as so many others have said, thank you for such a finely detailed, logical, and orderly overview of C++!

  • DecSco

    Wouldn't it be preferable to put the deep copying logic into a function to avoid code duplication?

    • Alex

      Yes, though deepCopy should handle an already allocated m_data -- otherwise calling the function could cause the original m_data to be leaked. I've updated the lesson accordingly.

  • Paulo Filipe

    In this code snippet present above in this lesson:

    Shouldn't line 18 be:


  • Chandra Shekhar

    MyString::MyString(const MyString& source)
    {
        // because m_length is not a pointer, we can shallow copy it
        m_length = source.m_length;

        // m_data is a pointer, so we need to deep copy it if it is non-null
        if (source.m_data)
        {
            // allocate memory for our copy
            m_data = new char[m_length];

            // do the copy
            for (int i=0; i < m_length; ++i)
                m_data[i] = source.m_data[i];
        }
        else
            m_data = 0;
    }
    In the above code for deep copy,
    m_length = source.m_length should have been m_length = source.m_length+1 to allocate one extra byte to store the null character '\0'. Right?

  • Richard

    std::vector tends to be a little slow and std::array has some of the limitations of C-style arrays. While the C++ standard library provides the tools of first choice, occasionally a class needs to perform its own dynamic memory management.

    Here's the skeleton of a container class that encapsulates a dynamic array and defines its special member functions: an all-in-one illustration of tutorial concepts up to "Inheritance".  To keep it simple, I've substituted a global const MAX for a potential member variable, m_length.  The latter would be required for flexible-sized arrays.

    To demonstrate how the compiler handles the move constructor and move assignment class functions, I've inserted calls to a function printAp(), which display the addresses of selected objects.  Calls to printAp are bracketed in the class functions to mark them for later removal.  To force the creation of an anonymous (temporary) object, a member function reverse() returns an object of the class.  Main() serves as a test program as discussed below:

    Main does 2 things.  Superficially, it creates objects A, B, C, and D.  These objects encapsulate unsigned char arrays.  Main sets and retrieves values of the first or last elements of the arrays during the test of each class member function.  On a deeper level, main shows the behavior of the move constructor and move assignment member functions using the "print a pointer" function, printAp().  The console output on my machine using Visual Studio 2017 in DEBUG mode:

        Default Construc, 0

        Copy Construc, 1

        First Assign, 3

        Chain Assign, 3

        in reverse, Temp address = 0041F5E4
        in move construc, rhs address = 0041F5E4
        in move construc, this address = 0041F774
        Move Construc, 5

        in reverse, Temp address = 0041F5E4
        in move construc, rhs address = 0041F5E4
        in move construc, this address = 0041F66C
        in move assign, rhs address = 0041F66C
        in move assign, this address = 0041F78C
        Move Assign, 3

        object B address = 0041F78C
        object C address = 0041F780
        object D address = 0041F774

    This output is far simpler than it first appears.  There's text to show the purpose of each test, and to print an element of the encapsulated array to verify the result.  So far so dull.

    The rest is interesting.  The third-to-last block of text shows a call to my move constructor for the statement D = C.reverse().  What I want to do is create D and move the result of reverse() into it.  To accomplish this, the compiler calls my move constructor for D and hands me an r-value reference to the object "Temp", which I created in reverse() for a return value.  Smart!  Within the move constructor, "this" contains the address of D, and &rhs contains the address of "Temp".  As Temp is already constructed, all I really need to do is "capture" its data.  My move constructor accomplishes this task by first assigning Temp's m_ptr to D's, then assigning NULL to Temp's m_ptr (effectively marking the latter unusable upon its release to the heap).  Voila!  So far so good.

    But what's all this mess for B = D.reverse()?  B already exists, so what I want to do is move the result of reverse() directly to it.  The compiler does this in 2 steps.  First, it creates an anonymous object, then it calls my move constructor to capture Temp's data to it.  Just like it did for D above.  What?  Alright, we get it:  move constructor is named a "constructor" for a reason.  After capturing "Temp" as an anonymous object, the compiler calls my move assignment function, for the purpose of moving the contents of its anonymous object into my preexisting variable B.  Notice in the move assignment function, the anonymous object is passed as an r-value reference, and you can trace its address from "this" (in the move constructor) to &rhs (a parameter in move assignment).  Now, within move assignment, "this" points to my existing object, B.  The contents of object B will be replaced, so I delete[] its dynamic memory.  Then another rehash:  I "capture" the anonymous object's data by first assigning the memory ptr of the anonymous object to B's m_ptr, followed by assigning NULL to rhs.m_ptr (marking the latter as unusable upon its release to the heap).

    I'd prefer doing the B = D.reverse() in one move-assignment step.  Here's the output from the RELEASE build with optimization, skipping up to "Chain Assign":

        Chain Assign, 3

        in reverse, Temp address = 003AF9DC
        Move Construc, 5

        in reverse, Temp address = 003AF9D8
        in move assign, rhs address = 003AF9D8
        in move assign, this address = 003AF9E8
        Move Assign, 3

        object B address = 003AF9E8
        object C address = 003AF9E4
        object D address = 003AF9DC

    The optimized .exe fulfills my wish:  it calls my move assignment function with "this" pointing to B, while passing an r-value reference to "Temp" directly.  What about my D = C.reverse()?  My move constructor isn't even called!  D is simply given Temp's address.  Hurray for elision!

    The reason for looking "under-the-hood" of this process is to know what demands Array objects place on the heap.  reverse() passes its return "by-value", but it does not consume any significant heap storage beyond what is required to generate an object for the return (in my case, "Temp").  This understanding is important when MAX is expanded to the thousands.

    Alex makes this clear:  as a member variable of Array class, m_ptr keeps the code RAII compliant, so (hopefully) no dangling pointer runs amuck.  The big disadvantage of doing one's own memory management is that less-than-meticulous code (like not checking ranges) can corrupt the heap.

  • Asgar


    About this line in the first paragraphs:
    "... C++ copies each member of the class individually (using the assignment operator for overloaded operator=, and direct initialization for the copy constructor)."

    Did you mean to say "initialization" rather than "direct initialization"? As I understand, a copy constructor is invoked during any of the 3 kinds of initialization:
    1. copy initialization,
    2. direct initialization, and
    3. uniform initialization

    • Alex

      In this context, we're talking about copying the _members_ of a class. These members are initialized using direct initialization (as opposed to copy initialization).

  • Jon

    Hello again, I can't quite figure something out - for the following code snippet you wrote:

    If I delete the brackets on lines 4 and 6, the program compiles, but I get the runtime error: Test(50460,0x10039b380) malloc: *** error for object 0x100403700: pointer being freed was not allocated.

    I was under the impression that this would run OK because the variables would all stay in scope until the end of the program and because deleting a null pointer has no effect (is that not what this error message is pointing to?) but it looks like I'm missing something. It does print out "Hello World!" but something is not quite right...

    • Jon

      Think I've got it, I believe I conflated null pointer and uninitialized pointer?

      • All pointers in this example have been initialized, neither is a nullptr.
        Both @hello and @copy point to the same char array. Once one of @hello or @copy goes out of scope, the char array will be deleted. The other variable will still point to where the char array used to be. When the other variable goes out of scope, the char array will be deleted again, but it no longer exists. Behavior is undefined.

  • Hi Alex!

    Third code block, line 17 should use @std::strlen

  • seriouslysupersonic


    1) Since

    aren't we sure that we will never be copying / assigning a non-null string (if the allocation was successful)?

    2) Shouldn't we assign @m_length after allocating new memory and checking @m_data is non-null?

    3) Is there a reason not to use strcpy() and use a for loop?

    • Hi!


      The order doesn't matter


      • seriouslysupersonic

        Thank you for the quick reply.

        1) If we do that, wouldn't the assertion fail and we wouldn't be able to copy / assign a null source.m_data field - because (unless memory allocation failed) no object with a null m_data field would be created? Shouldn't we check instead that new didn't fail?

        2) But if @source.m_data is null because the memory allocation of the source object failed, we are assigning whatever length the source believes it successfully allocated when the source object was created while we assign a nullptr to @m_data.

        • 1) There is no @source object; @source is a const char*. I might have misunderstood your question. Can you elaborate on (1)?
          > Shouldn't we check instead that new didn't fail?
          All we can do is check for nullptr. If allocation fails without the noexcept version of @new, an exception is thrown and this constructor won't be reached anyway.

          2) If @source failed to allocate memory, it will have thrown an exception and you shouldn't be able to use the source object anymore. But I agree with you, setting the length after verifying that there actually is data is better.

          • seriouslysupersonic

            1) I think I wasn't too clear in the beginning (sorry for my English). What I meant was: every MyString object seems to have a non-null m_data field because we either assert that in the const char* constructor or, as you just explained, if new fails, the constructor won't be reached anyway. If that's true, why do we then check

            in the copy constructor and assignment operator overload?

            2) This question doesn't make much sense because I forgot new would throw an exception and then we would have to handle that. Thanks for the explanation!
