
Be Explicit – Assignment Operators & Copy Constructors

C++ can have some interesting behaviour, sometimes expected and sometimes not so. It’s one of the reasons I enjoy using the language (maybe I like punishment) but it’s one of the reasons many people don’t.

One of the often overlooked aspects of C++ is how easily classes can be copied, incorrectly, without you knowing about it. If you don’t declare a copy constructor and an assignment operator yourself, the compiler will generate them for you.

This post intends to look at how you can avoid this by making sure you are explicit when creating your C++ classes, through the use of copy constructors and assignment operators. To look at how these can be used, I’ll go through a possibly contrived example, which will show exactly where these members come into play and how not knowing how they work can generate behaviour you don’t expect, or don’t want.

A Vector Implementation
To show how and when these C++ features can be a help or a hindrance I’m going to use a simple example, creating an implementation of a basic vector container. While you might not do this yourself, it will easily show how we can use various member functions to make the object behave exactly how we want it to.

So to start with the basics, this is what our implementation of a vector (and probably most of your basic classes) looks like – and for the sake of simplicity, it won’t be getting more complicated either.

class MyVector
{
public:
  MyVector();
  ~MyVector();

private:
  int       m_entryCount;
  void*     m_entries;
};

So we have the most basic implementation here, a pointer to the vector elements which references dynamic heap memory (this is the member that’s going to be causing us problems) and an integer indicating how many elements the vector currently contains.

As the implementation continues, the vector gains member functions to add elements, return references to them, remove them, increase or decrease the capacity and so on.

Copying The Vector
But as people continue to use the container, they become more comfortable with it until eventually they do the following:

MyVector masterVector;
MyVector localCopy;

// ... Add plenty of things to masterVector

// Copy a vector into our local copy so we
// have something to play with
localCopy = masterVector;

This all seems innocent enough – we are copying the one vector into another so we can use the contents and, we assume, modify them without affecting the original. After all, this is how built-in types work.

But because our basic vector implementation hasn’t defined an assignment operator it’s not going to do what we expect.

But if we haven’t defined one, surely the compiler will generate a couple of errors because it doesn’t know what to do? Unfortunately in this case, the compiler tries to help you out by creating a default assignment operator, which does nothing more than a shallow, member-by-member copy. Each member is simply assigned across, so for a pointer member it is the address that gets copied, not the memory it points to.

In effect it does nothing more than the following…

localCopy.m_entryCount = masterVector.m_entryCount;
localCopy.m_entries = masterVector.m_entries;

So when we do this, we end up with both m_entries pointers pointing to the exact same area of memory. And this is not a good thing. Especially if the user continues and does the following…

// Copy the vector into a local copy
// so we can alter it
localCopy = masterVector;

// Since the user would assume that
// they have an independent vector,
// they do what they want with it

// Let's clear the contents - which deletes
// all the allocated memory
localCopy.clear();

Since we have two objects pointing to the same area of memory, and one of them just called ‘delete’ on it, we are left in a very bad state. The master vector now references invalid memory and it still thinks it contains elements, because its m_entryCount hasn’t been affected by the change to the copied vector…
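The aliasing itself is easy to observe without triggering the crash. What follows is only a minimal sketch, not the real container: ShallowVector, SharesStorage and DemonstrateShallowCopy are hypothetical names, and I’ve assumed int* storage instead of void* to keep it short. It relies on the compiler-generated operator= and then checks whether both objects hold the same address.

```cpp
// Hypothetical stand-in for MyVector: no copy members are declared,
// so the compiler generates memberwise versions for us.
struct ShallowVector
{
  int  m_entryCount;
  int* m_entries;   // int* rather than void*, purely to keep the sketch short
};

// Returns true when both objects refer to the same heap block
bool SharesStorage(const ShallowVector& a, const ShallowVector& b)
{
  return a.m_entries == b.m_entries;
}

// Builds a master vector, assigns it to a copy and reports the aliasing
bool DemonstrateShallowCopy()
{
  ShallowVector masterVector = { 3, new int[3]() };
  ShallowVector localCopy = { 0, 0 };

  localCopy = masterVector;   // compiler-generated operator=

  const bool aliased = SharesStorage(masterVector, localCopy);

  // Free the block exactly once; deleting it through both
  // objects would be a double delete.
  delete[] masterVector.m_entries;
  return aliased;
}
```

Both objects end up holding the same pointer, so anything that frees the block through one of them leaves the other dangling.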

So while in some cases it’s handy for an assignment operator to be automatically generated, in this case it certainly isn’t. And we should make sure that an assignment operator is only ever generated when we explicitly require it.

So the first thing to do is to actually generate an error when someone tries to use ‘=’ unless we specifically request it. And the way to do this is through a private declaration as part of the class.

So we end up with the following

class MyVector
{
public:
  MyVector();
  ~MyVector();

private:
  // Declare a private operator= so nobody can use it
  MyVector& operator= (const MyVector& rhs);
};

By putting the operator= method into the private part of the class, and not defining it, we stop anyone from using it, with the compiler generating an easy to understand and quick error.
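If you have a C++11 (or later) compiler available, which is an assumption on my part as the rest of this post sticks to the older idiom, the same intent can be written with ‘= delete’. The sketch below puts the two forms side by side using hypothetical class names, and uses std::is_copy_assignable to confirm that both reject assignment from user code.

```cpp
#include <type_traits>

// The idiom used in this post: declared private, never defined
class PrivateAssignVector
{
public:
  PrivateAssignVector() {}

private:
  PrivateAssignVector& operator= (const PrivateAssignVector& rhs);
};

// C++11 alternative (assumes a C++11 compiler): explicitly deleted
class DeletedAssignVector
{
public:
  DeletedAssignVector() {}
  DeletedAssignVector& operator= (const DeletedAssignVector& rhs) = delete;
};

// Both forms reject 'a = b' in user code at compile time
static_assert(!std::is_copy_assignable<PrivateAssignVector>::value,
              "private operator= blocks assignment");
static_assert(!std::is_copy_assignable<DeletedAssignVector>::value,
              "deleted operator= blocks assignment");
```

One difference worth knowing: with the private, undefined version, code inside the class (or a friend) can still call operator= and will only fail at link time, whereas the deleted version is rejected by the compiler everywhere.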

Obviously in this case people want to be able to copy one vector to another, but now we have the ability to control how and when this happens. By simply moving the operator= method into the public space, and defining the function so it does a deep copy (in other words it copies the content of the allocated memory rather than just the pointer to the memory) we can properly copy one vector into another.

Users can now copy vectors at will, knowing that each one is independent of the one that came before it. And if we don’t want them to copy at all, we can easily block it and let the compiler tell them the bad news.
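As a rough idea of what a deep-copying operator= might look like, here is a minimal sketch. It is not the full container: DeepVector is a hypothetical name, it stores ints rather than a void* block, and add(), size() and at() are assumed helpers standing in for the member functions mentioned earlier.

```cpp
#include <algorithm>

// Sketch only: stores ints rather than the post's void* block
class DeepVector
{
public:
  DeepVector() : m_entryCount(0), m_entries(0) {}
  ~DeepVector() { delete[] m_entries; }

  // Deep copy: allocate a new block and copy the elements across
  DeepVector& operator= (const DeepVector& rhs)
  {
    if (this != &rhs)   // guard against self-assignment
    {
      int* copy = 0;
      if (rhs.m_entryCount > 0)
      {
        copy = new int[rhs.m_entryCount];
        std::copy(rhs.m_entries, rhs.m_entries + rhs.m_entryCount, copy);
      }
      delete[] m_entries;   // release the old block only once the copy exists
      m_entries = copy;
      m_entryCount = rhs.m_entryCount;
    }
    return *this;
  }

  // Hypothetical helper: grows the block by one and appends the value
  void add(int value)
  {
    int* grown = new int[m_entryCount + 1];
    std::copy(m_entries, m_entries + m_entryCount, grown);
    grown[m_entryCount] = value;
    delete[] m_entries;
    m_entries = grown;
    ++m_entryCount;
  }

  // Frees the block and resets the vector to empty
  void clear()
  {
    delete[] m_entries;
    m_entries = 0;
    m_entryCount = 0;
  }

  int size() const { return m_entryCount; }
  int at(int index) const { return m_entries[index]; }

private:
  // Copy construction stays blocked in this sketch (declared, not defined)
  DeepVector(const DeepVector& rhs);

  int  m_entryCount;
  int* m_entries;
};
```

With this in place, ‘localCopy = masterVector;’ followed by ‘localCopy.clear();’ leaves masterVector untouched. Allocating the new block before deleting the old one means a failed allocation leaves the left-hand vector in its original state.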

Creating Copies
So we can now control what happens with the vector when someone tries to copy it, but what happens in the following situation?

// Function that takes a vector of objects
void DoSomethingCool(MyVector listOfObjects)
{
  // ... Do some stuff with listOfObjects
  // ... Again, let's clear it
  listOfObjects.clear();
}

// Create our vector
MyVector objectList;

// ... Add plenty of objects to it

// Then pass it along
DoSomethingCool(objectList);

Now this might be a simple mistake. We might have wanted to pass by reference rather than value, but that’s irrelevant. By passing by value we have forced the compiler to automatically generate a copy constructor. And this comes with all the problems we had with an automatically generated assignment operator. Especially as the local function goes and clears the vector again…

So we have the exact same solution to the copy constructor problem as we had with the assignment operator…

class MyVector
{
public:
  MyVector();
  ~MyVector();

private:
  // Declare a private copy constructor so nobody can use it
  MyVector(const MyVector& rhs);
  MyVector& operator= (const MyVector& rhs);
};

So again we can make the pass by value safe by either blocking the creation by keeping it private or by moving the copy constructor into the public space and defining a deep copy, making sure that again our vectors are independent.
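To make the deep-copy option concrete, here is a minimal sketch of a public copy constructor and the pass-by-value call it protects. As before, this is an illustration under assumptions: CopyableVector is a hypothetical name, it stores ints instead of a void* block, add() and size() are assumed helpers, and DoSomethingCool returns a count only so the effect can be checked.

```cpp
#include <algorithm>

// Sketch only: stores ints rather than the post's void* block
class CopyableVector
{
public:
  CopyableVector() : m_entryCount(0), m_entries(0) {}

  // Deep-copying copy constructor: the new object gets its own block
  CopyableVector(const CopyableVector& rhs)
    : m_entryCount(rhs.m_entryCount),
      m_entries(rhs.m_entryCount > 0 ? new int[rhs.m_entryCount] : 0)
  {
    if (m_entryCount > 0)
    {
      std::copy(rhs.m_entries, rhs.m_entries + rhs.m_entryCount, m_entries);
    }
  }

  ~CopyableVector() { delete[] m_entries; }

  // Hypothetical helper: grows the block by one and appends the value
  void add(int value)
  {
    int* grown = new int[m_entryCount + 1];
    std::copy(m_entries, m_entries + m_entryCount, grown);
    grown[m_entryCount] = value;
    delete[] m_entries;
    m_entries = grown;
    ++m_entryCount;
  }

  // Frees the block and resets the vector to empty
  void clear()
  {
    delete[] m_entries;
    m_entries = 0;
    m_entryCount = 0;
  }

  int size() const { return m_entryCount; }

private:
  // Assignment stays blocked in this sketch (declared, not defined)
  CopyableVector& operator= (const CopyableVector& rhs);

  int  m_entryCount;
  int* m_entries;
};

// Pass by value: listOfObjects is built with the copy constructor
int DoSomethingCool(CopyableVector listOfObjects)
{
  const int seen = listOfObjects.size();
  listOfObjects.clear();   // only the function's own copy is cleared
  return seen;
}
```

Because the parameter owns its own block, clearing it inside the function has no effect on the vector that was passed in.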

You can also use the copy constructor to generate new objects at the point of creation, by calling the copy constructor explicitly.

MyVector masterVector;
// Add stuff to the master vector

// Create a new one using the original
MyVector copiedVector(masterVector);

By using the copy constructor explicitly, you may be able to reduce a more expensive call to the default constructor followed by an assignment operator into a single, more efficient call.

Creation or Assignment?
The following might not do what you think it does

MyVector masterVector;
// Add stuff to the master vector

// Create a new one using the original
MyVector copiedVector = masterVector;

Does this call the assignment operator, or the copy constructor, or something else?!?

It’s using the ‘=’ operator, but the important part of the question is that we are using it at the point of creation, so, maybe surprisingly, the copy constructor is called in this case. I’ll look at ways that assignment/creation operations can be made more explicit in a later post.

This is worth remembering as calling the default constructor followed by the assignment operator might be more expensive than simply calling the copy constructor.
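If you want to convince yourself which member runs, a small tracer type makes it visible. This is a hypothetical helper, not part of the vector: Tracer simply counts how often its copy constructor and assignment operator are called, and the variable names mirror the ones used in this post.

```cpp
// Hypothetical tracer type: counts which special member runs
struct Tracer
{
  static int copyConstructions;
  static int assignments;

  Tracer() {}
  Tracer(const Tracer&) { ++copyConstructions; }
  Tracer& operator= (const Tracer&) { ++assignments; return *this; }
};

int Tracer::copyConstructions = 0;
int Tracer::assignments = 0;

// Runs the three forms discussed above against the tracer
void RunTrace()
{
  Tracer masterVector;
  Tracer createdVector1(masterVector);    // copy constructor
  Tracer createdVector2 = masterVector;   // copy constructor, despite the '='
  createdVector1 = createdVector2;        // assignment operator
}
```

After one call to RunTrace() the counters show two copy constructions and a single assignment.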

Summary

  • The assignment operator and copy constructor are used at different times in an object’s lifecycle.
  • MyVector createdVector1(masterVector);  // Uses the copy constructor
    MyVector createdVector2 = masterVector;  // Uses the copy constructor
    createdVector1 = createdVector2; // Uses the assignment operator
  • If you need an assignment operator, or a copy constructor, then you will more than likely need the other. Even if you think you might not, you need to make your class safe for the end user, so adding them both is the best way forward.
  • You should create a private copy constructor and assignment operator by default, only removing them when you need a shallow copy, or defining them if you want a deep copy. So by default all your classes should start life looking like the following (you’re more likely to forget to add them when you need them than to forget to remove them when you don’t)
  • class DefaultClass
    {
    public:
      DefaultClass();
      ~DefaultClass();
    
    private:
      // Declare private copy constructor and assignment operator,
      // only removing them when we actually need them...
      DefaultClass(const DefaultClass& rhs);
      DefaultClass& operator= (const DefaultClass& rhs);
    };
  • If you only need a shallow copy, then you can remove the private declarations since the compiler can do all the work for you. But only remove them if you explicitly want the objects to be copied.
  • All this relates to structures as well as classes.  But if you are looking at assignment operators and copy constructors for one of your structs, you might be better off creating a class rather than a struct.

The information here only scratches the surface of how to best use assignment operators and copy constructors, but you can easily investigate how to use these and other automatically generated functions to make your objects more efficient and, most importantly, safe for the users. Without knowing how and when these class members are used, and how to take advantage of them, you can’t confidently say that the objects you are creating are safe to use in the situations they will be used in.

I’m going to look at a couple of other well hidden features of C++ in future posts which will complement the use of assignment operators and copy constructors quite nicely.


1 thought on “Be Explicit – Assignment Operators & Copy Constructors”

  1. I am surprised by the number of problems I have seen in code due to the issue you detail.
    Personally, when compiling code I crank the warning levels up, which would produce two warnings from your pseudo code due to the unreferenced parameters in the assignment operator and copy constructor. For this reason I do one of two things:
    Do not include the parameter, only its type.
    Or include the parameter, but commented out.
