FTL

Anatomy of a Smart Pointer

I recently implemented smart pointers into the FTL and was looking for a similar set to Boost’s own due to their wide spread use.  Obviously there are a few ways in which smart pointers can be implemented and I thought it would be interesting to document the reasons why I followed a particular route rather than the most obvious (and some would say simplest) method.

Due to the complexities of shared pointers (shared_ptr and weak_ptr etc.) it made sense to start with simply non-copyable pointers which would allow me to nail down the design process before moving on so I ended up with the following specification (obviously based on Boost’s specification)

simple_ptr – A non-copyable pointer with no garbage collection
simple_array – A non-copyable pointer to array with no garbage collection
scoped_ptr – A non-copyable pointer with automatic garbage collection when the pointer leaves scope
scoped_array – A non-copyable pointer to array with automatic garbage collection when the pointer leaves scope

I did originally debate the need for ‘array’ versions of these pointers instead hoping people would prefer the FTL vector or array, which would obviously be much safer than standard C arrays.  But very often people are used to working with memory and need to work at the lowest level possible.  Not giving the _array option will just limit the use of the smart pointers, which will turn people away from using them and generally lead to more unsafe code.

 

Boost Style Pointers
The initial approach was to look at ‘Boost Style’ pointers which in other words are individual, well thought-out classes build for a specific purpose.  Each class is defined and created manually, taking an individual template which specifies the type contained within the pointer.

ftl::ptr::simple_ptr<int> aSimplePointer; 
ftl::ptr::scope_ptr<int> aScopedPointer;

The good thing about this approach is that the template class is as simple as they come, generally only containing one type member and a couple of access functions.  Any other code will act as if this class wasn’t templated and is very easy to read and understand.  Also related to the simplicity of the template is that it is pretty much guaranteed to compile on any platforms that have template support (which probably covers 99.9% of mature C++ compilers and certainly the ones we work with every day).

Unfortunately its simplicity is also its main disadvantage.  Each class (unless you want to use an inheritance approach) is a self contained unit, which means there is a lot of duplication between the different classes.  Operator overloads, memory management etc. will be the same in multiple classes and while the classes can be quite small it does mean fixes need to be made more than once which can lead to copy-and-paste errors.  Unit Testing is also constantly duplicated as you have to test the same code in different classes again and again, doubling the number of changes you need to make for even a simple fix.

It also means that if someone wants to extend a pointer for a specific case they will have to manually create their pointer specification due to the base pointers containing no virtual functions and not supporting inheritance.

It was due to these problems that made me move away from this method.  The rigid structure that they bring to the table and the way projects cannot easily create their own versions lead me to think it would cause frustration when doing anything other than using them as simple pointers. 

 

Template Method Specialisation
To try and limit the amount of duplicated code I looked at how some of the functions that were different could be wrapped up in the same class.  The simplest way to do this would be to use a take on the Template Method Pattern which is designed to provide a class that has 99% the same behaviour but has some elements that needs to behave slightly different.

Using this would allow me to wrap up the _ptr and _array versions of a pointer into the same class and halving the amount of duplication needed to have multiple types.

For example
template<typename T, bool Array = false>
class simple_ptr_type
{
public:
  ~simple_ptr_type(void)
  {
    deallocate_memory<Array>();
  }
private:
  template<bool C>
  void deallocate_memory()
  {
    delete m_ptr;
  }
  template<>
  void deallocate_memory<true>()
  {
    delete [] m_ptr;
  }
};

ftl::ptr::simple_ptr_type<int> aSimplePointer;
ftl::ptr::simple_ptr_type<int, true> aSimpleArray;

This would be an elegant solution which gives the class a bit more flexibility though care would have to be taken to pass in the correct type when declaring them.  We could make this clearer by defining an enum and using this as the templated type

enum EPointerType
{
  EPointerType_Ptr,
  EPointerType_Array,
};

and taking that further, creating a typedef for each pointer type which makes it even clearer (see the next section for problems with this…)

typedef simple_ptr_type<T, EPointerType_Ptr>   simple_ptr;
typedef simple_ptr_type<T, EPointerType_Array> simple_array;

So far so good and this cut down on the main problem we hit when using the initial implementation.  The second template could be quite misleading and requires people to investigate the class to figure out exactly what it’s for but that’s not exactly a serious drawback due to the solutions I’ve already covered.

The main problem comes from the additional template type that is used to decide which method to call.  On some platforms the compiler was smart enough to only compile what was used for each templated type.  On others it compiled all the code regardless, which means that if a particular branch was illegal with a given templated type, it would still cause compiler errors even if it was never used.

Even with those problems I did see the advantages to this but also that it could lead down the following path
template<typename T, bool Array = false, bool Scoped = false, …> class simple_ptr_type

So while having the pointer customisable with various templated types could have it’s uses, it would start to become quite unmanageable (and confusing) and in the end not actually all that extendible since you could only switch it one way or the other.

 

Policy Based Ideas
It was around this time that I finally received a copy of Modern C++ Design by Andrei Alexandrescu.  In one section he discussed a policy design approach to programming where-as the templated types actually provide specific behaviour within a fixed scope.  This started to ring bells with what I was moving towards during the previous implementation where the templates would allow me to customise a ‘basic’ pointer type but also allow others to create the pointers they needed.

So I quickly moved forwards and ended up with something similar to the following

template< typename T,
          class StoragePolicy,
          class DeallocationPolicy,
          class DestructionPolicy,
          class CheckingPolicy,
          class ConversionPolicy>
class smart_pointer_noncopyable
{
};

Now obviously this is a lot of templates and a lot of work to understand – especially when you come to it cold. So it made sense to provide various objects that would come as stock polices that people could put together to create a range of smart pointers suited to their needs.  These policies would also provide the default interface that others could build from.

So as an example of how these policies can be used in the smart pointer object

const_reference operator*(void) const
{
  // This allows the user to provide
  // additional checks when dereferencing
  // and then return the pointer type as
  // a reference (you may be using multiple
  // pointer types which need to be manually
  // checked before parts of it are returned)
  CheckingPolicy::on_dereference( m_storageValue );
  return StoragePolicy::as_reference_type(m_storageValue);
}

~smart_pointer_noncopyable(void)
{
  // Here the deallocation policy can simply
  // call ‘delete’ or ‘delete []’ or maybe does
  // something more extensive with heap pools etc.
  DeallocationPolicy::deallocate_memory( m_storageValue );
}

It was a conscious decision to make the individual polices static member functions rather than internal or inherited objects to reduce the size of the pointer.  If it was to use a component based system where the pointer contained individual objects then the size of the pointer could bloat quite quickly and it would also lead towards inheritance based solutions to other pointer types which is something I wanted to avoid.

You’ll also notice that the pointer type used in the example has the post-fix ‘_noncopyable’ which obviously leads to another type called ‘smart_pointer_copyable’.  It would have been ideal to have a single pointer template and the policies themselves dictate which can be copied and which cannot (which is the whole idea after all!).  This is something I initially looked into where the pointer object’s copy constructor and =operator used polices themselves and these denied copying or asserted when they were used.  Unfortunately this meant that we couldn’t use a compile time check (we would have to wait until the operator was actually used) for the same reasons the template method couldn’t be used (some compilers compiling the whole file regardless of use). So while this still leads to a bit of code duplication between the copy and non-copy variants it still limits it to the structure rather than the actual implementation of the polices.

Obviously this is all good and well, but it would be a colossal pain if we had to manually list all the policies (even if we provided defaults it would still be difficult to change the odd policy).  Typedef’s should be the answer to this so we can manually provide the template type, but the template provides the rest of the policies.

For example (namespaces removed to make it readable on here):
template<typename T>
typedef smart_ptr<T>  simple_pointer;
template<typename T>
typedef smart_ptr<T, policy::DeleteArray>  simple_array;

Which would just be

ftl::ptr::simple_ptr<int> aSimplePointer;
ftl::ptr::simple_array<int> aSimpleArray;

Unfortunately, templated typedefs are not part of the C++ standard (but hopefully this feature will be added sometime in the future), so the only other option was to do the following…

template <typename T>
struct simple_ptr
{
  typedef ftl::ptr::smart_ptr<
                              T
                             > ptr;
};

template <typename T>
struct simple_array
{
  typedef ftl::ptr::smart_ptr<
                              T,
                              policy::DeleteArray
                             > ptr;
};

Which finally leads to

ftl::ptr::simple_ptr<int>::ptr aSimplePointer;
ftl::ptr::simple_array<int>::ptr aSimpleArray;

I really cannot take credit for that little gem and it was a relief when I found a solution otherwise this was yet another implementation that would have been so close but just not good enough. 

So finally we have a nice way of creating basic smart pointer types that allows any other user to customise their pointers to make them ideal for whatever situation they find themselves in.  While the static policy based approach can limit what individual pointers can do it still provides a wide range of extensibility without over-bloating the pointer.

11 thoughts on “Anatomy of a Smart Pointer”

  1. also StoragePolicy, DeallocationPolicy and DestructionPolicy don’t look orthogonal to me

    allocation and deallocation are coupled with storage

    and if DeallocationPolicy::deallocate_memory possibly calls “delete” which consequently calls the object destructor, then what is DestructionPolicy ?

  2. Hi Gregory, thanks for the comments and sorry it has taken so long to reply. I have been out of the country for the past month and I’m slowly getting back on top of everything!

    Regarding the static assertion check. The reason I cannot use it within the template pointer is because some of the compilers we use will parse and compile the entire template file, regardless of whether parts of it are used. This means that, unlike other compilers which will only compile the template’s that are used, the compile error will be triggered 100% of the time, whether the copy constructor is used or not.

    I have a couple of ideas to get around this so I can reduce the base pointer class to one type, but I haven’t had the time to try anything out yet. I’ll write a blog post on what I try and whether it worked or not when I get around to it.

    As for the other point, you are correct in your assumption that some of the policies are not true, stand-alone policies and that some will have cross over (maybe some will come as pairs and can only be used together). This can easily be reduced to a single class by having the same policy object used as both the Storage and Deallocation policy with the required functions for both present in the single class. It might be relevant to eventually reduce the template types into just one Storage policy (in effect losing the Deallocation policy) which is designed to deal with both storage and deallocation, but I didn’t want to restrict the end user at the beginning, rather wait to see how people use the smart pointer and build on their experiences.

    The use of the destruction policy is subtly different from the deallocation policy. The deallocation policy regards the deallocation of the raw data pointed to by the smart pointer (which would be called on the destruction of the smart pointer itself but also in other places) but the destruction policy is only concerned with the destruction of the smart pointer itself. Currently I don’t have any use for this policy in any of the smart pointers I have implemented, but as with the other point, giving the end user a hook into this stage of use could be useful, and it will be down to how people use this as to whether this policy is useful or simply redundant.

    The policy based smart pointer is designed to evolve over time, and I’m more than aware that people will come up with ideas on how to use it that I haven’t even thought about yet – that is the basic beauty of the policy system to start with. It’s always been the plan to evolve this as people use it and and see where it’s limits lie, or maybe where it is needlessly flexible.

    Thanks 🙂

  3. I’m curious about the esoteric compilers you’re referring to

    anyway, I realized that the problem with static assertions consists in the error being spotted where the assertion is tested, not at the line where the copy construction is attempted: I managed to correctly trigger compile time errors with meaningful static assertion messages:
    Test.cpp:71: error: ‘STATIC_ASSERTION_FAILED_AT_LINE_71_this_type_is_non_copyable’ has incomplete type

    but Test.cpp:71 is the line at which the static test is performed, inside the copy constructor of the type 🙂

    about StoragePolicy and DeallocationPolicy: I don’t think they have cross over, I think they are really totally coupled. Maybe having a StoragePolicy which can itself be customized by “sub policies” is the way to go ?

    about DestructionPolicy, thank you for the explanation.
    I’m really curious about the results of hooking at smart pointer destruction stage: I’m using smart pointers as value types copied from places to places so they are created and destroyed a lot through the whole application 🙂

    best,
    g.

  4. trying harder, I managed to get this:

    Test.cpp:73: error: ‘Foo::Foo(const Foo&)::__static_assertion_at_line_73::STATIC_ASSERTION_FAILED_AT_LINE_73_this_type_is_non_copyable’ has incomplete type

    where Foo is defined as
    template
    class Foo
    {
    public:
    explicit Foo(): t(0) {}
    Foo(const Foo& rhs)
    : t(rhs.t)
    {
    STATIC_ASSERT(copyable, this_type_is_non_copyable);
    }

    ~Foo() {}

    T t;

    };

    I would declare this result as good enough

  5. From the look of your results your static assert is probably quite similar to ours. It’s possible to get enough information out of them and as long as you accept that it’s not going to be perfect it is going to be useful.

    As for the destruction policy, in your case there probably isn’t much use for one. Pointers that are created/destroyed quite regularly wouldn’t need this hook, but there may be situations where more widely used pointers, ones that are rarely destroyed (or only destroyed in certain situations) need to know, or act, when it happens. Even if it isn’t used, at least the option is there 🙂

    Sub policies is an interesting idea, the problem I would have there is the sheer complexity of the system if it was taken to any kind of level where it was useful. Policy systems can already be hard to follow, and adding additional levels might make the system more flexible, but possibly a nightmare to actually use.

  6. Hello Lee

    Thank you for the follow up. Indeed additional level of policies makes the system look like it’s more complicated — I wish I was having real template typedefs for ages 🙂

    PS: I’m sorry the code I pasted broke the layout of the site, I thought non HTML characters like < or &gt, would be escaped automatically.

  7. Hello Lee,

    One year has passed. Out of curiosity, are you still satisfied with your design or did you make changes to it?

    Also, I came across http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1681.pdf which avoids multiple inheritance but in his book Alexei Alexandrescu states that smart pointers shouldn’t have member functions. So I don’t really see the point of having policies being able to contribute to the public interface anyway 🙂

    Best,
    Gregory

  8. Hi Gregory

    Thanks for the link. I’m not sure I see personally see policy based smart pointers ever making their way into the standard as I think the scope is simply to wide but it’s an interesting document.

    Our internal policy based smart pointer structure stayed the same until recently when I took some time to have a look at how the smart pointers were being used and how the policies interacted with each other.

    Interestingly the static assert is now working – whether this was due to an upgraded compiler (which we have) or a slightly subtle way in which it’s been implemented I’m not yet sure – but it does mean we no longer have _uncopyable and _copyable base types which is a real bonus.

    This means we have a new ‘CopyPolicy’ policy which simply states if the pointer can be copied or not using a simple enum

    ftl::assert::static_assert::check(“Unable to copy a smart pointer without a valid copy policy”);

    The error message this generates is pretty insane but reading the code is much easier.

    It also means we can have static asserts when performing implicit conversions which is covered in the ‘ConversionPolicy’ which drastically improves the use of the pointer.

    I left the deallocation and destruction policy options to give additional flexibility which is used – deallocation being used when the pointer is replaced and destruction being used when the smart pointer is destroyed. Some implementations have the same destruction and deallocation policies but some don’t.

    The biggest change has been the removal of the storage policy. This was an interesting decision and one that I took knowing that it would reduce the flexibility of the smart pointer quite dramatically, but by doing so the readability and ease of debugging of the code became substantially easier.

    It meant that we do only deal with single pointer types (with more complex storage being carried out in other policy pointers) and also removed the need for the checking policy since we accepted that NULL was always bad on access and not bad when setting the pointer. I have been tempted to keep the checking policy in to allow flexibility on checking for NULL (or other values) when the pointer was changed, accessed or destroyed, but so far I’ve decided against that to keep the policy count down.

    So as I expected the general format of the smart pointer is the same (and we’ve had quite a few pointer implementations based on this that shows it’s flexibility) but as I also expected it needs tinkering with occasionally to make sure it’s staying stream-lined and importantly keeping it easy to use.

    I’m definitely of the opinion that it’s still the right way to go, but you have to be very careful to not ‘over-policy’ the whole thing making it so hard (or clunky) to extend that no-one bothers.

  9. About static asserts, the key is to make them depend on the template parameter:

    STATIC_ASSERT(sizeof(T) < 0, "copy is disallowed");

    instead of

    STATIC_ASSERT(false, "copy is disallowed");

    Anyway, more and more, these days ,I find it tempting to embrace Boost's shared_ptr which for instance features atomic refcount management.

    I like to reinvent my own wheel, even more when it's for my pet projects, but since time flies, as a pilot maybe it's wiser to join STLPort and Boost's fleet 🙂

  10. I believe that was the difference with the newer version of the smart pointer and assertion on use in the copy constructor/operator. It’s drastically improved the way in which people control the behaviour of the pointers for the better.

    I would generally agree with you. For personal projects, where code size and performance isn’t as important as at work, I would dip into Boost or a cross platform STL implementation without a doubt so I can skip to other areas I tend not to have the opportunity to concentrate on at work.reasons.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s