C++, FTL, Programming

Type Safe Flags in C++

Using bits and shifting enums is a pretty standard way of storing and reading flags. It’s quick, it’s compact and the syntax isn’t to difficult to pick up for people who haven’t seen it before. But since a bit is only a bit and an enum is only an enum it’s far to easy to make mistakes with these. There is no type checking when you accidentally use the wrong enum and passing an unsigned int around loses all relevance to what these ‘flags’ actually are.

So I wanted to look for a solution that not only stopped us accidentally using the wrong flag with the wrong variable (and allow the compiler to let us know if it happens) but to also to remove the loss of relevance when the flags were passed between different systems.

The Problem
Anyone who’s taken Programming 101 will know what I’m talking about.

enum CharacterBattleProperties
{
  HasTarget = (1<<0),
  CarryingWeapon = (1<<1),
};

enum CharacterState
{
  IsStunned = (1<<0),
  IsPoisoned = (1<<1),
};

unsigned int battleFlags;
unsigned int stateFlags;

// Set or remove some flags based on whats happening
// Character has just lost sight of the player
battleFlags &= ~HasTarget;

// We've just picked up a weapon
battleFlags |= CarryingWeapon;

// ...

// Some more code, some more function calls and
// now we have a player who is stunned
battleFlags |= IsStunned; 

// Whoops, I just set the wrong flag on the wrong variable...

The First Solution
While investigating a couple of solutions I came across a post on Game Architect titled Evolution of a Flag Set Class which examined a lot of the questions I was asking myself. And it took it pretty far, coming up with a set of solutions that more-or-less did what I was looking for, but I wanted something that would be easier to scale and specialise based on the number of flags we had and not one based on an integer as the smallest possible storage unit.

So in trying to keep it as simple as possible the flag set initially ended up with a very simple interface

// Stripping it down to it's most basic behaviour
template < typename TEnum, typename TBaseType = unsigned char >
class flag_set
{
public:
    void set(const TEnum flag)
    {
      m_flags |= (1 << flag);
    }

    void remove(const TEnum flag)
    {
      m_flags &= ~(1 << flag);
    }

private:
    TBaseType m_flags;
};

You’ll notice a couple of things with this implementation (on top of ‘remove’ being highlighted in pink for some probably awesome reason)

  • The first is the default value of the second template parameter is set to an unsigned char, meaning that in the majority of cases the variable can be declared simply with the enum type but it’s easy to increase as needed (though it obviously has limits).
  • The second is that the object itself deals with the shifting of the enum value. This was a conscious decision to remove the idea that when setting a flag you are not ‘setting a bit’ rather you’re setting a property, something I wanted the interface to promote.

And this works pretty well, but contradicts something quite important.

I wanted the flag_set to move away from ‘setting bits’ so why do we have to specify a base type which makes us think about how many bits we need. And since we won’t know how many enum entries we have until we start to set them at runtime, we could end up with a flag_set that doesn’t have enough space to store all our flags and the compiler won’t be able to tell us about it (in real life it asserts when you try to use a flag outside the range of the storage type, but that’s not good enough).

A More Appropriate Solution
So I wanted to look at working with the number of flags, and work on making it have as small a footprint as possible. So taking a step back I started out with the interface I wanted.

template 
class flag_set

Where people could define them as follows

// This will assume there are 8 or fewer flags
flag_set myDefaultSizeFlagSet;  

// This will create enough space for MyFlags::Noof flags
flag_set myDeclaredFlagSet;

So we need a way to create enough storage space for the number of flags we have as compile time and since the base type of a bitset in the FTL is a 32bit unsigned int that’s not a good enough solution. There is no point storing 3 flags in an integer when it could drop into a char. If multiple flag sets were packed together, the smaller the storage space the better.

Creating enough storage space is pretty simple, we store an array of unsigned chars based on how many flags we have.

// Generate the array length based on an 8-bit type
unsigned char m_flags[ ((( TMaxFlags + 7 ) & ~7 ) >> 3) ];

// Figure out which bit to set/remove when needed 
// I'm not worrying about endian code in this example...
const int index = ((( (entry+1) + 7 ) & ~7 ) >> 3)-1;
const int offset = entry - (8*index);

// Set the bit
m_flags[index] |= 1 << offset;

Optimising For Certain Types
So we now have a flag set which allows us to have as many flags as we could possible want (and yes I have seen code with more than 128 ‘flags’ in the past – don’t ask me to justify that, I have no idea…) but with a slight problem. In the original version there were no arrays, no calculations when figuring out which bit to set and no loops to check it’s state.

All of this over complicates what could be just ‘one bit being set on one variable’.

We can easily get around this by specialising the flag set for any types of 8, 16 or 32 bit flags. But I didn’t want to specialise the whole flag_set, especially when we might continue this specialisation up to 64 and 128 bit flags – and specialising a whole class just becomes a nightmare to maintain.

So we extract the storage type into its own structure and specialise that based on the number of flags. This allows the general flag_set behaviour to be defined within the one class, and the length based behaviour to be extracted into it’s own implementation.

// Just a note that the following code is simplified to make it 
// easier to read on a blog post, but the general gist is the same.

template  
struct bit_set
{
  enum
  {
    ArrayLength = ((( TBitCount + 7 ) & ~7 ) >> 3),
  }
  unsigned char m_bitArray[ArrayLength];
};

template <>
struct bit_set<8>
{
  enum
  {
    ArrayLength = 1,
  }
  unsigned char m_bitArray;
};

Then we simply replace our array member in the flag set with an instance of the bit_set (where we calculate the bit_set size as a power of 8 based on the number of flags we specify in the flag_set).

Since the flag_set doesn’t want to worry about what kind of data it’s using, we can replace the in-class implementation with a number of utility functions which not only reduces the amount of code duplication but, when specialised for the same reasons as the bit_set, allows us to have the same interface and to use the same class with only a small number of functions specialised as needed.

// Basic, none specialised function for setting 
// a flag as called from flag_set::set
template 
void set_bit_entry(ContainerT& flags, const uint entry)
{
  // Get the entry point to the bit list
  const int index = ((( (entry+1) + 7 ) & ~7 ) >> 3)-1;
  const int offset = entry - (8*index);
 
  // Set the bit
  flags.m_bitArray[index] |= 1 << offset;
}

// Specilised for a bit count of 8 - no arrays, no additional calculations
template
void set_bit_entry< bit_set >(bit_set& flags, const uint entry)
{
  flags.m_bitArray |= 1 << entry;
}
// So our basic none-specialised flag set looks something like this
// (code simplified to make it a bit more presentable)
template 
class flag_set
{
public:
    void set(const TEnum flag)
    {
      set_bit_entry(m_flags.m_bitArray, flag, true)
    }

    void remove(const TEnum flag)
    {
      set_bit_entry(m_flags.m_bitArray, flag, false)
    }

private:
    static const uint BitSetSize = ((( TMaxFlags + 7 ) & ~7 ) >> 3)*8;
    bit_set m_flags;
};

This method makes is very easy to specialise the flag set for flags that can be stored in a built in type and falling back to an array if the size doesn’t fit into one type. If we want to specialise for 16 or 32 bit flags, then we simply create specialised versions of our utility functions making the flag_set implementation the same no matter how much additional work is being done behind the scenes.

Final Outcome
So we now have a flag set that meets the issues I wanted to resolve head on. Passing a structure with strong type behaviour stops programmers accidentally using the wrong flag with the wrong set (and the compiler’s the one that stops them) and passing a named structure to a function keeps the meaning and reason behind the flags intact.

The code obviously increases in complexity over a simple bitwise operation but from a client point of view, the readability of the client code becomes easier. Calls to .set, .remove and .clear make it very clear what the intention of the code is and makes it easier to avoid simple keyboard mistakes.

It also makes it immeasurably easier to add the debugging features to the flag set, making it easier to see what is set and when. But this post has been long enough and I’ll look at that in my next post.

10 thoughts on “Type Safe Flags in C++”

  1. Very nice post, thanks! It would probably get clearer exactly how easy the code is to use, if you showed an example of the code being used 🙂

    I do think this approach should be weighed against the transparency of a bit-set, also considering that your approach will generate code for every type of flag in your project – though admittedly not much.

    A different approach could be using a union. In C++0x you have strongly typed enums, which allows you to do unions of (unsigned int) enums, thus allowing you to use them as bits. Actually you can do this already, though you may get in trouble with the sign:

    enum eDays { MONDAY=0, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, BEERDAY, HANGOVERDAY };
    union {
    unsigned char flags;
    struct {
    eDays days : 4;
    unsigned char padding : 4;
    };
    };

  2. The use of unions is something that I’m going to compare in a follow up article on how using this flag set makes it easier to debug and check the contents of flags that your currently using. They certainly make it easer to storage usage information when passing the flags around, but they (at the moment) suffer from just as much a lack of type safety as using base types alone.

    I think some of the features coming through in the next C++ standards release might make some elements of the flag set redundant, but as I’ve not had much exposure to them (and some of the compilers I use tend to lag behind by a few years) I doubt I’ll be able to experiment with it any time soon.

    And good point on an actual full working example of the flag set. That would certainly help, so I’ll look at putting up a short set of examples in the next few days which might expand on anything that’s not to clear.

  3. (Reposted from Gamasutra…)

    AHHHHHHHHH!!!! Over design!!!!

    How about:

    enum SomeFlag
    {
    SF_A = 0x01,
    SF_B = 0x02,
    SF_C = 0x04
    }

    SomeFlag operator | (SomeFlag a, SomeFlag b)
    {
    return SomeFlag(int(a) | int(b));
    }

    SomeFlag operator & (SomeFlag a, SomeFlag b)
    {
    return SomeFlag(int(a) & int(b));
    }

    SomeFlag operator ^ (SomeFlag a, SomeFlag b)
    {
    return SomeFlag(int(a) ^ int(b));
    }

    SomeFlag operator ~ (SomeFlag a)
    {
    return SomeFlag(int(a));
    }

    Exmple:

    SomeFlag myFlag = SF_A;
    myFlag = myFlag | SF_B;

    Typesafe, same semantic as ints an no code overhead. If you inline the operators you even get 0 call overhead.

  4. Hi Sean, thanks for the reply.

    Depending on the scope of the problem you’re trying to solve, you’re correct that the scope of the solution could be scaled back. The solution you noted provides some level of type safety, which in some cases may be enough and all that’s required.

    There is still the issue of implicit conversion to int, which can cause problems with function calls and state checks, but if that isn’t an issue then your solution certainly solves the base problem.

    I’m assuming you would wrap the solution up in it’s own class (to avoid writing out the operator overloads for each flag type – though you could write namespace scoped template functions but I think that would restrict the declarations of the enums itself).

    If this was wrapped up in it’s own class then interestingly enough you’d have, concept wise, exactly what I originally implemented (see ‘The First Solution’) though I removed the idea of shifting and setting bits and placed the responsibility of that inside the class (because I explicitly wanted to move away from integer like behaviour as referenced in your reply). I also added the ability to store flags within a defined type, removing the (possibly incorrect) assumption that all combined enumeration values could be stored within an integer.

    At this point our implementations are pretty much the same (depending on scope), but the over-design which you are referring to relates to an additional feature that I desired, though in your case you might not. The ability to pack the storage of the flags into as small a space as possible without the user needing to actually think about it.

    It’s interesting because looking at the code, there is actually not much there, the descriptions and the path that led to the various decisions made are what makes the design of the system seem larger than it actually is (and since the path to the solution was the point of the post I can live with that). And to the end user, the amount of code required to use flag_set is very minimal, something that was important from the outset.

    So the method you suggest certainly provides some of the solution I was looking for, even more so if it was scoped within a strongly typed class, but since we started from there originally it didn’t go as far as we needed, which is why the design of the system goes further than the solution you suggested.

    Again, thanks for taking the time to read the post, and reply to it. It’s the discussion about code and possible implementations that’s often more interesting that the code itself.

    (I posted this on Gamasutra and my blog, as I don’t know which you would read first).

  5. An interesting article and I can see how it would help during compilation – however, how would you access the flags to do a simple test, e.g. if(FLAGS & FLAGONESET) etc.

    From what I can see you would have to write an accessor that would have to calculate the index of the flag and do the test – is that an acceptable overhead for such a simple test?

  6. Hi Mike. You’re right in that there is an explicit function call to check if a flag is set, which mainly has it’s overhead felt in debug builds when inlining is disabled. But when inlined the function overhead is practically reduced to the size of the simple bit check anyway, so where it really matters, the over-head isn’t an issue.

    As an aside, we could have over ridden operator& as a non-member friend function, but decided against this as it goes back to ‘setting and checking’ bits which I wanted to avoid.

    As an aside, I just posted up some example code so you can actually see flags being set, removed and checked using the flag set interface.

  7. Lee that is an interesting and safe solution.
    Couple of really small nitpicks.
    flag_set class methods uses set_bit_entry with a last bool param which is not present in the function.
    static const uint flag_set::BitSetSize could cause problems as it is still required by the standard to have a none inline declaration available IIRC, instead an enum could be used.

  8. Well spotted on the missing parameter in set_bit_entry. The version I use in production does have that additional parameter, to remove the need for a set_bit_entry and a remove_bit_entry function, I simply removed it from the definition of set_bit_entry above to make it easier to read on the blog.

    Regarding flag_set::BitSetSize, I have a follow up post to this regarding the debugging of the flag_set which I’ll hopefully post in the next few days. In that, the BitSetSize has moved into it’s own structure (along with a couple of other properties) and it’s stored as an enum to take advantage of all the benefits that brings.

    Thanks for the comment 🙂

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s