C++ fast dynamic type/subtype check

1.7k Views Asked by At

As the title suggests, I am looking for a fast way of runtime typechecking. To illustrate my problem, imagine you have a class hierarchy like the following:

      Base
     /     \
    A       D
   / \     / \
  C   B   F   E
       \ /
        G

My program holds all instances of any class in a single list as Base_ptr because all these classes share common tasks. Now at some point some derived classes will need to know about the existence of an instance of another class. So far so good, I know about dynamic_cast and the typeid()-operator, but both have some mayor drawbacks:

  1. dynamic_cast consumes a lot of processing time if the types are incompatible (e.g. try to cast instances of E to C)
  2. typeid() does not work in "isTypeOrSubtype"-cases, e.g. you need all instances of D or derived from D (so Es, Fs and Gs as well)

The ideal solution would be some kind of "isTypeOrSubtype"-test and only casting, if this test returns successfully. I got an own approach with some macro definitions and precalculated classname hashes, but it is very ugly and hardly maintainable. So I am looking for a cleaner and faster way of dynamic type and subtype checking that can check far more than 20million times per second.

4

There are 4 best solutions below

9
On

If your program knows about all the sub types that will be tested against, you can use a virtual interface that returns a pointer to the sub type. As noted by downvotes and comments, this is not the most flexible approach, since it requires the base class have knowledge of all the derived classes. However, it is very fast. So there is a trade off of flexibility to performance.

class Base {
    //...
    virtual A * is_A () { return 0; }
    virtual B * is_B () { return 0; }
    //...
    template <typename MAYBE_DERIVED>
    MAYBE_DERIVED * isTypeOrSubtype () {
        //...dispatch to template specialization to call is_X()
    }
};

//...
class A : virtual public Base {
    //...
    A * is_A () { return this; }
};

On IDEONE, the suggested technique is 20 to 50 times faster than using dynamic cast.1 The implementation uses macros to allow a new class to be added to a single place, and the proper expansions to the base class occur in an automated way after that.

(1) - I originally clocked it closer to 100 times as fast, but this was without the isTypeOrSubtype() wrapper method that I added to simulate the desired interface.

If flexibility has a higher value than performance, then a slightly less performant solution is to use a map to associate types and corresponding pointer values (having the pointer values removes the need for a dynamic cast). The map instance is maintained in the base class, and the associations are made by the constructors of the subclasses. Whether a regular map or a unordered_map is used will depend on how many subclasses virtually inherit the base class. I would presume the numbers will be small, so a regular map should suffice.

class Base {
    std::map<const char *, void *> children_;
    //...
    template <typename MAYBE_DERIVED>
    MAYBE_DERIVED * isTypeOrSubtype () {
        auto x = children_.find(typeid(MAYBE_DERIVED).name());
        return ((x != children_.end())
                ? static_cast<MAYBE_DERIVED *>(x->second)
                : 0);
    }
};

//...
class A : virtual public Base {
    //...
    A () { children_[typeid(A).name()] = this; }
    //...
};

On IDEONE, this second suggestion is 10 to 30 times faster the using dynamic cast. I don't think IDEONE compiles with optimizations, so I would expect the times to be closer to the first suggestion on a production build. The mechanism as implemented uses typeid(...).name() as the key to the map.2

(2) - This assumes that typeid(...).name() returns something similar to a string literal, and always returns the same string pointer when operating on the same type. If your system does not behave that way, you can modify the map to take a std::string as the key instead, but performance will be degraded.

1
On

dynamic_cast would work wonderfully for this!

Base *instance = //get the pointer from your collection;
A* ap = dynamic_cast<A*>(instance);
D* dp = dynamic_cast<D*>(instance);

if(ap) {
    //instance is an A or a subclass of A
}
if(dp) {
    //instance is a D or a subclass of D
}

This will work for more specific checks as well. So you could check for any type you want.

0
On

Some time ago I used something like this:

// the actual type is irrelevant, const char*, int, ...
// but const char* is great for debugging, when it contains the actual class name
typedef const char* TypeId;

class Base {

    // actually the type id is not the value, but its memory location !
    // the value is irrelevant (but its a good idea to set it to the class name)
    static TypeId s_myTypeId;

public:

    static TypeId* getClassType()          { return &s_myTypeId; }
    virtual TypeId* getInstanceType()      { return &s_myTypeId; }

    static TypeId* getClassBaseType()      { return NULL; }
    virtual TypeId* getInstanceBaseType()  { return NULL; }

    virtual bool isType( TypeId* type )            { return type==getInstanceType(); }
    virtual bool isTypeOrSubType( TypeId* type )   { return isType(type); }

};


template< class MyBase >
class TBase : public MyBase {

    // actually the type id is not the value, but its memory location !
    // the value is irrelevant (but its a good idea to set it to the class name)
    static TypeId s_myTypeId;

public:

    static TypeId* getClassType()          { return &s_myTypeId; }
    virtual TypeId* getInstanceType()      { return &s_myTypeId; }

    static TypeId* getClassBaseType()      { return MyBase::getClassType(); }
    virtual TypeId* getInstanceBaseType()  { return MyBase::getInstanceType(); }

    virtual bool isType( TypeId* type )            { return type==getInstanceType(); }
    virtual bool isTypeOrSubType( TypeId* type )   { return isType(type) || MyBase::isTypeOrSubType(type); }

};

// by deriving from TBase<Base>, a new instantiation of s_myTypeId was created,
// so the class now has its very own unique type id,
// and it inherited all the type resolution magic
class A : public TBase<Base> {
};

// NOTE: class B must not derive directly from A, but from TBase<A>
// imagine a hidden class between B and A,
// actually B inherits from the TBase<A> instantiation, which in turn inherits from A
class B : public TBase<A> {
};

// you will also need to instantiate the static members
// hereby the memory location will be reserved,
// and on execution that memory location becomes the unique type id
#define IMPLEMENT_RTTI(CL) TypeId CL::s_myTypeId = STRFY(CL)

// one per class per source file:
IMPLEMENT_RTTI(Base);
IMPLEMENT_RTTI(A);
IMPLEMENT_RTTI(B);

// example usage:
A a;
B b;

b.getInstanceType()==B::getClassType();     // TRUE
b.getInstanceBaseType()==A::getClassType(); // TRUE
B::getClassBaseType()==A::getClassType();   // TRUE

b.isType( B::getClassType() );              // TRUE
b.isType( A::getClassType() );              // FALSE

b.isTypeOrSubType( B::getClassType() );     // TRUE
b.isTypeOrSubType( A::getClassType() );     // TRUE
b.isTypeOrSubType( Base::getClassType() );  // TRUE

It is safe, fast and easy to use. You just have to obey two rules:

  • do not inherit directly from a class X, but inherit from TBase<X>,
  • and add an IMPLEMENT_RTTI(Me) to source file.

There is one drawback: it does not yet support multiple inheritance. But it would be possible with a few changes.

And probably the TypeId type should be composed like typedef const char* TypeLoc and typedef TypeLoc* TypeId. Maybe just a question of taste.

10
On

I wrote an answer to my own question as this is a different approach to avoid RTTI but no real answer to a fast way of dynamic type/subtype check. This still isn't a clean solution, but the best I could think of until now.

If every class in this hierarchy has the following characteristics, I can skip most of the RTTI.

  1. every class should have a private member: static SecureVector<[class]*> s_Instances; where SecureVector<T> is a thread-safe vector.
  2. at the end of every constructor, s_Instances.push_back(this); should be called, to keep track of a newly created instance of that class
  3. at the beginning of the destructor, s_Instances.erase(this); should be called, to remove this instances reference
  4. every class should have a public function: static const SecureVector<[class]*>& Instances() { return s_Instances; } to get an unmodifiable vector containing all instances of this or any derived class

What this does is, every time a constructor is called, the instance adds itself to its own list of instances. When derived classes call their super constructor, the super class adds itself to its respective list of instances. E.g. if I randomly create 100 instances in the above hierarchy, there would allways be 100 entries in my Base class Instances() vector.

In code this would look like this:

class Base
{
    static SecureVector<Base*> s_Instances; // 1. condition

public:

    Base() 
    {
        s_Instances.push_back(this);    // 2. condition
    }

    ~Base()
    {
        s_Instances.erase(this);    // 3. condition
    }

    static const SecureVector<Base*>& Instances() { return s_Instances; } // 4. condition
};

This is still just as a workaround as the four conditions have to be added manually (or by macro or something like it).