C1, C2, ... are callback classes.
They derive from a common interface CBase with the callback CBase::f().
All of them override CBase::f() with the final modifier.
I have to register ~50 instances of any class derived from C1, and ~50 instances of any class derived from C2.
(See @@ in the code below for an example.)
Main objective: when I call allF(), C1::f() / C2::f() of every registered instance has to be called.
Here is a simplified version that works (Full demo):
#include <iostream>
#include <vector>
class CBase{
    public: virtual void f(){std::cout<<"CBase"<<std::endl;}
};
class C1 : public CBase{
    public: virtual void f() final{std::cout<<"C1"<<std::endl;}
};
class C2 : public CBase{
    public: virtual void f() final{std::cout<<"C2"<<std::endl;}
};
This is the callback registration:
//-------- begin registering -----
std::vector<CBase*> cBase;
void regis(CBase* c){
    cBase.push_back(c);
}
void allF(){ //must be super fast
    for(auto ele:cBase){
        ele->f(); //#
    }
}
int main() {
    C1 a;
    C1 b;
    C2 c; //@@
    //or ... class C2Extend : public C2{}; C2Extend c;
    regis(&a);
    regis(&b);
    regis(&c);
    allF(); //print C1 C1 C2
}
Problem
According to the profiling results, if I can avoid the v-table cost at #, I would get a significant performance gain.
How can I do it elegantly?
My poor solution
A possible workaround is to create one array per CX (Full demo):
//-------- begin registering -----
std::vector<C1*> c1s;
std::vector<C2*> c2s;
void regis(C1* c){
    c1s.push_back(c);
}
void regis(C2* c){
    c2s.push_back(c);
}
void allF(){ //must be super fast
    for(auto ele:c1s){
        ele->f(); //#
    }
    for(auto ele:c2s){
        ele->f(); //#
    }
}
int main() {
    C1 a;
    C1 b;
    C2 c;
    regis(&a);
    regis(&b);
    regis(&c);
    allF(); //print C1 C1 C2
}
It is much faster.
However, it does not scale well.
After a few development cycles, C3, C4, etc. were born.
I have to create std::vector<C3*>, std::vector<C4*>, ... manually.
My approach leads to maintainability hell.
More information (edited)
In the worst case, there are at most 20 classes (C1 to C20).
In the real case, C1, C2, ... are special types of data structures.
All of them require special initialization (f()) at a precisely correct time.
Their instances are constructed in various .cpp files.
Thus, an array storage std::vector<CBase*> cBase; caching all of them would be useful.
For example, C1 is a 1:1 map, C2 is a 1:N map, and C3 is an N:N map.
Together with a custom allocator, I can achieve unearthly data locality.
One more note: I don't care about the order of the callbacks. (Thanks, Fire Lancer.)
Your "poor solution" starts looking much better when you automate it using templates. Our goal: store c1s, c2s, etc. in a single vector.

To do this, we need to map derived types to consecutive integers. A simple way to do that is to use a global counter, and a function template that increments and stores it every time it is instantiated.
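As a rough sketch of that counter idea: the namespace detail_callbacks and the name indexForType come from this answer's text, while nextIndex, the use of std::size_t, and the assumption that everything runs on a single thread are illustrative choices, not necessarily the original listing.

```cpp
#include <cassert>
#include <cstddef>

namespace detail_callbacks {
    // Global counter: the next index that has not been handed out yet.
    // (Hypothetical name; plain, non-atomic, so single-threaded use is assumed.)
    std::size_t nextIndex = 0;

    // Each instantiation owns its own function-local static, which is
    // initialized exactly once, on the first call for that T.
    template <class T>
    std::size_t indexForType() {
        static const std::size_t index = nextIndex++;
        return index;
    }
}
```

With this in place, the first type queried gets index 0, the next new type gets index 1, and asking again for a type that was already seen returns its earlier index.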
The first call to indexForType<T>() will reserve a new index for T, and return the same one on subsequent calls.

Then, we need a way to erase enough information about our callback vectors so we can store them and call the correct f on them.

call will hold a function that iterates over the pointers, downcasts them and calls f. Just like your solution, this factors out all of the calls to a single type into only one virtual call.

CbVec could hold CBase * instead of void *, but I'll explain that choice later.

Now we need a function to populate groups upon requesting a Group for some type:

Here you can see that we use a lambda expression to generate the downcasting functions. The reason I've chosen to store void *'s instead of CBase *'s is that the performance-sensitive downcast in there becomes a no-op, while a base-to-derived cast might have required pointer adjustments (and further complications in case of virtual inheritance).

Finally, the public API. All of the above has been defined inside namespace detail_callbacks, and we just need to put the pieces together:

And there you go! New derived callbacks are now automatically registered.
See it live on Coliru
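For reference, the pieces described above could be assembled roughly as follows. This is a sketch under assumptions rather than the exact code behind the link. The names indexForType, CbVec, Group, call, groups, detail_callbacks, regis and allF come from the text. The names nextIndex and groupFor, the exact layout of Group, and the global trace string (used in place of std::cout so the call order can be checked) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for std::cout: records which f() ran, in order.
std::string trace;

class CBase {
public:
    virtual ~CBase() = default;
    virtual void f() { trace += "CBase "; }
};
class C1 : public CBase {
public:
    void f() final { trace += "C1 "; }
};
class C2 : public CBase {
public:
    void f() final { trace += "C2 "; }
};

namespace detail_callbacks {
    std::size_t nextIndex = 0;  // assumed single-threaded registration

    // Reserves a new index on the first call for T, then keeps returning it.
    template <class T>
    std::size_t indexForType() {
        static const std::size_t index = nextIndex++;
        return index;
    }

    // Type-erased storage: void* round-trips to T* without pointer adjustment.
    using CbVec = std::vector<void*>;

    struct Group {
        CbVec callbacks;
        void (*call)(CbVec&);  // knows the concrete T behind the void*
    };

    std::vector<Group> groups;

    // Hypothetical helper: returns the Group for T, creating it on first use.
    template <class T>
    Group& groupFor() {
        const std::size_t index = indexForType<T>();
        if (index == groups.size()) {
            // Captureless lambda converts to a plain function pointer.
            groups.push_back(Group{CbVec{}, [](CbVec& callbacks) {
                for (void* p : callbacks)
                    static_cast<T*>(p)->f();  // f() is final, so the compiler may devirtualize
            }});
        }
        assert(index < groups.size() && "indexForType used outside groupFor");
        return groups[index];
    }
}

// Public API: T is deduced from the static type of the pointer.
template <class T>
void regis(T* callback) {
    detail_callbacks::groupFor<T>().callbacks.push_back(callback);
}

void allF() {
    // One indirect call per group instead of one virtual call per object.
    for (auto& group : detail_callbacks::groups)
        group.call(group.callbacks);
}
```

Registering two C1 objects and one C2 object and then calling allF() should leave "C1 C1 C2 " in trace, with exactly two groups created. Because regis deduces T from the pointer's static type, a class such as C2Extend gets its own group when registered through a C2Extend*. Whether the inner call is actually devirtualized depends on the compiler and optimization level.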