Type erase down to a function call signature without risking wasteful memory allocations?


I want to have some code that can take any callable object, and I don't want to expose the implementation in a header file.

I do not want to risk memory allocation on the heap or free store (risk of throwing, and performance hit, or I'm in code with no access to the heap).

Not having value semantics is probably good enough: the call will usually complete before the end of the current scope. But value semantics might be useful if not too expensive.

What can I do?

Existing solutions have issues. std::function has value semantics but may allocate, and a raw function pointer cannot carry state. Passing a C-style pair of function pointer and void pointer is a pain for the caller. And if I do want value semantics, the C-style function pointer doesn't really work.

1 answer:

We can use type erasure without allocation by doing C-style vtables.

First, the vtable details in a private namespace:

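// headers used by the snippets in this answer:
#include <cstddef>      // std::size_t
#include <memory>       // std::addressof
#include <new>          // placement new
#include <type_traits>  // std::decay_t, std::enable_if_t, ...
#include <utility>      // std::forward, std::move

namespace details {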
  template<class R, class...Args>
  using call_view_sig = R(void const volatile*, Args&&...);

  template<class R, class...Args>
  struct call_view_vtable {
    call_view_sig<R, Args...> const* invoke = 0;
  };

  template<class F, class R, class...Args>
  call_view_sig<R, Args...>const* get_call_viewer() {
    return [](void const volatile* pvoid, Args&&...args)->R{
      F* pf = (F*)pvoid;
      return (*pf)(std::forward<Args>(args)...);
    };
  }
  template<class F, class R, class...Args>
  call_view_vtable<R, Args...> make_call_view_vtable() {
    return {get_call_viewer<F, R, Args...>()};
  }

  template<class F, class R, class...Args>
  call_view_vtable<R, Args...>const* get_call_view_vtable() {
    static const auto vtable = make_call_view_vtable<F, R, Args...>();
    return &vtable;
  }
}

Then the template itself. It is called call_view<Sig>, mirroring std::function<Sig>:

template<class Sig>
struct call_view;
template<class R, class...Args>
struct call_view<R(Args...)> {
  // check for "null":
  explicit operator bool() const { return vtable && vtable->invoke; }

  // invoke:
  R operator()(Args...args) const {
    return vtable->invoke( pvoid, std::forward<Args>(args)... );
  }

  // special member functions.  No need for move, as state is pointers:
  call_view(call_view const&)=default;
  call_view& operator=(call_view const&)=default;
  call_view()=default;

  // construct from invokable object with compatible signature:
  template<class F,
    std::enable_if_t<!std::is_same<call_view, std::decay_t<F>>{}, int> =0
    // todo: check compatibility of F
  >
  call_view( F&& f ):
    vtable( details::get_call_view_vtable< std::decay_t<F>, R, Args... >() ),
    pvoid( std::addressof(f) )
  {}

private:
  // state is a vtable pointer and a pvoid:
  details::call_view_vtable<R, Args...> const* vtable = 0;
  void const volatile* pvoid = 0;
};

In this case the vtable is a bit redundant: it is a structure containing nothing but a pointer to a single function. A vtable is worthwhile when we are erasing more than one operation; here we are erasing only one.

We can replace the vtable with that one operation. About half of the vtable work above can be removed, and the implementation is simpler:

template<class Sig>
struct call_view;
template<class R, class...Args>
struct call_view<R(Args...)> {
  explicit operator bool() const { return invoke; }
  R operator()(Args...args) const {
    return invoke( pvoid, std::forward<Args>(args)... );
  }

  call_view(call_view const&)=default;
  call_view& operator=(call_view const&)=default;
  call_view()=default;

  template<class F,
    std::enable_if_t<!std::is_same<call_view, std::decay_t<F>>{}, int> =0
  >
  call_view( F&& f ):
    invoke( details::get_call_viewer< std::decay_t<F>, R, Args... >() ),
    pvoid( std::addressof(f) )
  {}

private:
  details::call_view_sig<R, Args...> const* invoke = 0;
  void const volatile* pvoid = 0;
};

and it still works.
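A minimal sketch of how the single-pointer call_view behaves with a stateful lambda. The class is restated compactly so the block compiles on its own (the invoker lambda is inlined rather than taken from details::get_call_viewer), and apply_twice and demo_stateful are hypothetical names used only for illustration:

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Compact restatement of the single-pointer call_view, so the sketch is self-contained.
template<class Sig>
struct call_view;

template<class R, class... Args>
struct call_view<R(Args...)> {
  explicit operator bool() const { return invoke != nullptr; }
  R operator()(Args... args) const {
    return invoke(pvoid, std::forward<Args>(args)...);
  }

  call_view() = default;
  call_view(call_view const&) = default;
  call_view& operator=(call_view const&) = default;

  template<class F,
    std::enable_if_t<!std::is_same<call_view, std::decay_t<F>>{}, int> = 0>
  call_view(F&& f):
    // captureless lambda: converts to a plain function pointer
    invoke([](void const volatile* p, Args&&... args) -> R {
      auto* pf = (std::decay_t<F>*)p;  // C-style cast also strips cv, as in the answer
      return (*pf)(std::forward<Args>(args)...);
    }),
    pvoid(std::addressof(f))
  {}

private:
  R (*invoke)(void const volatile*, Args&&...) = nullptr;
  void const volatile* pvoid = nullptr;
};

// Hypothetical consumer: only the erased signature is visible here,
// so its body could live in a .cpp file.
int apply_twice(call_view<int(int)> f, int x) { return f(f(x)); }

// The lambda captures state by reference; no allocation takes place.
int demo_stateful() {
  int calls = 0;
  auto add_three = [&calls](int x) { ++calls; return x + 3; };
  int result = apply_twice(add_three, 1);  // 1 -> 4 -> 7
  return result * 10 + calls;              // 7 * 10 + 2
}
```

Because call_view stores only the address of the callable, the callable must outlive the view. Passing a temporary lambda as a function argument is fine, since the temporary lives until the end of the full expression, but storing a call_view built from a temporary in a named variable leaves it dangling.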

With a bit of refactoring, we can split the dispatch table (or functions) from the storage (owning or not). That separates the value/reference semantics of the type erasure from the set of operations being erased.

As an example, a move-only owning callable should reuse almost all of the above code. The fact that the data being type erased exists in a smart pointer, a void const volatile*, or in a std::aligned_storage can be separated from what operations you have on the object being type erased.
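To illustrate that separation, here is a rough sketch in which one erased invoker serves two different storage choices: a non-owning pointer and an owned copy in a local buffer. The names invoke_erased and demo_storage_split are hypothetical and not part of the code above:

```cpp
#include <new>
#include <utility>

// The erased operation: knows F, but not where the F object lives.
template<class F, class R, class... Args>
R invoke_erased(void* p, Args&&... args) {
  return (*static_cast<F*>(p))(std::forward<Args>(args)...);
}

int demo_storage_split() {
  auto add = [n = 10](int x) { return x + n; };
  using F = decltype(add);

  // One dispatch entry, shared by both storage strategies.
  int (*call)(void*, int&&) = &invoke_erased<F, int, int>;

  // Storage choice 1: non-owning view of an existing object.
  void* view = &add;
  int via_view = call(view, 1);    // 1 + 10

  // Storage choice 2: owned copy placed in a suitably aligned local buffer.
  alignas(F) unsigned char buf[sizeof(F)];
  F* owned = ::new(static_cast<void*>(buf)) F(add);
  int via_owned = call(owned, 2);  // 2 + 10
  owned->~F();                     // the owner is responsible for destruction

  return via_view * 100 + via_owned;
}
```

The invoker is identical in both cases; only the code that produces and cleans up the void* differs, which is the part a storage policy would own.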

If you need value semantics, you can extend the type erasure as follows:

namespace details {
  using dtor_sig = void(void*);

  using move_sig = void(void* dest, void*src);
  using copy_sig = void(void* dest, void const*src);

  struct dtor_vtable {
    dtor_sig const* dtor = 0;
  };
  template<class T>
  dtor_sig const* get_dtor() {
    return [](void* x){
      static_cast<T*>(x)->~T();
    };
  }
  template<class T>
  dtor_vtable make_dtor_vtable() {
    return { get_dtor<T>() };
  }
  template<class T>
  dtor_vtable const* get_dtor_vtable() {
    static const auto vtable = make_dtor_vtable<T>();
    return &vtable;
  }

  struct move_vtable:dtor_vtable {
    move_sig const* move = 0;
    move_sig const* move_assign = 0;
  };
  template<class T>
  move_sig const* get_mover() {
    return [](void* dest, void* src){
        ::new(dest) T(std::move(*static_cast<T*>(src)));
    };
  }
  // not all moveable types can be move-assigned; for example, lambdas:
  template<class T>
  move_sig const* get_move_assigner() {
    if constexpr( std::is_move_assignable<T>{} )
      return [](void* dest, void* src){
        *static_cast<T*>(dest) = std::move(*static_cast<T*>(src));
      };
    else
      return nullptr; // user of vtable has to handle this possibility
  }
  template<class T>
  move_vtable make_move_vtable() {
    return {{make_dtor_vtable<T>()}, get_mover<T>(), get_move_assigner<T>()};
  }
  template<class T>
  move_vtable const* get_move_vtable() {
    static const auto vtable = make_move_vtable<T>();
    return &vtable;
  }
  template<class R, class...Args>
  struct call_noalloc_vtable:
    move_vtable,
    call_view_vtable<R,Args...>
  {};
  template<class F, class R, class...Args>
  call_noalloc_vtable<R,Args...> make_call_noalloc_vtable() {
    return {{make_move_vtable<F>()}, {make_call_view_vtable<F, R, Args...>()}};
  }
  template<class F, class R, class...Args>
  call_noalloc_vtable<R,Args...> const* get_call_noalloc_vtable() {
    static const auto vtable = make_call_noalloc_vtable<F, R, Args...>();
    return &vtable;
  }
}
template<class Sig, std::size_t sz = sizeof(void*)*3, std::size_t algn=alignof(void*)>
struct call_noalloc;
template<class R, class...Args, std::size_t sz, std::size_t algn>
struct call_noalloc<R(Args...), sz, algn> {
  explicit operator bool() const { return vtable; }
  R operator()(Args...args) const {
    return vtable->invoke( pvoid(), std::forward<Args>(args)... );
  }

  call_noalloc(call_noalloc&& o):call_noalloc()
  {
    *this = std::move(o);
  }
  call_noalloc& operator=(call_noalloc&& o) {
    if (this == &o) return *this;
    // moving onto the same stored type, assign (check our own vtable first; it may be null):
    if (vtable && vtable == o.vtable && vtable->move_assign)
    {
      vtable->move_assign( &data, &o.data );
      return *this;
    }
    clear();
    if (o.vtable) {
      // moving onto a different type (or an empty target), construct:
      o.vtable->move( &data, &o.data );
      vtable = o.vtable;
    }
    return *this;
  }
  // destroy the stored object, if any:
  ~call_noalloc() { clear(); }
  call_noalloc()=default;

  template<class F,
    std::enable_if_t<!std::is_same<call_noalloc, std::decay_t<F>>{}, int> =0
  >
  call_noalloc( F&& f )
  {
    static_assert( sizeof(std::decay_t<F>)<=sz && alignof(std::decay_t<F>)<=algn );
    ::new( (void*)&data ) std::decay_t<F>( std::forward<F>(f) );
    vtable = details::get_call_noalloc_vtable< std::decay_t<F>, R, Args... >();
  }

  void clear() {
    if (!*this) return;
    vtable->dtor(&data);
    vtable = nullptr;
  }

private:
  void* pvoid() { return &data; }
  void const* pvoid() const { return &data; }
  details::call_noalloc_vtable<R, Args...> const* vtable = 0;
  std::aligned_storage_t< sz, algn > data;
};

Here we create a bounded buffer of memory to store the object in. This version only supports move semantics; the recipe to extend it to copy semantics should be obvious.
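As a rough sketch of that extension, a copy entry can be built the same way as the move entry, using the copy_sig shape declared earlier. The namespace details_sketch and the function copy_through_erased_ops are hypothetical names for illustration, and the block restates the small pieces it needs so it compiles on its own:

```cpp
#include <new>
#include <string>

namespace details_sketch {
  using copy_sig = void(void* dest, void const* src);
  using dtor_sig = void(void*);

  // Placement-copy-construct a T at dest from the T living at src.
  template<class T>
  copy_sig* get_copier() {
    return [](void* dest, void const* src) {
      ::new(dest) T(*static_cast<T const*>(src));
    };
  }
  template<class T>
  dtor_sig* get_dtor() {
    return [](void* x) { static_cast<T*>(x)->~T(); };
  }
}

// Exercise the erased copy on two raw buffers, as call_noalloc would on its data member.
std::string copy_through_erased_ops() {
  using T = std::string;
  alignas(T) unsigned char a[sizeof(T)];
  alignas(T) unsigned char b[sizeof(T)];

  auto* copy = details_sketch::get_copier<T>();
  auto* dtor = details_sketch::get_dtor<T>();

  ::new(static_cast<void*>(a)) T("erased");  // source object
  copy(b, a);                                // erased copy-construct into b
  std::string out = *std::launder(reinterpret_cast<T*>(b));
  dtor(b);
  dtor(a);
  return out;
}
```

A copyable variant of call_noalloc would presumably add a copy pointer to its vtable and call it from a copy constructor and copy assignment operator, following the same clear-then-construct pattern as the move operations.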

This has an advantage over std::function in that you get a hard compiler error if the buffer is too small or insufficiently aligned for the object in question. And because the type never allocates, you can afford to use it in performance-critical code without risking allocation delays.
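The compile-time check can be sketched in isolation. The variable template fits_in_buffer below is a hypothetical helper that mirrors the static_assert in the call_noalloc constructor, with the same default size and alignment; small_state and big_state are made-up stand-ins for callables with captured state:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: true when a decayed F fits the buffer call_noalloc would use.
template<class F,
         std::size_t sz = sizeof(void*) * 3,
         std::size_t algn = alignof(void*)>
constexpr bool fits_in_buffer =
  sizeof(std::decay_t<F>) <= sz && alignof(std::decay_t<F>) <= algn;

struct small_state { void* a; void* b; };  // two pointers of captured state
struct big_state   { void* p[8]; };        // eight pointers of captured state

static_assert(fits_in_buffer<small_state>, "fits in the default buffer");
static_assert(!fits_in_buffer<big_state>, "too large for the default buffer");
static_assert(fits_in_buffer<big_state, sizeof(void*) * 8>,
              "fits once the caller asks for a larger buffer");
```

With call_noalloc itself, the second case would fail to compile at the point of construction, and the fix is visible in the type: raise the sz parameter or shrink the captured state.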

Test code:

#include <iostream>

void print_test( call_view< void(std::ostream& os) > printer ) {
    printer(std::cout);
}

int main() {
    print_test( [](auto&& os){ os << "hello world\n"; } );
}

Live example with all 3 tested.