g++ compiler hints to allocate on stack


Is there any way to hint to the compiler that some objects behave in a mostly static way, so that it allocates them on the stack instead of the heap? For example, a string object might have an effectively constant size inside some functions.

I'm asking because I'm trying to improve the performance of an application by using OpenMP. I've already brought the serial version down from 50 to 20 seconds, and it drops to 12 seconds with parallelism (most of the code can run in parallel). I'm trying to improve it further. I think one limitation is the continuous allocation and release of dynamic memory inside the same process. The serial optimizations so far consisted of moving to a more ANSI C approach with more hardcoded allocation of variables (they are still allocated dynamically, but sized for a worst-case scenario, so everything is allocated once). Now I'm pretty much stuck, because I've reached a part of the code that relies heavily on C++ idioms.


1
On

The standard std::basic_string template (of which std::string is the instantiation for char) takes an allocator as its third template argument, so you could supply your own stack-based allocator instead of std::allocator. That would be brittle and tricky, though: you could use alloca(3), but only if every allocation call is inlined into the function that owns the string. If it is not, alloca won't work as you want, because its memory is released as soon as the function that called it returns. I don't recommend this approach.

A more viable approach could be to write your own arena- or region-based allocator. See std::allocator_traits.
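As a rough illustration of the arena idea, here is a minimal sketch of a bump-pointer allocator plugged into std::basic_string. The names Arena, ArenaAllocator, ArenaString and arena_bytes_for are hypothetical and not from any library. The sketch assumes C++11, a buffer aligned for std::max_align_t, and a single thread per arena (with OpenMP, each thread would need its own).

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical arena: hands out slices of a caller-owned buffer and never
// frees them individually. All memory is reclaimed when the buffer dies.
class Arena {
public:
    Arena(unsigned char* buf, std::size_t size) : buf_(buf), size_(size), used_(0) {}
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t bytes, std::size_t align) {
        // Round the current offset up to the requested alignment.
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start > size_ || bytes > size_ - start) throw std::bad_alloc();
        used_ = start + bytes;
        return buf_ + start;
    }
    std::size_t used() const { return used_; }

private:
    unsigned char* buf_;
    std::size_t size_;
    std::size_t used_;
};

// Minimal allocator; std::allocator_traits fills in everything else.
template <class T>
struct ArenaAllocator {
    using value_type = T;

    explicit ArenaAllocator(Arena& a) noexcept : arena(&a) {}
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) noexcept {}  // no-op: the arena is released as a whole

    Arena* arena;
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return a.arena == b.arena;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return !(a == b);
}

using ArenaString = std::basic_string<char, std::char_traits<char>, ArenaAllocator<char>>;

// Builds a string entirely inside a 512-byte stack buffer and reports how
// many arena bytes were consumed. Text longer than 128 characters forces a
// regrowth that may exhaust the arena and throw std::bad_alloc.
std::size_t arena_bytes_for(const char* text) {
    alignas(std::max_align_t) unsigned char storage[512];
    Arena arena(storage, sizeof storage);
    ArenaAllocator<char> alloc(arena);
    ArenaString s(alloc);
    s.reserve(128);  // a single bump allocation out of `storage`
    s += text;
    return arena.used();
}
```

Because deallocate does nothing, a string that grows repeatedly wastes arena space; reserving the worst-case size up front, as in the question's C-style approach, avoids that.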

You could perhaps simply use the C function snprintf(3) on a large enough local buffer (e.g. char buf[128];).

2
On

I think you are looking for a small buffer optimization. A detailed description can be found here. The basic idea is to add a union to the class that holds either a pointer to heap storage or a small in-object buffer:

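#include <cstddef>

class string
{
  // The two members share storage: short strings live in _local,
  // longer ones are heap-allocated and reached through _begin.
  union Buffer
  {
    char* _begin;
    char  _local[16];
  };

  Buffer      _buffer;
  std::size_t _size;
  std::size_t _capacity;
  // ...
};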
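
To make the idea concrete, here is a minimal, read-only sketch of such a class. The name SmallString and its interface are hypothetical. It assumes that a capacity below the local buffer size marks the in-object case, and it leaves out copying, moving and growth.

```cpp
#include <cstddef>
#include <cstring>

class SmallString {
public:
    explicit SmallString(const char* text) : size_(std::strlen(text)) {
        if (size_ < kLocalCapacity) {
            // Short string: store it inside the object, no heap allocation.
            capacity_ = kLocalCapacity - 1;
            std::memcpy(buffer_.local, text, size_ + 1);
        } else {
            // Long string: fall back to the heap.
            capacity_ = size_;
            buffer_.begin = new char[size_ + 1];
            std::memcpy(buffer_.begin, text, size_ + 1);
        }
    }
    ~SmallString() {
        if (!is_local()) delete[] buffer_.begin;
    }
    SmallString(const SmallString&) = delete;
    SmallString& operator=(const SmallString&) = delete;

    // The capacity tells us which union member is currently active.
    bool is_local() const { return capacity_ < kLocalCapacity; }
    const char* c_str() const { return is_local() ? buffer_.local : buffer_.begin; }
    std::size_t size() const { return size_; }

private:
    enum : std::size_t { kLocalCapacity = 16 };

    union Buffer {
        char* begin;                  // heap storage for long strings
        char  local[kLocalCapacity];  // in-object storage for short strings
    };

    Buffer      buffer_;
    std::size_t size_;
    std::size_t capacity_;
};
```

With this layout a string of up to 15 characters plus the terminating NUL never allocates.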
0
On

So you are looking for a static analysis that finds performance deficiencies and regressions?

It's a good idea. cppcheck has a few checks of that kind, but they are very rudimentary. I'm not aware of any tool that does this so far.

There are however tools that do different things:

jemalloc

jemalloc has an allocation profiler (see http://www.canonware.com/jemalloc/). Perhaps this is of some help to you. I haven't tried it myself so far, but I would expect it to report object lifetimes and the objects that put the highest pressure on the allocator, so you can fix the most painful parts first.

Cachegrind

Valgrind also has a cache and branch prediction simulator: http://valgrind.org/docs/manual/cg-manual.html

clang-check

If you find yourself with too much free time, you can try writing your own checking tools using clang-check.

Google perftools

The Google perf tools also include a heap profiler: https://code.google.com/p/gperftools/