I'm trying to make a rope data structure. It's a type of binary tree, i.e. a recursive data structure.
The purpose of a rope is that splitting and concatenation should be fast, which means you avoid copying entire ropes.
So, for example, a user should be able to say rope1 + rope2
and expect a result in ~logarithmic time.
However, this presents a problem:
If a rope is modified, its parents are indirectly modified as well.
Because my goal is to make rope
a drop-in replacement for string
, this is not acceptable.
My solution to this is: Whenever there is any kind of change to a rope
, I would create a new node, with a slight change, leaving the old ones unmodified.
In theory, I think this would work rather well.
In practice, though, it involves a heap allocation for (almost?) every modification of the strings.
Even a one-character change would result in a new heap object, which is not only slow by itself, but which also significantly decreases memory locality, very negatively impacting performance.
How should I go about solving this problem?
If you want to avoid frequent heap allocation performance issues, then I'd suggest using a memory pool for your class that allocates a chunk of memory, and only needs to request a new allocation from the OS when it's full, which should occur pretty infrequently. You can then optimize accesses to the memory pool for allocating small-block data-types like
char
, etc.Andrei Alexandrescu has a great example of a small-block memory allocator in his "Modern C++ Design" book.