Intentionally leak the memory of a std::vector

I need to find a way to intentionally leak (take ownership of) the internal pointer of a std::vector so that the buffer outlives the original container and can later be deleted manually.

Why? I'm working on a networked application using the C ENet library that needs to send large amounts of packets in a short amount of time.

I create network messages by writing the data to a std::vector<unsigned char>.

Then, to create a "packet", I use the enet_packet_create function, which takes a pointer to the byte array to be sent and its size. In its normal mode of operation, the function simply copies the given array into a new heap allocation, but there is also a "no allocate" option that takes only the pointer and size and leaves freeing the data to the creator via a callback function. That is exactly what I'm trying to achieve: the data is already in the vector, ready to be used, so there is no need to copy it again, which could be costly.

There are 4 best solutions below

0
On

You don't need to leak anything. Use the userData field of the ENetPacket structure to store a pointer to the to-be-deleted std::vector, and delete it in the callback:

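#include <cstdint>
#include <vector>
#include <enet/enet.h>

void myCallback(ENetPacket *pkt) {
    // Recover the vector stored in userData; deleting it frees the packet data.
    delete static_cast<std::vector<uint8_t> *>(pkt->userData);
}

void sendData() {
    // Create the vector on the heap so it is not destroyed when this function returns;
    // it lives until the callback is called.
    auto *data = new std::vector<uint8_t>;
    // Fill data here
    ENetPacket *pkt = enet_packet_create(data->data(), data->size(), ENET_PACKET_FLAG_NO_ALLOCATE);
    if (pkt == NULL) {
        delete data;  // packet creation failed, so the callback will never run
        return;
    }
    pkt->userData = data;
    pkt->freeCallback = myCallback;
    // Send pkt here; ENet invokes myCallback when it destroys the packet.
}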

A userData void pointer is a common way to hold opaque user data for use in callbacks, so that the user of the library can retrieve the context in which the callback was called.

It can point to anything (it is a void*): a state-holder structure used for complex logic in the callback, or just a data pointer that needs to be freed, as in your case.


From your comments, you say that you don't want to dynamically allocate the vector.

Just remember that the data inside the vector has itself been dynamically allocated (unless a custom allocator has been used), and the ENetPacket structure has also been dynamically allocated (the flag only tells ENet not to allocate the data, not the structure).


Finally, if you know (or can precompute) the size of the data, a different approach would be to create the packet passing a NULL data pointer.

The function enet_packet_create will create the data buffer, and you can just fill the data directly in the packet buffer, without needing a different buffer and then copying it to the packet.

0
On

This approach is not possible, even if vector<T> provided an interface to let you abscond with its memory. Let's get into why.

Your problem exists because the site where you're going to free the memory is not given arbitrary data. It is only given a pointer to the memory to be freed. If this were not the case, then you'd just pass a pointer to the vector<T> itself to this location, or otherwise smuggle in a vector<T> object itself.

In order to abscond with a vector<T>'s memory and successfully free it, you would have to play by vector<T>'s rules. Which means:

  1. You have to respect the size/capacity distinction. Not all of the memory allocated for a vector<T> actually contains live Ts. So you have to know how many live Ts there are in that memory, so that you can call their destructors properly (we'll get to an issue with that later).

    Now sure, for the very specific case of unsigned char, calling destructors is irrelevant, since they're trivial. But vector<T>'s interface needs to be uniform; if you can abscond with a vector<unsigned char>'s memory, then you must be able to abscond with any vector<T> in the same way. So any absconding interface must provide not just a pointer to the data, but also the size and capacity so that you can properly destroy the members of the container.

  2. You have to respect the Allocator. Remember: the template is vector<T, Allocator>, where Allocator is the type that does the memory allocation/deallocation, as well as creating/destroying the actual Ts in the vector. And since you're allowed to provide a specific instance of a particular Allocator type, any absconding interface must store that specific Allocator object (or a copy/move thereof) so that the allocation can be freed.

    Again, the specific case of vector<unsigned char> doesn't care, because the default allocator std::allocator just uses ::operator new/delete to allocate/deallocate memory, and direct placement-new/destructor calls to create/destroy the Ts. But again, a general absconding interface must work with any T and any Allocator. So it must account for all of that.

Which means that, at the end of the day, when you abscond with a vector's memory, that interface must provide an object that stores a pointer to the allocation, the number of live elements in that allocation, the size of that allocation (since the Allocator interface requires that), and the Allocator instance (or copy/move thereof) to use to destroy/deallocate the object.
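The bookkeeping listed above can be sketched as a small type. This is only an illustration: StolenBuffer and destroy_and_free are made-up names, no such interface exists on std::vector, and the sketch assumes the allocator uses plain T* pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical type: everything a "stolen" vector buffer would have to carry
// so that it could be destroyed correctly later.
template <class T, class Allocator = std::allocator<T>>
struct StolenBuffer {
    T* data;               // pointer to the allocation (assumes plain pointers)
    std::size_t size;      // number of live elements
    std::size_t capacity;  // size of the allocation, required by deallocate()
    Allocator alloc;       // allocator (or a copy of it) that made the allocation

    void destroy_and_free() {
        assert(size <= capacity);
        using Traits = std::allocator_traits<Allocator>;
        for (std::size_t i = 0; i < size; ++i) {
            Traits::destroy(alloc, data + i);  // destroy only the live Ts
        }
        Traits::deallocate(alloc, data, capacity);
        data = nullptr;
        size = capacity = 0;
    }
};
```

Such a type ends up storing exactly what a vector<T, Allocator> already stores.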

In short, absconding with a vector<T, Allocator>'s memory means creating a vector<T, Allocator>.

Which you can't do, as stated above. You have arrived at an inherently contradictory situation.

There are two solutions:

  1. Change your code so that you can smuggle in a vector<T> to the location that frees the memory. This could be done via some global/class-scoped/etc. map from pointer-to-data to a vector<unsigned char>*, or via some other mechanism. You'll have to figure it out, because it depends on specific aspects of the system that you have not presented (this is the definition of the XY Problem).

  2. Stop using vector<unsigned char>. Instead, just heap-allocate an array of unsigned char, which you can destroy just fine.

0
On

The following is not an answer! It's yet another attempt to convince you to rethink your approach, but it's too long for a comment. (Having said that, I must say that I love this type of hack when it's just for fun, but I hate it even more strongly when it goes into production code.)

From the OP, the motivation for using the "no alloc" option is to avoid allocating memory and copying bytes inside enet_packet_create. This raises the question: why use a vector?

If you create a vector but do not fix its capacity (with reserve or resize) from the beginning and instead let it grow as you add elements, then each time the capacity increases the vector will allocate memory and copy bytes, which is exactly what you want to avoid.
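A minimal sketch of that effect (the helper name count_reallocations is made up for this illustration): it counts how often the vector's capacity changes while bytes are appended, with and without an up-front reserve.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n bytes and counts how many times the
// vector's capacity changed, i.e. how many reallocations took place.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<unsigned char> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation up front
        assert(v.capacity() >= n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<unsigned char>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;  // the buffer was reallocated and its bytes copied
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set to true the count is 0; without it, the exact number depends on the implementation's growth strategy.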

Perhaps you know from the beginning what the final size of the vector will be. In this case, you can avoid all copies and memory allocations (but one) by reserving that size from the beginning. But then why not simply use new[] and delete[], as Quentin has suggested? You wouldn't have to steal memory since it would be yours. Even better, you can create a unique_ptr<unsigned char[]> (consider make_unique<unsigned char[]>), use its release method just before calling enet_packet_create to "steal" the memory, and later call delete[] to free the memory.
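A sketch of that last suggestion, with the ENet call itself left out. The names make_buffer and free_buffer are hypothetical; free_buffer stands in for what the packet's free callback would do with the data pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical: builds a message in a buffer owned by a unique_ptr, then
// releases ownership so the raw pointer can be handed to a C API.
unsigned char* make_buffer(const char* text, std::size_t& size_out) {
    assert(text != nullptr);
    size_out = std::strlen(text);
    auto buffer = std::make_unique<unsigned char[]>(size_out);
    std::memcpy(buffer.get(), text, size_out);
    // After release(), the unique_ptr no longer frees the memory; the caller
    // is now responsible for calling delete[] exactly once.
    return buffer.release();
}

// Hypothetical: what the free callback would do with the packet's data pointer.
void free_buffer(unsigned char* data) {
    delete[] data;
}
```

Until release() is called, the unique_ptr still cleans up if building the message throws, which is the advantage over a bare new[].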

8
On

I need to find a way to intentionally leak the internal pointer of a std::vector

The only way to leak the internal buffer of a std::vector is to leak the vector itself. Example:

std::vector<T>* ptr = new std::vector<T>;
ptr = nullptr; // memory leaked successfully

But leaking memory is not a good idea in general.

I did not literally mean to create a memory leak, the memory needs to be freed.

In this case, the only solution is to make sure that the lifetime of the std::vector is longer than the usage of the buffer. A vector always releases the buffer it owns on destruction, and there is no way to extract ownership from it except into another vector.

One way to achieve that is this:

// stored somewhere with guaranteed longer lifetime than any packet
std::unordered_map<unsigned char*, std::vector<unsigned char>> storage;

void foo()
{
    std::vector<unsigned char> vec;
    // fill vec here
    unsigned char* ptr = vec.data();
    storage[ptr] = std::move(vec);
    // captureless lambda: storage is a global, so no capture is needed
    auto destroy_callback = [](unsigned char* ptr) {
        storage.erase(ptr);
    };
    // pass ptr and destroy_callback into some async API
}

You could use a pool allocator to avoid redundant allocations for each packet.

Example adapted from this answer (now that this question has shifted from leaking to transferring ownership, this is close to a duplicate). There's also an alternative suggestion in another answer to that same question, which uses a custom allocator that "steals" the ownership.