C++ - Managing References in Disk Based Vector

222 Views Asked by At

I am developing a set of vector classes that all derived from an abstract vector. I am doing this so that in our software that makes use of these vectors, we can quickly switch between the vectors without any code breaking (or at least minimize failures, but my goal is full compatibility). All of the vectors match.

I am working on a Disk Based Vector that mostly conforms to match the STL Vector implementation. I am doing this because we need to handle large out of memory files that contain various formats of data. The Disk Vector handles data read/write to disk by using template specialization/polymorphism of serialization and deserialization classes. The data serialization and deserialization has been tested, and it works (up to now). My problem occurs when dealing with references to the data.

For example,

Given a DiskVector dv, a call to dv[10] would get a point to a spot on disk, then seek there, read out the char stream. This stream gets passed to a deserializor which converts the byte stream into the appropriate data type. Once I have the value, I my return it.

This is where I run into a problem. In the STL, they return it as a reference, so in order to match their style, I need to return a reference. What I do it store the value in an unordered_map with the given index (in this example, 10). Then I return a reference to the value in the unordered_map.

If this continues without cleanup, then the purpose of the DiskVector is lost because all the data just gets loaded into memory, which is bad due to data size. So I clean up this map by deleting the indexes later on when other calls are made. Unfortunately, if a user decided to store this reference for a long time, and then it gets deleted in the DiskVector, we have a problem.

So my questions

  • Is there a way to see if any other references to a certain instance are in use?
  • Is there a better way to solve this while still maintaining the polymorphic style for reasons described at the beginning?
  • Is it possible to construct a special class that would behave as a reference, but handle the disk IO dynamically so I could just return that instead?
  • Any other ideas?
1

There are 1 best solutions below

0
On

So a better solution at what I was trying to do is to use SQLite as the backend for the database. Use BLOBs as the column types for key and value columns. This is the approach I am taking now. That said, in order to get it to work well, I need to use what cdhowie posted in the comments to my question.