I'm implementing a parser for a binary (image) file format. memoryview seems to be an almost perfect solution for this: the parser class keeps the data in a bytes object and passes sliced memoryviews to the code that parses substructures. Sometimes those substructures contain offsets relative to the entire file; this is handily supported because a memoryview exposes its underlying object through its obj attribute, so I can always get a memoryview to any file offset. All my memoryviews are contiguous.
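To make the setup concrete, here is a minimal sketch of the pattern described above. The names `data`, `sub`, and `absolute` and the offsets are made up for illustration; the only thing it relies on is that a sliced memoryview keeps a reference to the original object in `obj`.

```python
# Stand-in for the file contents; a real parser would read these from disk.
data = bytes(range(256))
view = memoryview(data)

# Zero-copy slice handed to the code that parses one substructure.
sub = view[12:100]
assert sub.obj is data      # the slice still refers to the original bytes object
assert sub.contiguous

# Inside the sub-parser: follow an offset that is relative to the whole file
# by building a fresh view on the underlying object.
absolute = memoryview(sub.obj)[200:204]
```

Note that `sub` knows which object it came from, but offers no attribute that says it covers `data[12:100]`.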
Now, I have a function that returns a bunch of these contiguous memoryviews which correspond to all the image data in the image file (as opposed to the metadata), and I would like to determine which ranges of the underlying bytes object—or, equivalently, the image file—they correspond to. In other words, I would like to extract from the memoryview information corresponding to "it is a view of underlying_bytes[12:100]".
It seems to me that a memoryview must internally hold enough information to compute the start and end offsets it covers in the underlying object. However, I cannot find any method or attribute that gives me access to this information.
Is there such a method? Would such a method on memoryview be a bad or impossible idea for some reason? Is there a convenient alternative to memoryview that would allow me to do this?
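For reference, one pure-Python alternative is to carry the offsets alongside the data instead of trying to recover them from the memoryview afterwards. The sketch below is only an illustration under that assumption: `Span` and its `sub` method are hypothetical names, not part of the standard library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical helper: a range of `data` that remembers its absolute offsets."""
    data: bytes
    start: int
    stop: int

    def __len__(self) -> int:
        return self.stop - self.start

    @property
    def view(self) -> memoryview:
        # Zero-copy view of exactly this range of the underlying bytes.
        return memoryview(self.data)[self.start:self.stop]

    def sub(self, start: int, stop: int) -> "Span":
        # `start` and `stop` are relative to this span; the result stores
        # offsets relative to the whole underlying object.
        if not 0 <= start <= stop <= len(self):
            raise ValueError("sub-span out of range")
        return Span(self.data, self.start + start, self.start + stop)


data = bytes(range(256))
whole = Span(data, 0, len(data))
pixels = whole.sub(12, 100).sub(4, 10)   # nested slicing keeps absolute offsets
```

Here `pixels.start` and `pixels.stop` give the file range directly, and `pixels.view` is still an ordinary contiguous memoryview. The cost is that every piece of parsing code has to pass `Span` objects around rather than bare memoryviews.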
This is not a full answer (if someone has more insight or a better solution, I will accept that answer instead), but here is a workaround using Cython. It would still be nice to be able to do this without Cython.
Usage: