I am using a software package which generates serialized Python shelves.

On the remote machine where the shelves are generated, I can open and process them without problems. However, when I copy them to my local machine, they can no longer be opened.

I traced the problem to the dbm submodules (https://docs.python.org/3.1/library/dbm.html). On the remote machine, calling dbm.whichdb() on the shelf (a single file, data.db) returns 'dbm.ndbm', so ndbm appears to be installed. Because the file has a .db extension rather than the .pag/.dir pair, I suspect that the ndbm interface there is actually backed by the third-party Oracle Berkeley DB. I base this on the following source code from the __init__.py file in the dbm library:

def whichdb(filename):
    """Guess which db package to use to open a db file.

    Return values:

    - None if the database file can't be read;
    - empty string if the file can be read but can't be recognized
    - the name of the dbm submodule (e.g. "ndbm" or "gnu") if recognized.

    Importing the given module may still fail, and opening the
    database using that module may still fail.
    """

    # Check for ndbm first -- this has a .pag and a .dir file
    try:
        f = io.open(filename + ".pag", "rb")
        f.close()
        f = io.open(filename + ".dir", "rb")
        f.close()
        return "dbm.ndbm"
    except OSError:
        # some dbm emulations based on Berkeley DB generate a .db file
        # some do not, but they should be caught by the bsd checks
        try:
            f = io.open(filename + ".db", "rb")
            f.close()
            # guarantee we can actually open the file using dbm
            # kind of overkill, but since we are dealing with emulations
            # it seems like a prudent step
            if ndbm is not None:
                d = ndbm.open(filename)
                d.close()
                return "dbm.ndbm"
        except OSError:
            pass
...

On my local machine, running the same code produces three files: data.bak, data.dat, and data.dir. Calling dbm.whichdb() on them returns 'dbm.dumb'. Calling dbm.whichdb() on the files copied from the remote machine returns None, which according to the documentation means that the database file cannot be read.
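The local behaviour can be reproduced with a small sketch. It is only an illustration: the helper name `whichdb_demo` and the file names are hypothetical, and it uses `dbm.dumb` directly rather than the software package that generates the real shelves.

```python
import dbm
import dbm.dumb
import os
import tempfile


def whichdb_demo():
    """Create a dbm.dumb database and ask whichdb() about it.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "data")

        # dbm.dumb is pure Python, so it is available on every install.
        db = dbm.dumb.open(base, "c")
        db[b"key"] = b"value"
        db.close()

        # whichdb() takes the base name, without any extension.
        detected = dbm.whichdb(base)

        # A base name with no matching files cannot be read at all.
        missing = dbm.whichdb(os.path.join(tmp, "no_such_db"))

    return detected, missing


if __name__ == "__main__":
    print(whichdb_demo())
```

Note that `whichdb()` expects the base name (`data`), not `data.db`. For a `.db` file it only reports `'dbm.ndbm'` when the `ndbm` module could be imported, as the excerpt above shows.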

I suspect that I am lacking something to open these databases.

In the dbm library, the dumb.py file contains a full implementation, whereas ndbm.py contains only:

"""Provide the _dbm module as a dbm submodule."""

from _dbm import *

I think that something else must be required to make the ndbm submodule usable.
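Since `dbm.ndbm` is only a thin wrapper around the compiled `_dbm` extension, one way to check whether it is usable on a given machine is to try importing each backend. The following is a minimal diagnostic sketch; the helper name `available_dbm_backends` is hypothetical, and it only tells whether a module imports, not which underlying library (for example Berkeley DB) it was built against.

```python
import importlib


def available_dbm_backends():
    """Return a dict mapping each dbm submodule name to True/False.

    Hypothetical helper: True means the submodule can be imported
    in the running interpreter.
    """
    status = {}
    for name in ("dbm.gnu", "dbm.ndbm", "dbm.dumb"):
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # The C extension behind this submodule (_gdbm or _dbm)
            # was not built for this Python installation.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in available_dbm_backends().items():
        print(f"{name}: {'available' if ok else 'missing'}")
```

Running this on both machines would show whether `dbm.ndbm` is missing locally. If it is, `whichdb()` cannot return `'dbm.ndbm'` for a `.db` file there.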

How can I open these ndbm / Berkeley DB databases?
