Mapping of file to volume in duplicity/duply?


Duplicity backs up my files as duplicity-full.TIMESTAMP.vol*.difftar.gz chunks, where * is 1, 2, 3, and so on. Meanwhile, ~/.cache/duplicity/profile/duplicity-full.TIMESTAMP.manifest lists the volumes and the files:

Hostname striker
Localdir /data/pnlpipe3/ukftractography
Volume 1:
    StartingPath   .  
    EndingPath     .git/objects/pack 3188
    Hash SHA1 d77131425a74f6f10eb5bc89ee4277805fb35e68
Volume 2:
    StartingPath   .git/objects/pack
    EndingPath     build/ITK/.git/objects/pack 743
    Hash SHA1 a983bb4e0379d6304da7aec9739a609b0704d270
...
...
Filelist 129500
    new      .git/FETCH_HEAD
    new      .git/HEAD
    new      .git/ORIG_HEAD
...
...

But given a file, is there a duplicity command to find out which volume contains it? This matters when retrieving from Glacier Deep Archive. According to your man page, the user must manually migrate the storage class from Glacier to Standard before a file can be retrieved. If I do not know which volume contains my file, I do not know which volume to migrate. Migration has to be done by hand, clicking through the web interface, so migrating all volumes is not an option either.


There are 2 best solutions below

2
On

There is no command to find out which volume contains a file. Duplicity derives this internally from the manifest: a file belongs to a volume if its path falls between that volume's StartingPath and EndingPath. However, even with that information, duplicity would still need to work its way through the incremental files to restore the file entirely.
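The range check described above can be sketched in Python. This is not a duplicity command, only a minimal illustration under stated assumptions: paths are compared component by component (so . is the backup root and sorts first), manifest paths are unquoted and contain no whitespace, and the names to_index, parse_volumes and candidate_volumes are hypothetical. It looks at a single manifest, so later incremental manifests would have to be checked the same way.

```python
import re


def to_index(path):
    """Split a manifest path into components; '.' is the backup root."""
    return () if path == "." else tuple(path.split("/"))


def parse_volumes(manifest_text):
    """Return [(volume_number, start_index, end_index), ...] from a manifest."""
    volumes = []
    for line in manifest_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        match = re.fullmatch(r"Volume (\d+):", line.strip())
        if match:
            volumes.append({"number": int(match.group(1))})
        elif fields[0] in ("StartingPath", "EndingPath") and volumes:
            # Assumption: the path is unquoted and has no whitespace.
            # A trailing number (such as "3188") is ignored here.
            volumes[-1][fields[0]] = to_index(fields[1])
        # Any other line (Hostname, Hash, Filelist entries) is skipped.
    return [(v["number"], v["StartingPath"], v["EndingPath"]) for v in volumes]


def candidate_volumes(manifest_text, path):
    """List the volumes whose [StartingPath, EndingPath] range covers path."""
    index = to_index(path)
    return [
        number
        for number, start, end in parse_volumes(manifest_text)
        if start <= index <= end
    ]
```

For the manifest excerpt in the question, candidate_volumes(text, ".git/HEAD") returns [1], while a boundary path such as .git/objects/pack returns [1, 2] because that directory spans both volumes.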

So the bottom line is that you need to de-glacier the backup and let duplicity reassemble the file to its former state. See here for an answer on incremental backup that shows how duplicity stores backups.

The original boto+s3 backend would de-glacier the files, but it has been replaced by boto3+s3, which does not have that capability yet. We are looking for volunteers to port that functionality over.
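Until that is ported, a restore request can be issued outside duplicity instead of through the web interface. The sketch below is not part of duplicity. It uses the boto3 S3 client's restore_object call; the helper names build_restore_request and request_restore, and any bucket or key names passed in, are hypothetical. It assumes boto3 is installed and AWS credentials are configured.

```python
def build_restore_request(days=7, tier="Bulk"):
    """Build the RestoreRequest payload for S3 restore_object.

    Deep Archive accepts the "Standard" and "Bulk" retrieval tiers.
    """
    if tier not in ("Standard", "Bulk"):
        raise ValueError("tier must be 'Standard' or 'Bulk' for Deep Archive")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def request_restore(bucket, keys, days=7, tier="Bulk"):
    """Ask S3 to make temporary copies of the given archived objects."""
    import boto3  # third-party; imported here so the helper above has no dependency

    s3 = boto3.client("s3")
    request = build_restore_request(days, tier)
    for key in keys:
        # The restore is asynchronous: the object becomes readable
        # only after S3 has finished the retrieval job.
        s3.restore_object(Bucket=bucket, Key=key, RestoreRequest=request)
```

The keys would be the volume file names of every backup set in the chain (the full backup and the incrementals after it), since duplicity needs all of them to reassemble a file.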

0
On

Duplicity can't do this, but you can list the files in the volumes:

gpg -o - --decrypt /your/path/to/the/backup/duplicity-inc.20230612T061014Z.to.20230615T062756Z.vol1.difftar.gpg |tar t

The -o - option writes to STDOUT, and --decrypt takes the name of an encrypted file. tar t reads from STDIN and lists the contents of the tar archive.

If you iterate over all the volumes, you can build a file list this way. With incremental backups, a file will appear in multiple volumes.
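The loop over volumes could be scripted along these lines. This is a minimal sketch, assuming gpg is on the PATH and can decrypt the volumes without further setup (for example through gpg-agent); the helper names are hypothetical. Member names are recorded exactly as stored in the tar, and duplicity typically prefixes them (for example with snapshot/ or diff/), so the lookup matches on substrings rather than exact paths.

```python
import subprocess
import tarfile


def list_tar_members(stream):
    """Return the member names found in a tar stream (plain or compressed)."""
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        return [member.name for member in archive]


def list_volume(volume_path):
    """Decrypt one volume with gpg and list the tar members it contains."""
    proc = subprocess.Popen(
        ["gpg", "-o", "-", "--decrypt", volume_path],
        stdout=subprocess.PIPE,
    )
    try:
        return list_tar_members(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()


def build_index(volume_paths):
    """Map each volume file to the list of member names inside it."""
    return {path: list_volume(path) for path in volume_paths}


def volumes_mentioning(index, needle):
    """Return the volumes with at least one member name containing needle."""
    return sorted(
        volume
        for volume, names in index.items()
        if any(needle in name for name in names)
    )
```

Building the index requires every volume to be readable, so for archived storage it is only useful if it is created while the volumes are still local and then kept alongside the manifests.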

My example shows a volume of an incremental backup. Apparently a file can be split across multiple volumes, and rdiff is supposedly the tool used to piece them together again (I haven't tested this).

I must say the documentation is very sparse on how to fix problems and work around errors. If there are other, more mature programs out there (as in: if all goes wrong, there is documentation you can use to stitch things together again), use them.