How to combine multiple tar.gz files into one .tar.gz?


I want to combine multiple tar.gz files into one .tar.gz file with bash.

I have a cron job that regularly creates .sql.tar.gz files. Combining them should increase the compression ratio significantly.

By now there are hundreds of them, and uncompressing every file would exhaust the free disk space.

Is there a way to just append content to an archive, for example by extracting one file at a time and appending it to the archive?

#!/bin/bash

tar -czvf all_dbbackups.tar.gz "$HOME/dbbackups/"

This just adds all the .tar.gz files unchanged and does not create a newly compressed archive.

Mark Adler On

Yes. You can decompress each .tar.gz, keep all but the last 1024 bytes (which are zeros and which terminate the tar file) of all but the last one, concatenate those together, and then recompress. This can all be done streaming, so only the final result need be saved to storage.

Here is an example in bash:

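#!/bin/bash
# Merge the .tar.gz files given as arguments into merged.tar.gz.
rm -f merged.tar.gz
for tgz in "$@"
do
    # Uncompressed size of this archive, in bytes.
    len=$(gzip -dc < "$tgz" | wc -c)
    # Drop the final 1024 zero bytes (the end-of-archive marker).
    cut=$((len - 1024))
    gzip -dc < "$tgz" | head -c "$cut" | gzip >> merged.tar.gz
done
# Terminate the combined archive with a single end-of-archive marker.
head -c 1024 /dev/zero | gzip >> merged.tar.gz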
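
The loop above decompresses each archive twice (once to measure it, once to copy it) and appends each piece as a separate gzip member, so every piece is still compressed on its own. A possible single-pass variant is sketched below. It is only a sketch under assumptions: the function name merge_tgz is hypothetical, it relies on GNU head accepting a negative byte count ("all but the last N bytes"), and it assumes each input ends in exactly 1024 zero bytes. GNU tar pads archives to its record size (10240 bytes by default), so inputs made with default settings carry extra zero blocks; in that case the result should be listed or extracted with tar -i (--ignore-zeros).

```shell
#!/bin/bash
# merge_tgz OUT IN... (hypothetical helper, not part of the original answer)
# Streams every input once, strips its 1024-byte end-of-archive marker,
# and compresses the whole concatenation as a single gzip stream.
merge_tgz() {
    local out=$1 tgz
    shift
    {
        for tgz in "$@"; do
            # GNU head: a negative count prints all but the last 1024 bytes.
            gzip -dc < "$tgz" | head -c -1024
        done
        # One end-of-archive marker for the combined archive.
        head -c 1024 /dev/zero
    } | gzip > "$out"
}
```

It could be called as merge_tgz all_dbbackups.tar.gz "$HOME"/dbbackups/*.tar.gz. Because the output is one gzip stream, it may compress somewhat better than separately compressed pieces, and only the final file is written to storage.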

Take care, since this does not check for duplicate file names in the multiple tar files. If there are any, only the last one will be retained when the combined .tar.gz is extracted.
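
One way to check for such collisions before merging is to list the member names of every input and print those that occur more than once. The sketch below assumes a tar that can list gzip-compressed archives with -tzf; the function name list_duplicate_members is hypothetical.

```shell
#!/bin/bash
# list_duplicate_members IN... (hypothetical helper)
# Prints each member name that appears more than once across the given
# .tar.gz archives; prints nothing when all names are unique.
list_duplicate_members() {
    local tgz
    for tgz in "$@"; do
        tar -tzf "$tgz"
    done | sort | uniq -d
}
```

Running list_duplicate_members "$HOME"/dbbackups/*.tar.gz with no output suggests the merged archive will not overwrite anything on extraction. Directory entries that several archives share will also be reported, which is usually harmless.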

It is possible to do something even more sophisticated, which avoids the need to recompress, except for very small amounts at the end of each archive. However there is not room for that answer in this margin.