How do I delete a single file from a tar.gz archive


I have a huge tarball with an excessively large or corrupt error_log that causes extraction to hang. Is there a way to remove this file from the archive before unzipping, or to extract the archive without that specific file, from the Mac OS X terminal?

I found a post on how to efficiently remove files from a large .tgz, but when I tried the --delete flag I received this error:

tar: Option --delete is not supported

Is there a way to:

  1. remove the file from the archive without unzipping it?
  2. extract the archive but exclude the file?

There are 5 best solutions below

0
On BEST ANSWER

As mentioned in the comments, it's not possible to remove the file with the tar that ships with Mac OS X (it does not support --delete), but you can exclude the file when extracting:

tar -zxvf file.tar.gz --exclude "file_to_exclude"
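
If you want to check that the exclusion does what you expect before running it on the real archive, a small throwaway test can help. This is only a sketch: the directory layout, the file names, and demo.tar.gz are made up for illustration. It lists the archive first (tar -t) so you can see the exact member names, then extracts everything except error_log into a separate directory.

```shell
#!/bin/sh
# Sketch only: all paths and file names here are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/src/logs" "$work/out"
echo "keep me"  > "$work/src/index.html"
echo "huge log" > "$work/src/logs/error_log"

# Build a small test archive to stand in for the real one.
tar -czf "$work/demo.tar.gz" -C "$work/src" .

# List the members first; this shows the names that --exclude must match.
tar -tzf "$work/demo.tar.gz"

# Extract everything except error_log into a separate directory.
tar -xzf "$work/demo.tar.gz" --exclude='error_log' -C "$work/out"

# Show what was actually extracted.
ls -R "$work/out"
```

Listing with -t only reads the archive headers, so it is also a cheap way to confirm that the problem file is really in there and how its path is spelled.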
1
On

You can delete the archive file itself the same way you would remove a directory, using the command below. Note that this removes the whole archive, not a single file inside it.

Command: rm -rf archive_file_name (-r: recursive, -f: force, no prompt)

0
On

I wanted to remove the jdk directory from the elasticsearch-oss archive with a one liner, and this is what I came up with:

gzip -d elasticsearch-oss-7.10.1-linux-x86_64.tar.gz -c | tar --delete --wildcards '*/jdk' | gzip - > /tmp/tmp.$$.tar.gz && mv /tmp/tmp.$$.tar.gz elasticsearch-oss-7.10.1-linux-x86_64.tar.gz

I further refined this to include the download:

curl -Ss https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-oss-7.10.1-linux-x86_64.tar.gz | gzip -d - -c | tar --delete --wildcards '*/jdk' | gzip - > elasticsearch-oss-7.10.1-linux-x86_64.tar.gz

Works a treat on Ubuntu 20.04, which ships GNU tar; GNU tar does not support the @ sign syntax.
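
For anyone who wants to try the streaming approach on something small first, here is a sketch of the same pipeline with a safety check. It assumes GNU tar (the Mac OS X tar will reject --delete, as in the question), and the demo directory, file names, and temp file are all made up. The rewritten archive only replaces the original if its member count dropped by exactly one.

```shell
#!/bin/sh
# Sketch only: requires GNU tar; all paths and names below are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/src/demo/logs"
echo "config"   > "$work/src/demo/config.yml"
echo "huge log" > "$work/src/demo/logs/error_log"
tar -czf "$work/demo.tar.gz" -C "$work/src" demo

# Decompress, drop one member while streaming, recompress to a temp file.
tmp=$(mktemp "$work/rewrite.XXXXXX")
gzip -dc "$work/demo.tar.gz" \
    | tar --delete -f - demo/logs/error_log \
    | gzip > "$tmp"

# Replace the original only if exactly one member disappeared.
before=$(tar -tzf "$work/demo.tar.gz" | wc -l)
after=$(tar -tzf "$tmp" | wc -l)
if [ "$after" -eq "$((before - 1))" ]; then
    mv "$tmp" "$work/demo.tar.gz"
else
    echo "unexpected member count, keeping the original archive" >&2
    rm -f "$tmp"
fi

tar -tzf "$work/demo.tar.gz"
```

The count check matters because a plain sh pipeline only reports the exit status of its last command, so a failure in the tar step would otherwise go unnoticed.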

0
On

I did that in three steps. Hopefully it will help others in the future.

gzip -d file.tar.gz
tar -f file.tar --delete folder1/file1.txt --delete folder2/file2.txt
gzip -9 file.tar

If you have multiple archives, use the loop below. Each archive must contain all the files you want to delete, or tar will give an error.

for f in *.tar.gz
do
        echo "Processing file $f"
        gzip -d "$f"
        tar -f "${f%.*}" --delete folder1/file1.txt --delete folder2/file2.txt
        gzip -9 "${f%.*}"
done
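
If some of the archives might not contain the file, one way around the error is to ask tar whether the member exists before deleting it. A rough sketch, assuming GNU tar; the two test archives and the folder1/file1.txt target are made up so the loop has something to run against:

```shell
#!/bin/sh
# Sketch only: requires GNU tar; archive names and contents are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/a/folder1" "$work/b"
echo "one"  > "$work/a/folder1/file1.txt"
echo "keep" > "$work/a/keep.txt"
echo "keep" > "$work/b/keep.txt"
tar -czf "$work/a.tar.gz" -C "$work/a" folder1/file1.txt keep.txt
tar -czf "$work/b.tar.gz" -C "$work/b" keep.txt   # does not contain the target

target="folder1/file1.txt"
for f in "$work"/*.tar.gz
do
    tarfile="${f%.gz}"
    gzip -d "$f"
    # tar -t exits non-zero when the named member is not in the archive.
    if tar -tf "$tarfile" "$target" >/dev/null 2>&1; then
        tar -f "$tarfile" --delete "$target"
        echo "Removed $target from $f"
    else
        echo "Skipping $f: $target not found"
    fi
    gzip -9 "$tarfile"
done
```

Archives without the target are still decompressed and recompressed here, which is wasteful for big files but keeps the loop simple.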
2
On

You can repackage it like this:

tar -czvf ./new.tar.gz --exclude='._*' @old.tar.gz

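I used ._* to remove all ._ files, but you can use any pattern you like, including a full path, directory, or filename. Note that the @archive syntax is a bsdtar feature (the tar that ships with Mac OS X); GNU tar does not support it.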
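
If your tar has neither @archive nor --delete, a slower fallback is to extract into a staging directory with --exclude and then build a new archive from it. This is a different technique from the one-liner above, shown only as a sketch: the site directory, the *.log pattern, and the archive names are made up. Extracting and re-archiving as a normal user may not preserve original ownership, so treat it as a last resort.

```shell
#!/bin/sh
# Sketch only: all paths, names, and the exclude pattern are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/src/site" "$work/stage"
echo "page" > "$work/src/site/index.html"
echo "junk" > "$work/src/site/error.log"
tar -czf "$work/old.tar.gz" -C "$work/src" site

# Step 1: extract everything except the unwanted files into a staging dir.
tar -xzf "$work/old.tar.gz" --exclude='*.log' -C "$work/stage"

# Step 2: build a new archive from the staging dir.
tar -czf "$work/new.tar.gz" -C "$work/stage" site

# Show the contents of the repackaged archive.
tar -tzf "$work/new.tar.gz"
```

This needs enough free disk space for the extracted files, which is the main cost compared with repackaging in a single pass.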