Python 3.8 lzma decompress huge file incremental input and output


I am looking to do, in Python 3.8, the equivalent of:

xz --decompress --stdout < hugefile.xz > hugefile.out

where neither the input nor output might fit well in memory.

As I read the documentation at https://docs.python.org/3/library/lzma.html#lzma.LZMADecompressor, I could use LZMADecompressor to process incrementally available input, and I could use its decompress() function to produce output incrementally.

However, it seems that LZMADecompressor returns its entire decompressed output as a single memory buffer, and that decompress() reads its entire compressed input from a single in-memory buffer.

Granted, the documentation confuses me as to when the input and/or output can be incremental.

So I figure I will have to spawn a separate child process to execute the "xz" binary.

Is there any way of using the lzma Python module for this task?
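For reference, decompress() does accept a max_length argument that caps the size of each returned chunk, and the decompressor's needs_input attribute signals when more compressed input should be fed. A minimal sketch of a fully incremental loop (the function name and chunk sizes here are illustrative, not from the question):

```python
import lzma

def decompress_stream(src, dst, read_size=64 * 1024, max_out=64 * 1024):
    """Decompress an .xz stream from src to dst with bounded memory.

    decompress() is called with max_length so each call returns at most
    max_out bytes; when more output is pending from input already fed,
    needs_input is False and the next call can pass b"" to drain it.
    """
    d = lzma.LZMADecompressor()
    while not d.eof:
        if d.needs_input:
            data = src.read(read_size)
            if not data:
                raise EOFError("truncated .xz stream")
        else:
            data = b""  # drain pending output from earlier input
        dst.write(d.decompress(data, max_length=max_out))
```

Called with src = open("hugefile.xz", "rb") and dst = open("hugefile.out", "wb"), this should keep only on the order of read_size + max_out bytes in memory at a time.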

There is 1 answer below.


Instead of using the low-level LZMADecompressor, use lzma.open to get a file object. Then you can copy the data into another file object with the shutil module:

import lzma
import shutil

with lzma.open("hugefile.xz", "rb") as fsrc:
    with open("hugefile.out", "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst)

Internally, shutil.copyfileobj reads and writes the data in chunks, and the LZMA decompression happens on the fly. This avoids ever holding the whole decompressed data in memory.
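copyfileobj also takes an optional buffer-size argument if the default chunk size (64 KiB on most platforms in CPython) needs tuning for a very large file. A variant of the snippet above with an explicit 1 MiB chunk size (the demo file creation at the top is only there to make the sketch self-contained; in the real use case hugefile.xz already exists):

```python
import lzma
import shutil

# Create a small demo .xz file so the snippet runs standalone;
# skip this step when hugefile.xz already exists.
with lzma.open("hugefile.xz", "wb") as f:
    f.write(b"example data " * 1000)

# Same streaming copy, but passing copyfileobj's optional
# buffer-size argument to read/write in 1 MiB chunks.
with lzma.open("hugefile.xz", "rb") as fsrc, \
        open("hugefile.out", "wb") as fdst:
    shutil.copyfileobj(fsrc, fdst, 1024 * 1024)
```

Larger chunks can reduce syscall overhead on fast disks; memory use stays bounded by the chunk size either way.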