gzip and pipe to output (performance consideration)

Q1) If I run gzip -c file | encrypt (some parameters):

a) does gzip write its output line by line and pipe it to encrypt as it goes, or

b) does gzip run to completion first, with the output then piped to encrypt all at once?

====================================================

Q2) Does running gzip | encrypt perform any better than running gzip first and then encrypt?

Regards, Noob

4
On BEST ANSWER

Gzip is a streaming compressor/decompressor. So (for large enough inputs) the compressor/decompressor starts writing output before it has seen the whole input.

That's one of the reasons gzip compression is used for HTTP compression. The sender can compress while it's still generating content; the recipient can work on decompressing the first part of the content, while still receiving the rest.

Gzip does not work "line-by-line", because it doesn't know what a line is. But it does work "chunk-by-chunk", where the compressor defines the size of the chunk.
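To see the chunk-by-chunk behaviour for yourself, here is a minimal sketch. It assumes Python's zlib module as a stand-in for the gzip program (same gzip/DEFLATE format, but not the same code, so the exact chunk sizes will differ). The function name stream_gzip is made up for this example.

```python
import gzip
import os
import zlib


def stream_gzip(chunks):
    """Feed chunks to a gzip-format compressor, recording output as it appears.

    Returns (progress, compressed): progress holds one (bytes_in, bytes_out)
    pair per input chunk, taken before the final flush.
    """
    # wbits=31 asks zlib for the gzip container (header plus CRC trailer).
    comp = zlib.compressobj(wbits=31)
    out = bytearray()
    progress = []
    bytes_in = 0
    for chunk in chunks:
        # compress() may return b"" while zlib is still buffering input.
        out += comp.compress(chunk)
        bytes_in += len(chunk)
        progress.append((bytes_in, len(out)))
    # flush() emits whatever is still buffered, followed by the trailer.
    out += comp.flush()
    return progress, bytes(out)


if __name__ == "__main__":
    # Random bytes barely compress, so output size tracks input size closely.
    chunks = [os.urandom(64 * 1024) for _ in range(16)]
    progress, compressed = stream_gzip(chunks)
    for bytes_in, bytes_out in progress:
        print(f"in={bytes_in:>8}  out so far={bytes_out:>8}")
    # Round-trip check: the streamed output is a valid gzip stream.
    assert gzip.decompress(compressed) == b"".join(chunks)
```

The "out so far" column grows while input is still being fed, well before flush() is called. The steps follow the compressor's internal buffering and have nothing to do with line boundaries.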

"Performance" is too vague a word, and too complex an area to give a yes or no answer.

With gzip -c file | encrypt, for a large enough file, you will see encrypt and gzip working concurrently. That is, encrypt will be encrypting the first compressed chunk before gzip compresses the last chunk of the file.

4
On

The size of a pipe buffer is implementation dependent. Under SunOS, it's 4 kB. That is, gunzip < file.gz | encrypt will move data in 4 kB chunks. Again, it depends on the OS; Cygwin might behave completely differently.

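I should add that this is covered in man 7 pipe. Search for PIPE_BUF, but note that PIPE_BUF is the largest write guaranteed to be atomic, not the buffer size itself. On Linux, PIPE_BUF is 4096 bytes, while the default pipe capacity has been 65,536 bytes since kernel 2.6.11.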
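If you want the numbers for your own system rather than from a man page, the sketch below may help. It assumes a Unix-like OS and Python 3.5 or later, and it probes two things: the atomic-write limit and how many bytes a pipe accepts before a writer would block. Both function names are made up for this example, and the second figure is an empirical estimate, not an official constant.

```python
import os


def atomic_write_limit():
    """Return PIPE_BUF for a freshly created pipe (Unix only)."""
    r, w = os.pipe()
    try:
        return os.fpathconf(w, "PC_PIPE_BUF")
    finally:
        os.close(r)
        os.close(w)


def measure_pipe_capacity():
    """Estimate pipe capacity by writing until a non-blocking write fails."""
    step = atomic_write_limit()
    block = b"\0" * step
    r, w = os.pipe()
    os.set_blocking(w, False)       # make a full pipe raise instead of block
    total = 0
    try:
        while True:
            try:
                total += os.write(w, block)
            except BlockingIOError:  # pipe is full: nobody is reading
                return total
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print("PIPE_BUF (atomic write limit):", atomic_write_limit())
    print("bytes accepted before blocking:", measure_pipe_capacity())
```

The two values may well differ, and both vary by OS, so treat any single figure quoted for "the" pipe buffer with some caution.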