I am copying a LOGFILE to a remote server as it is being created.
tail -f LOGFILE | gzip -c >> /faraway/log.gz
However, when the original LOGFILE is closed and moved to a storage directory, my tail -f seems to get some odd data.
How can I ensure that tail -f stops cleanly and that the compressed file /faraway/log.gz is a true copy of LOGFILE?
EDIT 1
I did a bit more digging.
/faraway/log.gz terminated badly, halfway through a FIX message. This must be because I pressed Ctrl-C on the whole piped command above.
If I ignore this last line, then the original LOGFILE and log.gz match EXACTLY! That's for a 40G file transmitted across the Atlantic.
I am pretty impressed by that, as it does exactly what I want. Does any reader think I was just "lucky" in this case? Is this likely NOT to work in future?
Now I just need to get a clean close of gzip. Perhaps sending a kill -9 to the tail PID, as suggested below, may allow gzip to finish its compression properly.
To get a full copy, use tail with the -n +1 option, which starts output at the first line of the file. If you don't use the -n +1 option, you only get the tail part of the file.
Yet this does not solve the deleted/moved file problem. In fact, the deleting/moving file problem is an IPC (inter-process communication) problem, or an inter-process co-operation problem. If you don't have the correct behavior model of the other process(es), you can't resolve the problem.
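The difference the -n +1 option makes can be checked with a small experiment. This is only a sketch: the temporary file and its 20 numbered lines are made up for illustration and have nothing to do with the real LOGFILE.

```shell
# Hypothetical demo file; its name and contents are invented for this sketch.
demo_log=$(mktemp)
seq 1 20 > "$demo_log"

# Plain tail prints only the last 10 lines by default.
default_lines=$(( $(tail "$demo_log" | wc -l) ))

# With -n +1, tail starts at line 1 and prints the whole file.
full_lines=$(( $(tail -n +1 "$demo_log" | wc -l) ))

echo "default: $default_lines lines, with -n +1: $full_lines lines"
rm -f "$demo_log"
```

Adding -f to the second form keeps following the file after the existing content has been printed, which is what a full live copy needs.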
For example, if the other program COPIES the log file somewhere else, then deletes the current one, and then writes its output to a new log file, your tail obviously cannot read that output.
A related feature of unix (and unix-like systems) worth mentioning: a process that already has a file open can keep reading it after the file is deleted.
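As a small sketch of the unix behavior in question, the shell itself can hold a file open, remove the file's name, and still read the data through the open descriptor. The temporary file and the descriptor number 3 are arbitrary choices for this demo, not anything from the original setup.

```shell
# Hypothetical demo file; name and contents are for illustration only.
demo_log=$(mktemp)
printf 'line one\nline two\n' > "$demo_log"

# Open the file on descriptor 3, then remove its name from the directory.
exec 3< "$demo_log"
rm -f "$demo_log"

# The data is still readable through the already-open descriptor.
after_delete=$(cat <&3)
exec 3<&-

echo "$after_delete"
```

A running tail -f is in the same position: it keeps its descriptor on the original file, so it can read what is written to that file, but it will not see anything written to a new file created later under the old name.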
The missing-lines problem you mentioned is interesting, and may need more analysis of the behavior of the programs/processes which generate and move/delete the log file.
--update--
Happy to see you made some progress. Like I said, on a unix-like system a process like tail can still access the data after the file is deleted.
You can use
( echo $BASHPID > /tmp/PID_tail; exec tail -n +1 -f yourLogFile ) | gzip -c - > yourZipFile.gz
to gzip your log file, and kill the tail program using the PID saved in /tmp/PID_tail. The gzip should then finish by itself without error.
Even if you are worried that gzip will receive a broken-pipe signal, there is an alternative form that guards against it: the pipe is protected by a true, which prints nothing and simply exits.
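The whole cycle (start the pipeline, record tail's PID, stop tail, let gzip finish) can be tried on throwaway files before trusting it with a real log. The sketch below rests on a few assumptions of mine: all paths are temporary and made up, tail is stopped with the default SIGTERM rather than kill -9, and the PID is captured with sh -c and $$ instead of $BASHPID so that the demo also runs under a plain POSIX sh. It is a swapped-in equivalent of the command above, not the same command.

```shell
# Sketch only: every path here is a temporary, made-up name.
work=$(mktemp -d)
log="$work/yourLogFile"
zip="$work/yourZipFile.gz"
pidfile="$work/PID_tail"
intact=no
match=no

printf 'first\nsecond\n' > "$log"

# The inner sh writes its own PID, then exec replaces it with tail,
# so the saved PID is the PID of tail itself.
sh -c 'echo $$ > "$1"; exec tail -n +1 -f "$2"' sh "$pidfile" "$log" \
    | gzip -c > "$zip" &
pipeline=$!            # PID of gzip, the last command in the pipeline

sleep 1
printf 'third\n' >> "$log"
sleep 2                # give tail time to pick up the appended line

# Stop tail; gzip then sees end-of-input and finishes the archive.
kill "$(cat "$pidfile")"
wait "$pipeline"

# Check that the archive is well formed and matches the log byte for byte.
gzip -t "$zip" && intact=yes
gzip -dc "$zip" | cmp -s - "$log" && match=yes
echo "intact=$intact match=$match"
rm -rf "$work"
```

The two checks at the end are the useful part in practice: gzip -t confirms the compressed stream was closed properly, and the cmp against the decompressed output confirms it is a true copy of the log.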