I'm doing some testing with flock and pkill for a test.sh script that I'm calling from cron, and I ran into something I don't understand.
The test.sh script is scheduled as a * * * * * job in cron. It's a very simple script that, for testing purposes, writes a timestamp to a file and then sleeps for 5 minutes. This is to confirm that flock is working and preventing multiple processes for the same script.
This part is working well, as I only see one timestamp showing up per 5 minutes despite test.sh being scheduled to run every minute.
Now, as an extra safety measure, I want to be able to kill test.sh (because the script I actually want to use sometimes appears to hang while syncing some files to S3 using the AWS CLI). So I figured pkill would be the easiest, as it doesn't require modifying my existing script.
If I run pkill -9 -f test.sh, it says the process is killed. Running ps aux | grep test.sh, I indeed don't see any test.sh processes anymore.
However, as cron is supposed to run test.sh every minute, I expect that after killing the process, it would start again in less than a minute. Instead, it appears that the script doesn't actually restart until the sleep period is over.
So the script initially runs at e.g. 12:00, and sleep will last until 12:05. If I kill the script at 12:02, I expect it to run again at 12:03, but it doesn't actually run again until 12:05, which is in line with the sleep period.
Why is this happening? Also, if pkill is not recommended, is there any other way to kill my processes after a certain amount of time? Preferably without having to edit the original script.
See the following example:
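A minimal sketch of such a script, assuming hypothetical paths /tmp/test.lock and /tmp/test.log and a sleep that defaults to 5 seconds so the sketch finishes quickly (the question sleeps for 5 minutes; SLEEP_SECS is an illustrative variable, not something from the question). It is laid out so that the line references in the explanation apply to it: FD 9 is opened on line 1, flock is on line 2, and sleep is on line 7.

```shell
exec 9> /tmp/test.lock    # open FD 9 on the lock file (hypothetical path)
flock -n 9 || exit 1      # lock FD 9, or quit if another run holds the lock

# Record one timestamp per run (hypothetical log path).
date >> /tmp/test.log
# Stand-in for the long-running work; set SLEEP_SECS=300 to mimic the question.
sleep "${SLEEP_SECS:-5}"  # sleep is a child process and inherits FD 9
```

If this is saved as test.sh and started with SLEEP_SECS=300, killing it with pkill -9 -f test.sh and then running flock -n /tmp/test.lock true should keep failing until the orphaned sleep exits.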
Line 1 opens FD 9 on the lockfile. Line 2's flock sets a lock on the FD. Line 7's sleep inherits the FD and keeps it locked. When you pkill the .sh script, it does not kill sleep, because pkill -f test.sh only matches the shell running the script, so the FD stays locked until sleep finishes. So, to clean up, you need to kill every process that was started after flock and is still running.
flock(1) uses flock(2), and according to flock(2):