I'm trying to write Python code that pulls HashiCorp Nomad task logs for the raw_exec driver incrementally (only the new logs since the last pull). The incremental pull runs every 10-20 seconds.
To achieve this, I'm using the offset from the last pull, as described in the API reference.
I use the following request to pull the first log with startoffset=0, and for subsequent pulls I use the last offset I received:
response2 = requests.get(f"{nomadapiurl}client/fs/logs/{alloc['ID']}?task={task}&type={logtype}&origin=start&offset={startoffset}")
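For reference, the same URL can be built with encoded query parameters. This is only a minimal sketch: `build_logs_url` and its argument names are illustrative, not part of any Nomad client library, and the base URL is assumed to end with a slash, as in the request above.

```python
from urllib.parse import urlencode


def build_logs_url(nomad_api_url, alloc_id, task, log_type, offset):
    """Build the logs URL from the request above, URL-encoding the query values.

    Hypothetical helper: the names are illustrative. nomad_api_url is assumed
    to end with a slash.
    """
    query = urlencode(
        {"task": task, "type": log_type, "origin": "start", "offset": offset}
    )
    return f"{nomad_api_url}client/fs/logs/{alloc_id}?{query}"
```

The result can be passed to `requests.get(...)` exactly as in the line above.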
The task's log files are limited in size and rotate every few minutes, and since the offset is not relative to a specific file, I'm not sure how to pull the new logs generated since the last pull.
Any help will be appreciated.
Nomad does not keep state of the log files. You have to save the stream content, diff it, find the position in the logs where you last were, and start streaming from there. You can download the whole log file with `nomad alloc fs`.
The output of `nomad alloc logs --help` even mentions this. The important phrase there, "best efforted", means that it opens the log stream, waits for some time, and only after that time starts printing the logs, hoping these are only fresh log lines.
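The "save, diff, find where you were" idea could look roughly like the sketch below. It is an illustration under assumptions, not something Nomad provides: `split_new` and `tail_chars` are made-up names, and the approach assumes the saved tail is long enough to be unique in the log.

```python
def split_new(previous_tail, fetched, tail_chars=4096):
    """Return (new_text, new_tail) for one incremental pull.

    previous_tail: the last characters seen on the previous pull ("" at first).
    fetched: the log text downloaded now (for example the whole current file).
    If the saved tail is not found (e.g. the logs rotated past it), everything
    is returned, so the caller may see duplicates but does not lose lines.
    """
    if not fetched:
        # Nothing downloaded: keep the old marker.
        return "", previous_tail
    # Use the last occurrence of the marker; -1 means "not found".
    pos = fetched.rfind(previous_tail) if previous_tail else -1
    if pos == -1:
        new_text = fetched
    else:
        new_text = fetched[pos + len(previous_tail):]
    return new_text, fetched[-tail_chars:]
```

On each pull, pass in the tail returned by the previous call. Note that highly repetitive logs can fool this kind of matching, which is one more reason it is only best effort.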
You might be interested in setting up Nomad log collection, for example with Grafana Loki and promtail. To solve the problem you are having, promtail runs on the host, reads the text log files in the Nomad data directory, and saves the file positions where it stopped, so it knows where to resume streaming.
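If you want to do the same thing yourself in Python, the position-saving idea can be sketched as below. This is a simplified illustration, not how promtail is actually implemented: `read_new_text` and the JSON positions file are assumptions, and it only handles truncation of a single file, not Nomad's rotation across several files.

```python
import json
import os


def read_new_text(log_path, positions_path):
    """Return text appended to log_path since the previous call.

    The byte offset reached is stored in a small JSON file (positions_path),
    so the next call, even after a restart, resumes from the same place.
    """
    positions = {}
    if os.path.exists(positions_path):
        with open(positions_path) as f:
            positions = json.load(f)

    offset = positions.get(log_path, 0)
    if os.path.getsize(log_path) < offset:
        # File is shorter than the saved position: it was truncated or
        # replaced, so start again from the beginning.
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Save the new end position for the next call.
    positions[log_path] = offset + len(data)
    with open(positions_path, "w") as f:
        json.dump(positions, f)

    return data.decode("utf-8", errors="replace")
```

Calling this every 10-20 seconds on a task's log file returns only what was written since the previous call.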
A bit of self-promotion: in my `nomad-watch` project you should be able to run `nomad-watch -f -n0 job thejob` to stream only the newest logs. What it does is wait 0.5 seconds to let Nomad transfer the logs, and only then start streaming logs. But in this case it should do exactly the same as `nomad alloc logs -tail -n 0`.
Also see https://github.com/hashicorp/nomad/blob/61941d820448d1b83e16f726c51c14cab30986e1/command/alloc_logs.go#L238 and https://github.com/hashicorp/nomad/blob/61941d820448d1b83e16f726c51c14cab30986e1/command/alloc_logs.go#L295 . Nomad uses a delay of 1 second (`1*time.Second`) before starting to stream "new" logs.
Bottom line, what is really missing is any kind of structure to the logs that would store the timestamp of each printed line. Consider opening an issue (or finding an existing one) at https://github.com/hashicorp/nomad/issues, so we can upvote.