Google Cloud AI Platform training job: --stream-logs stalls indefinitely with no output


I am submitting a training job with

gcloud ai-platform jobs submit training [...] --stream-logs

The job is submitted successfully, but no logs appear in the terminal. Nothing is printed after "Job [...] submitted successfully.", yet the command does not terminate either.

When I check the web console, I can see that the job is running and producing logs, but none of them show up in the terminal where I ran the command. Even after the job completes successfully, the command still does not terminate.

The same happens if I first run gcloud ai-platform jobs submit training without --stream-logs and then run gcloud ai-platform jobs stream-logs on the new job.
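For concreteness, the two-step variant looks roughly like the sketch below. The job name `my_training_job` and the `DRY_RUN` wrapper are placeholders added for illustration only, and the remaining submit arguments are elided exactly as above.

```shell
# Hypothetical job name; substitute the name used at submission time.
JOB_NAME="${JOB_NAME:-my_training_job}"
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it unchanged.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: submit without --stream-logs (remaining arguments elided):
#   gcloud ai-platform jobs submit training "$JOB_NAME" [...]

# Step 2: attach to the logs of the job that was just submitted.
run gcloud ai-platform jobs stream-logs "$JOB_NAME"
```

Running it with `DRY_RUN=0` executes the second step for real, and that is the call that hangs in the same way.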

Does anyone know why this might be happening, or how I can fix it?


EDIT: I left the command running for a while. About 20 minutes after the job had finished successfully, all the logs suddenly appeared at once and the command terminated. So it does work, more or less, but the logs seem to take a very long time to sync.
