Buildbot shows pending builds after latent slave shutdown


We are using Buildbot in conjunction with Amazon EC2 to run nightly UI automation tests, and we are running into an issue where some of the builders still show pending builds even though their buildslaves finished successfully.

The problem is intermittent: it affects only some of the builders, and different ones each time. It doesn't impact the runs, but it is annoying to have to go in and manually clear the pending builds each morning. When we had only 20 slaves running, it wasn't a huge issue, but we now have almost 100 running each night, so it's becoming quite a chore.

Here is the basic process flow:

  • Scheduled task puts build request information into Amazon SQS queue.
  • Python script listening on the EC2 instance where buildbot is installed picks up the message, issues the sendchange command to buildbot to start the build, and sends the spot instance requests to Amazon.
  • Spot instances spin up and run buildslave start to connect back to the buildbot master.
  • Instances pick up test information and report results using SQS queues.
  • Python script listening on buildbot master instance logs the test results.
  • Spot instances shut themselves down when the test queue is empty.
  • Builders should complete here, but some still show as pending.
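
For reference, the second step of the flow above (the listener turning a queue message into a sendchange call) might look roughly like the sketch below. This is not our actual script: the master address, the JSON message layout with "branch" and "files" keys, and both function names are assumptions made for illustration. The SQS polling itself is left out, and depending on the master's configuration an --auth option may also be needed.

```python
import json
import subprocess

# Hypothetical master address; the real host and port depend on the setup.
MASTER = "localhost:9989"


def build_sendchange_argv(message_body, master=MASTER, who="sqs-listener"):
    """Turn one queue message into a `buildbot sendchange` command line.

    Assumes (hypothetically) that the message body is JSON with an
    optional "branch" string and a "files" list.
    """
    request = json.loads(message_body)
    argv = ["buildbot", "sendchange", "--master", master, "--who", who]
    branch = request.get("branch")
    if branch:
        argv += ["--branch", branch]
    # Remaining positional arguments are the changed file names.
    argv += list(request.get("files", []))
    return argv


def handle_message(message_body, run=subprocess.check_call):
    """Build the command and hand it to `run` (injectable for testing)."""
    argv = build_sendchange_argv(message_body)
    run(argv)
    return argv
```

Passing a different callable as run (for example a list's append method) makes it possible to check what would be sent without a running master.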

I am looking for a way to either prevent the pending builds from sticking around or, failing that, a programmatic way of clearing them out. I can't find anything useful in the buildbot documentation to help me solve the issue.
