Beanstalkc Timeout Question


I am using beanstalkc in Python to queue a list of URLs that my program has to parse. I set a timeout (TTR) on each job so that no single URL can consume too much time. Even so, the process does not time out within the limit, and parsing even a few URLs takes a very long time. I am using the following code:

for seed in seedlist:
    print 'Put data: %s' % seed
    bean.put(seed, ttr=5)
while True:
    job = bean.reserve()
    spider.spider(job.body)
    print 'Got data: %s' % job.body

There is 1 best solution below


I think you are misunderstanding the purpose of beanstalkd's TTR timeouts. Quoting the beanstalkd FAQ:

How does TTR work

TTR only applies to a job at the moment it becomes reserved. At that event, a timer (called “time-left” in the job stats) starts counting down from the job’s TTR.

  • If the timer reaches zero, the job gets put back in the ready queue.
  • If the job is buried, deleted, or released before the timer runs out, the timer ceases to exist.
  • If the job is "touch"ed before the timer reaches zero, the timer starts over counting down from TTR.

(The job stats of a job that isn’t reserved still contain a “time-left” entry, but its value is meaningless.)
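To make the rules quoted above concrete, here is a toy model of a single job's TTR timer. It is only an illustration of the FAQ text, not beanstalkd's actual implementation; the class `ToyJob` and its `tick` method are hypothetical names invented for this sketch.

```python
class ToyJob(object):
    """Toy model of one job's TTR timer (hypothetical, for illustration only)."""

    def __init__(self, ttr):
        self.ttr = ttr
        self.state = 'ready'
        self.time_left = None  # only meaningful while the job is reserved

    def reserve(self):
        # The timer starts counting down from TTR at the moment of reservation.
        self.state = 'reserved'
        self.time_left = self.ttr

    def touch(self):
        # Touching a reserved job restarts the countdown from TTR.
        if self.state == 'reserved':
            self.time_left = self.ttr

    def delete(self):
        # Deleting the job makes the timer cease to exist.
        self.state = 'deleted'
        self.time_left = None

    def tick(self, seconds):
        """Advance the simulated clock by `seconds`."""
        if self.state != 'reserved':
            return  # TTR has no effect on a job that is not reserved
        self.time_left -= seconds
        if self.time_left <= 0:
            # Timer expired: the job goes back to the ready queue.
            # Nothing happens to the worker that was processing it.
            self.state = 'ready'
            self.time_left = None
```

Note that nothing in this model ever touches the worker: when the timer expires, the only thing that changes is the state of the job.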

So the TTR does not help you in "avoiding huge time consumption by any URL". It does not magically kill your worker processes. All beanstalkd does is put a job back into the ready queue if the worker that reserved it has not marked it as finished (by deleting, burying, or releasing it) within the given timespan (the TTR).
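If you want a hard per-URL limit, the worker has to enforce it itself. One possible approach, sketched below, is a Unix alarm signal that interrupts the handler, after which the job is deleted or buried explicitly. This is a minimal sketch under several assumptions: it works only on Unix and only in the main thread, and `JobTimeout`, `run_with_limit` and `work` are hypothetical helpers, not part of beanstalkc. The `queue` argument is assumed to behave like a `beanstalkc.Connection` (its `reserve()` returns a job with `body`, `delete()` and `bury()`).

```python
import signal
import time


class JobTimeout(Exception):
    """Raised inside the worker when a job exceeds its time budget."""


def _on_alarm(signum, frame):
    raise JobTimeout()


def run_with_limit(func, arg, seconds):
    """Call func(arg); raise JobTimeout if it runs longer than `seconds`.

    Assumption: Unix only, and must be called from the main thread.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func(arg)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)


def work(queue, handler, limit, max_jobs):
    """Process up to max_jobs jobs; return the bodies that timed out."""
    timed_out = []
    for _ in range(max_jobs):
        job = queue.reserve()
        try:
            run_with_limit(handler, job.body, limit)
            job.delete()  # finished: tell beanstalkd the job is done
        except JobTimeout:
            timed_out.append(job.body)
            job.bury()  # too slow: set it aside instead of retrying forever
    return timed_out
```

With the code from the question, `queue` would be the `bean` connection and `handler` would be `spider.spider`. The TTR passed to `put` should then be somewhat larger than `limit`, so that beanstalkd does not hand the job to another worker while this one is still inside its own time budget.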