I am dealing with a CDN that enforces a maximum request rate per minute (since all objects are about the same size, there is no separate bitrate limit).
Frankly, I do not yet know whether the limit is counted per clock minute or computed as a rolling rate.
I have a single daemon that downloads items with threads spawned on demand (as opposed to independent workers). This is the proper model for this system.
"They" suggested using an exponential backoff when the limit is hit, but that doesn't make any sense to me. The main use of exponential backoff is to resolve resource collision issues. I suppose if I had independent workers, this might make sense.
But for a single-daemon system (again, the proper use model here), why is that better than either waiting for the start of the next clock minute or using a client-side rate-regulation mechanism?
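Here is the kind of thing I mean by "a rate-regulation mechanism": a limiter shared by all of the daemon's download threads, consulted before each request (again a minimal sketch; the fixed 60-second window is an assumption, since I don't yet know how the CDN actually counts):

```python
import threading
import time


class MinuteRateLimiter:
    """Allow at most `limit` requests per 60-second window, shared by all
    download threads of the single daemon."""

    def __init__(self, limit):
        self.limit = limit
        self.lock = threading.Lock()
        self.window_start = time.monotonic()
        self.count = 0

    def acquire(self):
        """Block until a request slot is available in the current window."""
        while True:
            with self.lock:
                now = time.monotonic()
                if now - self.window_start >= 60.0:   # new window begins
                    self.window_start = now
                    self.count = 0
                if self.count < self.limit:
                    self.count += 1
                    return
                wait = 60.0 - (now - self.window_start)
            time.sleep(wait)                          # sleep until the window rolls over
```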
Is there some Knuth-style "first fit is equivalent to best fit" proof that shows exponential backoff to be a good mechanism here? It's certainly the easiest to implement!