I have to simulate load on a web application. I wrote a Python script that generates random requests following an exponential distribution. Each request is a simple URL GET: I measure the response time and store it in a file.
At a given time the code creates a new process that performs the request, then it sleeps for a random time given by random.expovariate(lambd).
When I start a request I also store a timestamp, to check whether the average inter-arrival time is close to 1/lambda.
I have a problem when I set lambda > 20: the average is higher than 1/lambda, and this results in slow execution.
I tested the random generator and it behaves well, so I think the problem arises when the system has to create a new process.
Is there a way to speed up this phase?
Perhaps there is some limit on process creation?
I forgot to say that the Python version is 2.7.3 and I can't upgrade it.
Using PyPy there are some performance improvements, but the problem persists.
Here is the code:
import random
import time
from multiprocessing import Process

import requests

url = "http://localhost:8080/"  # target URL (example value)
t = 60                          # test duration in seconds (example value)
l = 20                          # request rate lambda, requests/s (example value)

def request(results, url):
    start = time.time()
    try:
        r = requests.get(url)
    except requests.RequestException:
        pass
    else:
        # Append results (in seconds)
        results.write("{0},{1}\n".format(start, r.elapsed.total_seconds()))

def main():
    # Open results file
    results = open("responseTimes.txt", 'a')
    processes = []
    # Perform requests for time t (seconds) with rate lambda=l
    start = time.time()
    elapsed = 0
    while t > elapsed:
        p = Process(target=request, args=(results, url))
        p.daemon = True
        p.start()
        processes.append(p)
        time.sleep(random.expovariate(l))
        elapsed = time.time() - start
    # Wait for all processes to finish
    for p in processes:
        p.join()
    # Close the file
    results.close()

if __name__ == "__main__":
    main()
Analysis
It is likely that you are creating too many processes, and that slows the system down. Creating a single process usually takes well under 1/20 s (50 ms), but when many processes are alive at once the per-process creation cost can grow beyond that, which lowers the observed request rate.
The exact threshold may vary on your machine depending on factors like workload, CPU, and memory.
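One rough way to check this on your own machine (a micro-benchmark sketch; the sleep length and process counts are arbitrary choices, not measurements from the question) is to time how much it costs to create and start each new process while earlier ones are still alive:

```python
import time
from multiprocessing import Process

def dummy():
    # Keep the process alive for a while, like a slow HTTP request would
    time.sleep(0.5)

def spawn_cost(n):
    """Return the average seconds needed to create and start one of n processes."""
    procs = []
    start = time.time()
    for _ in range(n):
        p = Process(target=dummy)
        p.daemon = True
        p.start()
        procs.append(p)
    cost = (time.time() - start) / n
    for p in procs:
        p.join()
    return cost

if __name__ == "__main__":
    for n in (10, 50):
        print("{0} processes: {1:.4f} s per spawn".format(n, spawn_cost(n)))
```

If the per-spawn cost approaches or exceeds your mean inter-arrival time 1/lambda, process creation alone explains the lag you observe.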
Solution
That said, you should rather use an approach whereby you start, say, 10-20 long-lived processes (or even better: threads, as they carry less overhead), each of which issues requests with exponentially distributed pauses. With N workers each sleeping expovariate(lambda/N), the superposition of their request streams is again a Poisson process with the desired total rate lambda:
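A minimal sketch of that worker-pool approach (the pool size, the do_request callable, and the function names here are illustrative, not from the original code; do_request would wrap requests.get(url) and return the elapsed time, but is kept abstract so the sketch does not need a live server):

```python
import random
import threading
import time

def worker(do_request, rate, duration, results, lock):
    """Issue requests with exponentially distributed pauses at `rate` req/s
    for `duration` seconds, appending (timestamp, elapsed) to results."""
    end = time.time() + duration
    while time.time() < end:
        start = time.time()
        elapsed = do_request()  # e.g. a wrapper around requests.get(url)
        with lock:
            results.append((start, elapsed))
        time.sleep(random.expovariate(rate))

def run_load(do_request, lambd, duration, n_workers=10):
    """Superpose n_workers Poisson streams of rate lambd/n_workers each,
    so the combined request rate is lambd."""
    results = []
    lock = threading.Lock()
    per_worker_rate = float(lambd) / n_workers
    threads = [threading.Thread(target=worker,
                                args=(do_request, per_worker_rate,
                                      duration, results, lock))
               for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

Because each worker only has to sleep and fire its own requests, no process or thread is created per request, so the achievable rate is no longer bounded by spawn time.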