I'm getting results I can't explain when making a ZeroMQ REQ/REP request with polling.
The elapsed time, taken by the poll()-method, is influenced by a time.sleep() elsewhere in the code.
I have a server process which runs a command, and returns the pid of the process. This is the request side code which has the issue:
import datetime, sys, time, zmq
sleeptime = float(sys.argv[1])
def log_msg(msg):
print "{0}, {1}".format(datetime.datetime.utcnow().strftime('%Y%m%d_%H:%M:%S.%f'), msg)
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://{0}:{1}".format('myserver', 9980))
poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)
req = { 'req_type':'raw_cmd', 'cmd':'echo hello', 'block':0, 'timeout':300, 'return_output': 0}
for i in range(4):
time.sleep(sleeptime)
socket.send_json(req)
start = time.time()
socks = dict((poller.poll(30000)))
elapsed = (time.time()-start)*1000
rep = socket.recv_json()
log_msg('pid={0}, sleep={1}, time taken={2}'.format(rep['pid'], sleeptime, int(elapsed)))
The first poll elapsed time is quite variable, but all the subsequent ones take about 2 seconds less the sleep time, unless sleep time is 0, in which case it's fast, so:
sleep = 0.5 :
20201008_08:27:24.168800, pid=52528, sleep=0.5, time taken=505 20201008_08:27:26.210196, pid=52529, sleep=0.5, time taken=1540 20201008_08:27:28.250891, pid=52530, sleep=0.5, time taken=1539 20201008_08:27:30.295036, pid=52531, sleep=0.5, time taken=1543
sleep = 1.5 :
20201008_08:44:02.474492, pid=54730, sleep=1.5, time taken=295 20201008_08:44:04.516844, pid=54731, sleep=1.5, time taken=540 20201008_08:44:06.557980, pid=54732, sleep=1.5, time taken=539 20201008_08:44:08.599717, pid=54733, sleep=1.5, time taken=539
sleep = 0 :
20201008_08:27:13.999147, pid=52513, sleep=0.0, time taken=690 20201008_08:27:14.033915, pid=52514, sleep=0.0, time taken=34 20201008_08:27:14.068803, pid=52515, sleep=0.0, time taken=34 20201008_08:27:14.103947, pid=52516, sleep=0.0, time taken=34
So question is, why does the time.sleep() affect the time taken by the ZeroMQ poller.poll()?
The code as-is mixes apples with oranges:
time.time()returns seconds since epoch asfloat, i.e.( time.time() - start )*1000yields how many[ms]milliseconds the code-section under test actually took.Whereas the
poller.poll()-method expects milliseconds for it'stimeout, so the requested30000 [ms]makes thepoller-instance wait and hold until30 [s]elapse or a message-arrival relatedzmq.POLLIN-event permits to step forwards anywhere earlier.This said, your measurement explores not the
.sleep()-method, but intervals taken at random by an external process, that sendsREP-side answers toREQ-side.send()-s.Still in doubts? Set sleeptime to
31 [s]and you ought receive almost zero durations reported from an ill-formulated local time-bracket (sure, if and only if the remoteREP-lying entity did not crash of suffocate during such ops).You might also like
zmq.Stopwatchclass, having{ .start(), .stop() }-methods, that provide a smart tool for using a ZeroMQ native[us]-resolution-clock. Most of my posts on ZeroMQ performance use this for microbenchmarking.