I'm trying to debug a very simple XML-RPC client/server. The server runs locally on a Mac running macOS Sonoma. Here's the server code:
from xmlrpc.server import SimpleXMLRPCServer
import time

def endpoint():
    time.sleep(0.02)

with SimpleXMLRPCServer(("0.0.0.0", 8886), allow_none=True) as server:
    server.register_function(endpoint)
    server.serve_forever()
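As a first isolation step (not part of my original setup), a self-contained sketch that runs the same server in a background thread and times one call from the same process; if this stays around 20 ms, the server itself is fine and any stall must be on the container's network path. Binding to port 0 just asks the OS for a free port:

```python
import threading
import time
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def endpoint():
    time.sleep(0.02)

# Bind to port 0 so the OS picks a free port for this sanity check.
server = SimpleXMLRPCServer(("127.0.0.1", 0), allow_none=True, logRequests=False)
server.register_function(endpoint)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}/", allow_none=True)
start = time.monotonic()
proxy.endpoint()
elapsed = time.monotonic() - start
print(f"local round trip: {elapsed:.3f}s")
server.shutdown()
```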
The client runs inside a Docker container (image ubuntu:22.04) on the same host as the server (the same behaviour was observed with the python:3.9 image):
import xmlrpc.client
import cProfile
import time

SERVER_PORT = 8886
SERVER_IP = "<IP>"

transp = xmlrpc.client.Transport()
transp.timeout = 1
proxy = xmlrpc.client.ServerProxy(f"http://{SERVER_IP}:{SERVER_PORT}/", allow_none=True, transport=transp)

def read(proxy):
    times_stuck = []
    deadline = time.monotonic() + 60 * 15
    while time.monotonic() < deadline:
        profiler = cProfile.Profile()
        s_time = time.monotonic()
        profiler.enable()
        proxy.endpoint()
        profiler.disable()
        e_time = time.monotonic()
        print("[STUCK: {}] [{:.2f}s] endpoint".format(len(times_stuck), e_time - s_time))
        if e_time - s_time > 1.0:
            print("\n>> GOT STUCK: {:.2f}s\n".format(e_time - s_time))
            profiler.print_stats(sort='cumulative')
            times_stuck.append(e_time - s_time)
        time.sleep(0.5)
    return times_stuck
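One thing worth noting (an assumption on my side, from reading CPython's `xmlrpc.client` source): the stock `Transport.make_connection` builds its `HTTPConnection` without forwarding any timeout, so `transp.timeout = 1` above appears to be silently ignored, which would explain why a call can block for 60+ seconds anyway. A minimal subclass that does forward it:

```python
import http.client
import xmlrpc.client

class TimeoutTransport(xmlrpc.client.Transport):
    """Transport that actually applies a socket timeout to each connection."""

    def __init__(self, timeout=1.0, **kwargs):
        super().__init__(**kwargs)
        self.timeout = timeout

    def make_connection(self, host):
        # Mirror the base implementation, but pass the timeout through so
        # connect()/recv() raise socket.timeout instead of blocking.
        if self._connection and host == self._connection[0]:
            return self._connection[1]
        chost, self._extra_headers, x509 = self.get_host_info(host)
        self._connection = host, http.client.HTTPConnection(chost, timeout=self.timeout)
        return self._connection[1]

transp = TimeoutTransport(timeout=1.0)
conn = transp.make_connection("127.0.0.1:8886")
print(conn.timeout)  # the HTTPConnection now carries the 1-second timeout
```

With this transport, the stuck calls should fail fast with a timeout error instead of hanging, which at least bounds the damage while the root cause is unknown.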
If I run the exact same client code from a regular virtual environment on the host, I cannot reproduce the hang. The profiler shows an unusually long wait on the socket connect:
[STUCK: 5] [0.03s] endpoint
[STUCK: 5] [64.32s] endpoint

>> GOT STUCK: 64.32s

         450 function calls in 64.316 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   64.316   64.316 client.py:1111(__call__)
        1    0.000    0.000   64.316   64.316 client.py:1442(__request)
        1    0.000    0.000   64.316   64.316 client.py:1150(request)
        1    0.000    0.000   64.316   64.316 client.py:1163(single_request)
        1    0.000    0.000   64.289   64.289 client.py:1266(send_request)
        1    0.000    0.000   64.289   64.289 client.py:1300(send_content)
        1    0.000    0.000   64.289   64.289 client.py:1265(endheaders)
        1    0.000    0.000   64.289   64.289 client.py:1027(_send_output)
        2    0.000    0.000   64.289   32.144 client.py:968(send)
        1    0.000    0.000   64.289   64.289 client.py:945(connect)
        1    0.000    0.000   64.289   64.289 socket.py:691(create_connection)
        1   64.289   64.289   64.289   64.289 {method 'connect' of '_socket.socket' objects}
        1    0.000    0.000    0.027    0.027 client.py:1329(getresponse)
        1    0.000    0.000    0.027    0.027 client.py:312(begin)
        1    0.000    0.000    0.026    0.026 client.py:279(_read_status)
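To take xmlrpc out of the picture entirely, here's a bare TCP connect timer (host and port are placeholders for the setup above); if the same multi-second stalls show up here, the problem is in the container's TCP path rather than anywhere in Python's HTTP stack:

```python
import socket
import time

def time_connect(host, port, timeout=120.0):
    """Time how long a bare TCP three-way handshake takes."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connected; close immediately
    return time.monotonic() - start

# e.g. run time_connect("<IP>", 8886) in a loop from inside the container
```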
I noticed that sometimes it gets stuck for over 60 seconds, sometimes 30, and other times 15 seconds.
net.ipv4.tcp_fin_timeout in the container is the default (60), so I think something is causing it to hit this timeout.
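A quick arithmetic check (my speculation, not something I've confirmed): with Linux's initial SYN retransmission timeout of roughly 1 s doubling on each retry, the cumulative time spent retrying a lost SYN works out to 1, 3, 7, 15, 31, 63 seconds, which lines up with the observed ~15 s / ~30 s / ~64 s stalls rather closely; so dropped SYN packets (e.g. a conntrack/NAT hiccup on the Docker bridge) may fit the numbers better than tcp_fin_timeout, where only the 60 s case would match.

```python
# Cumulative wait after k SYN retransmissions, assuming an initial RTO of ~1 s
# that doubles on each retry (net.ipv4.tcp_syn_retries defaults to 6 on Linux).
cumulative = [sum(2 ** i for i in range(k + 1)) for k in range(6)]
print(cumulative)  # [1, 3, 7, 15, 31, 63]
```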
I tried updating the Docker engine, but no luck. I also tried starting the container with different networking modes, but still nothing.
Does anyone have any ideas as to what might be causing this?