I have a simple node.js server set up that makes calls out to SimpleDB for its persistent data. When my node.js server is hammered with a lot of traffic, it ends up throwing an exception.
The exception that I see is outlined here: https://github.com/joyent/node/issues/2236
I applied the patch attached to that issue, but it only alleviated the problem; it didn't fix it completely. When I tried to debug further, the extra console.log lines I added appeared to make the problem disappear entirely. This led me to believe that it's a timing issue, which I would describe as:
- [Connection 1] comes in
- [Connection 1] is backgrounded while SimpleDB is contacted for [Data 1]
- [Connection 2] comes in
- [Connection 2] is backgrounded while SimpleDB is contacted for [Data 2]
- [more connections are accepted by node.js]
- SimpleDB responds with [Data 1], but node.js doesn't answer this call yet as there are still new connections being processed
- [more connections are accepted by node.js]
- The SimpleDB connection associated with [Data 1] dies because it hasn't been answered within a timeout.
- [Connection 1] attempts to read [Data 1] but fails, throwing an exception.
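Since adding console.log calls changed the timing, one way I could check the sequence above is to record timestamps in memory and inspect them afterwards. This is only a rough sketch: `trace`, `mark`, `traced`, and `summarize` are hypothetical names I made up, and it assumes the SimpleDB call is exposed as a Node-style function `fn(arg, callback)`, which may not match the client library actually in use.

```javascript
// Hypothetical in-memory trace: nothing is written to stdout in the hot path.
const trace = [];

// Record that request `id` reached `stage` at the current time.
function mark(id, stage) {
  trace.push({ id: id, stage: stage, at: Date.now() });
}

// Wrap a Node-style async function fn(arg, callback) so that the moment the
// call is issued and the moment its callback actually runs are both recorded.
function traced(id, fn) {
  return function (arg, callback) {
    mark(id, 'issued');
    fn(arg, function (err, result) {
      mark(id, 'answered');
      callback(err, result);
    });
  };
}

// Compute elapsed milliseconds between 'issued' and 'answered' for each id.
function summarize(entries) {
  const issued = {};
  const elapsed = {};
  entries.forEach(function (e) {
    if (e.stage === 'issued') {
      issued[e.id] = e.at;
    } else if (e.stage === 'answered' && issued[e.id] !== undefined) {
      elapsed[e.id] = e.at - issued[e.id];
    }
  });
  return elapsed;
}
```

Dumping `summarize(trace)` once the load test has finished should show whether the requests that fail are also the ones whose callbacks ran unusually late.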
When I tail -f my log, another symptom that I see is a flurry of connections accepted (15-20) and then a pause of all logging (3-5 seconds). The accepted connections all get their responses ASAP after this server pause.
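To find out whether that pause is the event loop itself stalling (rather than just the logging), I could run a small timer and measure how late it fires. The following is a minimal sketch under that assumption; `lagOf` and `createLagMonitor` are hypothetical helpers, not part of node.js, and the interval and threshold values are arbitrary.

```javascript
// How many milliseconds late a timer fired, never negative.
function lagOf(expectedAt, firedAt) {
  return Math.max(0, firedAt - expectedAt);
}

// Hypothetical monitor: schedules a repeating timer and calls onLag(ms)
// whenever the timer fires more than thresholdMs later than scheduled.
function createLagMonitor(intervalMs, thresholdMs, onLag) {
  let expectedAt = Date.now() + intervalMs;
  const timer = setInterval(function () {
    const now = Date.now();
    const lag = lagOf(expectedAt, now);
    if (lag > thresholdMs) {
      onLag(lag);
    }
    expectedAt = now + intervalMs;
  }, intervalMs);
  // Do not keep the process alive just for the monitor, where supported.
  if (typeof timer.unref === 'function') {
    timer.unref();
  }
  return {
    stop: function () { clearInterval(timer); }
  };
}

// Example usage (values are arbitrary):
// const lags = [];
// const monitor = createLagMonitor(100, 500, function (ms) { lags.push(ms); });
```

If the recorded lag values line up with the 3-5 second gaps in the log, something is blocking the event loop during that window; if they don't, the pause is more likely on the network or SimpleDB side.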
I'm really lost for explanations here, so if anyone can help, that would be great. If you need more information, I can provide it, though I feel like I've provided too much already and may have obscured the actual issue (whatever it may be).