While putting together a test suite using netperf, I was going through the manual and came across the following line for the -D option:
If setting TCP_NODELAY with -D affects throughput and/or service demand for tests where the send size (-m) is larger than the MSS it suggests the TCP/IP stack’s implementation of the Nagle Algorithm may be broken, perhaps interpreting the Nagle Algorithm on a segment by segment basis rather than the proper user send by user send basis
My question is: how can setting TCP_NODELAY, and its effect on throughput, determine whether the Nagle implementation is broken? Can someone give me a logical explanation for this?
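For context, the -D test-specific option just asks the stack to disable Nagle by setting the TCP_NODELAY socket option on the data connection. A minimal C sketch of that call (assuming fd is an already-connected TCP socket):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <stdio.h>
    #include <sys/socket.h>

    /* Disable Nagle (set TCP_NODELAY) on a connected TCP socket. */
    static int disable_nagle(int fd)
    {
        int one = 1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0) {
            perror("setsockopt(TCP_NODELAY)");
            return -1;
        }
        return 0;
    }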
In broad terms Nagle goes something like:
Is the amount of data in this send, added to any data queued waiting to be sent, larger than the Maximum Segment Size (MSS) of this connection? If the answer is yes, then send the data now, so long as there are no other constraints on sending such as the congestion window or the receiver’s advertised window. If the answer is no, then consider the next step:
Is the connection “idle”? In other words, is there no un-ACKnowledged data out on the network? If so, then send the data in the user’s send now. If the connection is not idle, queue the data and wait. We know that one or more of three things will happen to cause the data to be sent (see the sketch after this list):
The application will continue making send calls and we accumulate an MSS’s worth of data to send.
An ACKnowledgement will arrive from the remote for the currently un-ACKed data, making the connection “idle” again.
The retransmission timer for the un-ACKed data will expire and we might piggy-back this data with the retransmission.
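To make that concrete, here is a rough, self-contained C sketch of that per-user-send decision. All the names (bytes_queued, mss, unacked_data, and so on) are illustrative stand-ins, not any real stack’s internals:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Illustrative stand-ins; in a real stack this state lives in the kernel. */
    static size_t bytes_queued = 0;      /* data queued but not yet sent        */
    static size_t mss = 1460;            /* a typical Ethernet MSS, for example */
    static bool   unacked_data = false;  /* is anything un-ACKed on the wire?   */

    static void transmit_now(size_t n)   { printf("transmit %zu bytes now\n", n); bytes_queued = 0; }
    static void queue_and_wait(size_t n) { printf("queue %zu bytes and wait\n", n); bytes_queued += n; }

    /* The Nagle check, applied once per user send as described above. */
    static void nagle_on_user_send(size_t bytes_in_send)
    {
        if (bytes_in_send + bytes_queued >= mss) {
            /* A full segment's worth is available: send now, subject only to
             * the congestion window and the receiver's advertised window.    */
            transmit_now(bytes_in_send + bytes_queued);
        } else if (!unacked_data) {
            /* Connection is "idle" (nothing un-ACKed): send the small write. */
            transmit_now(bytes_in_send + bytes_queued);
        } else {
            /* Queue and wait for more data, an ACK, or the retransmit timer. */
            queue_and_wait(bytes_in_send);
        }
    }

    int main(void)
    {
        nagle_on_user_send(4096);  /* larger than the MSS: goes out immediately   */
        unacked_data = true;       /* pretend that data is now awaiting an ACK    */
        nagle_on_user_send(100);   /* small write on a busy connection: is queued */
        return 0;
    }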
The main point is that Nagle is supposed to be evaluated on a user-send by user-send basis. A send larger than the MSS passes the first test and should go out immediately, so disabling Nagle ought to make no difference for it. If disabling it (which is what setting TCP_NODELAY via -D does) nonetheless yields higher throughput with a send size greater than the MSS, it implies that stack is applying the check TCP segment by TCP segment instead, delaying the sub-MSS tail of each large send until the earlier segments have been ACKed.
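For contrast, here is an equally hypothetical sketch of the segment-by-segment interpretation, run on a single 4096-byte send with a 1460-byte MSS. Even starting from an idle connection, the two full segments go out, but the 1176-byte tail fails the “full MSS queued?” test and stalls until an ACK returns; that stall is exactly the throughput loss that setting TCP_NODELAY makes disappear:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    static size_t mss = 1460;            /* illustrative MSS                  */
    static bool   unacked_data = false;  /* the connection starts out "idle"  */

    static void emit_segment(size_t n) { printf("emit a %zu-byte segment\n", n); }

    /* The broken interpretation: the Nagle test is re-run for every
     * prospective TCP segment instead of once per user send.           */
    static void broken_nagle_drain(size_t bytes_queued)
    {
        while (bytes_queued > 0) {
            if (bytes_queued >= mss) {
                emit_segment(mss);            /* full segments still flow       */
                bytes_queued -= mss;
                unacked_data = true;          /* now there is data on the wire  */
            } else if (!unacked_data) {
                emit_segment(bytes_queued);   /* idle connection: tail goes out */
                bytes_queued = 0;
            } else {
                /* The sub-MSS tail of a large send stalls here, waiting for an
                 * ACK -- this is the delay that disabling Nagle removes.      */
                printf("hold the remaining %zu bytes until an ACK arrives\n",
                       bytes_queued);
                break;
            }
        }
    }

    int main(void)
    {
        broken_nagle_drain(4096);  /* one user send larger than the MSS */
        return 0;
    }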