I am running some performance measurements across different network settings using iperf, and I see a drastic difference between two basic setups:
- Two Docker containers connected to each other via the default docker0 bridge interface on the host.
- Two containers connected via a VPN tunnel interface that is itself carried over the same docker0 bridge.
Below are the iperf results for both scenarios over 10-second runs.
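The tests were run more or less like this (a sketch; the interval and duration flags are inferred from the 1-second reports and 10-second totals below):

```
# server side, inside the receiving container
iperf -s

# client side, inside the sending container
# (scenario one shown; scenario two targets the VPN address 10.23.0.2 instead)
iperf -c 172.17.0.4 -t 10 -i 1
```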
**Scenario One**
Client connecting to 172.17.0.4, TCP port 5001
TCP window size: 1.12 MByte (default)
------------------------------------------------------------
[ 3] local 172.17.0.2 port 50728 connected with 172.17.0.4 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 3.26 GBytes 28.0 Gbits/sec
[ 3] 1.0- 2.0 sec 3.67 GBytes 31.5 Gbits/sec
[ 3] 2.0- 3.0 sec 3.70 GBytes 31.8 Gbits/sec
[ 3] 3.0- 4.0 sec 3.93 GBytes 33.7 Gbits/sec
[ 3] 4.0- 5.0 sec 3.34 GBytes 28.7 Gbits/sec
[ 3] 5.0- 6.0 sec 3.44 GBytes 29.6 Gbits/sec
[ 3] 6.0- 7.0 sec 3.55 GBytes 30.5 Gbits/sec
[ 3] 7.0- 8.0 sec 3.50 GBytes 30.0 Gbits/sec
[ 3] 8.0- 9.0 sec 3.41 GBytes 29.3 Gbits/sec
[ 3] 9.0-10.0 sec 3.20 GBytes 27.5 Gbits/sec
[ 3] 0.0-10.0 sec 35.0 GBytes 30.1 Gbits/sec
**Scenario Two**
Client connecting to 10.23.0.2, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.12.0.2 port 41886 connected with 10.23.0.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 15.1 MBytes 127 Mbits/sec
[ 3] 1.0- 2.0 sec 14.9 MBytes 125 Mbits/sec
[ 3] 2.0- 3.0 sec 14.9 MBytes 125 Mbits/sec
[ 3] 3.0- 4.0 sec 14.2 MBytes 120 Mbits/sec
[ 3] 4.0- 5.0 sec 16.4 MBytes 137 Mbits/sec
[ 3] 5.0- 6.0 sec 18.0 MBytes 151 Mbits/sec
[ 3] 6.0- 7.0 sec 18.6 MBytes 156 Mbits/sec
[ 3] 7.0- 8.0 sec 16.4 MBytes 137 Mbits/sec
[ 3] 8.0- 9.0 sec 13.5 MBytes 113 Mbits/sec
[ 3] 9.0-10.0 sec 15.0 MBytes 126 Mbits/sec
[ 3] 0.0-10.0 sec 157 MBytes 132 Mbits/sec
I am confused by the large difference in throughput.
Is the degradation due to the encryption and decryption work done by OpenSSL?
Or is it because packet headers have to be marshalled and unmarshalled below the application layer more than once when traffic is routed through the VPN tunnel?
Thank You
Shabir
The two tests did not run under equal conditions: the first ran with a TCP window of 1.12 MByte, while the second, slower test ran with a window of only 0.085 MByte (85.0 KByte).
It's therefore quite possible that you're running into TCP window exhaustion, both because of the smaller buffer and because of the somewhat higher latency through the VPN stack.
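One quick way to test that hypothesis would be to pin the window yourself with iperf's -w option on both ends and see whether the VPN numbers climb (the 4M value below is just an illustrative choice):

```
# server side (set the window before starting the listener)
iperf -s -w 4M

# client side, over the VPN address
# (if the kernel clamps the buffer, raise net.core.rmem_max / net.core.wmem_max first)
iperf -c 10.23.0.2 -w 4M -t 10 -i 1
```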
In order to know what buffer size to use (if not just a huge buffer), you need to know your bandwidth-delay product.
I don't know what your original channel's RTT is, but we can take a stab at it. You were able to get ~30 Gbit/sec over the bridge with a buffer size of 1.12 MBytes, so doing the math backwards (unit conversions elided), we get:
1.12 megabytes / 30 gigabits/sec --> 0.3 ms
That seems reasonable. Now let's assume the VPN path has double the RTT of the original link, i.e. a latency of 0.6 ms, and run the bandwidth-delay product forwards with your new window size of 0.085 MByte:

0.085 MBytes / 0.6 ms --> ~1.1 Gbits/sec

So even under that generous assumption, the smaller window by itself caps you well below scenario one. Your measured 132 Mbits/sec corresponds to an effective RTT of roughly 5 ms (0.085 MBytes / 132 Mbits/sec), which would not be surprising for a tunnel that has to encrypt every packet and make extra passes through the network stack. Either way, the small window combined with the added latency of the VPN is enough to explain the performance you're seeing.
If, for example, you wanted to saturate a 100 Gbit/sec pipe with an RTT of 0.6 ms, you would need a buffer size of 7.5 MBytes. Alternatively, if you wanted to saturate the pipe not with a single connection but with N connections, you'd need N sockets, each with a send buffer of 7.5/N MBytes.
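If you want to redo the arithmetic, here are the same bandwidth-delay-product numbers as one-liners, using awk purely as a calculator (0.6 ms is the assumed RTT from above; 132 Mbit/sec is your measured VPN throughput):

```
# implied RTT of the bridge path: window / bandwidth
awk 'BEGIN { printf "%.2f ms\n", 1.12e6*8 / 30e9 * 1000 }'         # ~0.30 ms

# ceiling imposed by an 85 KByte window at an assumed 0.6 ms RTT
awk 'BEGIN { printf "%.0f Mbit/s\n", 0.085e6*8 / 0.6e-3 / 1e6 }'   # ~1133 Mbit/s

# effective RTT implied by the measured 132 Mbit/sec over the VPN
awk 'BEGIN { printf "%.1f ms\n", 0.085e6*8 / 132e6 * 1000 }'       # ~5.2 ms

# window needed to fill a 100 Gbit/sec pipe at 0.6 ms RTT
awk 'BEGIN { printf "%.1f MBytes\n", 100e9 * 0.6e-3 / 8 / 1e6 }'   # 7.5 MBytes
```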