Techniques/algorithms used in WAN optimization


What are the techniques/algorithms used in WAN optimization? I am looking for a reference that gives good theory supported by code examples. I have looked at the Steelhead manual from Riverbed and found the following main techniques:

  • SDR (Scalable Data Referencing): breaks up TCP data into unique data chunks, each with a reference number. When the same byte sequence occurs in a future transmission, only the reference number is sent across the WAN instead of the raw data chunk.

  • Connection pooling: the product keeps pools of idle TCP connections (for HTTP, for example). When a client opens a new connection to a previously visited target, it uses one from the pool, which avoids the TCP three-way handshake.

  • Protocol optimization: the product reduces the number of round trips over the WAN for common actions (opening/editing remote shared files/folders). It supports most of the relevant protocols: CIFS, MAPI, HTTP, etc.

  • Data compression.

Through my search I found 3 open source projects that aim to do WAN optimization; these are:

TrafficSqueezer seems to have more features, but the comments on its SourceForge page do not give a good sense of it. I tried to find a document within these projects with good information, but I couldn't.

There is 1 best solution below


The techniques that reduce the amount of traffic the most are, of course, compression and data deduplication. With deduplication, both WAN optimisers build up the same data store, using the same algorithm, in memory or on HDD. As soon as the same traffic pattern appears again, the pattern is replaced with a pointer to the data and a length. You can therefore save up to 99% when you transfer the same file twice, and even different files have a lot of data in common, where deduplication can save a lot. You will find many sources on the web, e.g. http://www.computerweekly.com/feature/How-data-deduplication-works. In your example this technique is called SDR.
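To make the pointer idea concrete, here is a minimal sketch in Python. It is not Riverbed's SDR; it assumes fixed-size chunks and SHA-256 digests as references, and the names `DedupSender`, `DedupReceiver`, `wire_size` and `CHUNK_SIZE` are made up for illustration. Real products typically use more elaborate chunking and persist the store to disk.

```python
import hashlib

CHUNK_SIZE = 1024  # hypothetical fixed chunk size, chosen only for this sketch


class DedupSender:
    """WAN optimiser on the sending side: replaces known chunks with references."""

    def __init__(self):
        self.seen = set()  # digests of chunks the peer already holds

    def encode(self, data: bytes):
        """Return a list of ("raw", chunk) or ("ref", digest) tokens."""
        tokens = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).digest()
            if digest in self.seen:
                tokens.append(("ref", digest))   # send only the 32-byte reference
            else:
                self.seen.add(digest)
                tokens.append(("raw", chunk))    # first sighting: send the bytes
        return tokens


class DedupReceiver:
    """WAN optimiser on the receiving side: builds the same store and expands refs."""

    def __init__(self):
        self.store = {}  # digest -> chunk

    def decode(self, tokens) -> bytes:
        out = bytearray()
        for kind, value in tokens:
            if kind == "raw":
                self.store[hashlib.sha256(value).digest()] = value
                out += value
            else:
                out += self.store[value]
        return bytes(out)


def wire_size(tokens) -> int:
    """Payload bytes that would cross the WAN (ignores token framing overhead)."""
    return sum(len(value) for _, value in tokens)
```

Sending the same payload through `encode` a second time yields only `"ref"` tokens, which is where the large savings on repeated transfers come from.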

Riverbed also has a lot of protocol support, which makes e.g. CIFS/SMB and MAPI more delay-aware (e.g. many packets are buffered and sent at once, which saves round trips). F5 also does e.g. FTP and HTTP optimisations to make those perform better.
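A back-of-the-envelope model shows why saving round trips matters so much on a high-latency link. This is only a sketch: `round_trip_wait` is a made-up helper, and it deliberately ignores bandwidth, server time and packet loss.

```python
import math


def round_trip_wait(num_requests: int, rtt_s: float, batch_size: int = 1) -> float:
    """Seconds spent waiting on round trips when requests go out batch_size at a time."""
    round_trips = math.ceil(num_requests / batch_size)
    return round_trips * rtt_s


# A chatty protocol that waits for each reply before sending the next request:
chatty = round_trip_wait(1000, rtt_s=0.1)
# The same 1000 requests, buffered and sent 50 at a time:
batched = round_trip_wait(1000, rtt_s=0.1, batch_size=50)
```

With these assumed numbers the chatty exchange waits on 1000 round trips and the batched one on 20, so the wait drops by a factor of 50 without changing the amount of data sent.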

When there is a lot of delay on the WAN link, you can of course also save time with connection pooling, i.e. pre-established TCP sessions: you save the time that would otherwise be needed for the TCP three-way handshake.
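A rough sketch of such a pool in Python follows. `ConnectionPool` and its methods are hypothetical names, not any vendor's API; the only real library call is `socket.create_connection`, and the `connect` parameter exists so the factory can be swapped out. A production pool would also need liveness checks and idle timeouts.

```python
import socket
from collections import defaultdict, deque


class ConnectionPool:
    """Keeps idle connections per (host, port) so requests can skip the handshake."""

    def __init__(self, connect=socket.create_connection):
        self._connect = connect           # factory taking an (host, port) tuple
        self._idle = defaultdict(deque)   # (host, port) -> idle connections

    def warm(self, host, port, count):
        """Pre-establish `count` connections before any client asks for one."""
        for _ in range(count):
            self._idle[(host, port)].append(self._connect((host, port)))

    def acquire(self, host, port):
        idle = self._idle[(host, port)]
        if idle:
            return idle.popleft()               # reuse: no new three-way handshake
        return self._connect((host, port))      # cold path: pays the handshake

    def release(self, host, port, conn):
        """Hand a connection back instead of closing it."""
        self._idle[(host, port)].append(conn)
```

The saving per reused connection is roughly one round trip, so it is most visible for many short-lived connections over a long-delay link.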

So, at a glance:

  • data deduplication
  • connection pooling
  • compression
  • protocol optimisation
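Of these, compression is the easiest to try out with a standard library. The sketch below assumes DEFLATE via Python's `zlib` with one long-lived stream per direction, so later messages can refer back to earlier ones; `CompressedLink` is a made-up name, and in a real deployment the compressor and decompressor would sit on different appliances.

```python
import zlib


class CompressedLink:
    """One direction of a link that keeps compression history across messages."""

    def __init__(self):
        self._tx = zlib.compressobj()     # would live on the sending appliance
        self._rx = zlib.decompressobj()   # would live on the receiving appliance

    def send(self, message: bytes) -> bytes:
        # Z_SYNC_FLUSH emits everything buffered so far but keeps the history
        # window, so repeated content in later messages compresses very well.
        return self._tx.compress(message) + self._tx.flush(zlib.Z_SYNC_FLUSH)

    def receive(self, wire: bytes) -> bytes:
        return self._rx.decompress(wire)
```

Because the stream is not reset between messages, sending the same request a second time produces far fewer bytes on the wire than the first time. The history window is limited (32 KB for DEFLATE), which is why deduplication against a large store is used alongside it.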

I am sure you can find a lot in the F5 documentation (F5 WOM is the product). Blue Coat offers WAN optimisation as well, and of course Riverbed. Silver Peak might also be worth a try. For the open source ones, I only have experience with TrafficSqueezer, and so far it has not had a feature set comparable to the commercial products.