Load-balancing algorithms control how traffic is distributed among the available servers in a server farm or the caches in a cache farm. This involves two steps: first, selecting the correct server farm, and the right server within it, using the destination and source IP addresses, protocol, and port number; second, dispersing the traffic load across the available real servers based on factors such as current connection load, server response time, and available memory. Common load-balancing algorithms include Round-robin, Weighted round-robin, Least connections, Weighted least connections, Faster predictor, Source IP, and Hash address.
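
The two steps can be illustrated with a minimal Python sketch. The farm table, addresses, server names, and load figures are hypothetical, the lookup uses only the destination address, protocol, and port for brevity, and the second step simply picks the server with the fewest connections as one possible load measure.

```python
# Hypothetical mapping from (destination IP, protocol, port) to a server farm.
FARMS = {
    ("192.0.2.10", "tcp", 80): ["web1", "web2"],
    ("192.0.2.10", "tcp", 443): ["tls1", "tls2"],
}

def select_farm(dst_ip, protocol, dst_port):
    """Step 1: map the connection's destination to a server farm."""
    return FARMS[(dst_ip, protocol, dst_port)]

def select_server(farm, load):
    """Step 2: choose a real server from load data (here, fewest connections)."""
    return min(farm, key=lambda s: load[s])

# Assumed current connection counts per real server.
load = {"web1": 12, "web2": 5, "tls1": 3, "tls2": 9}

farm = select_farm("192.0.2.10", "tcp", 80)
server = select_server(farm, load)
print(farm, server)
```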

In the Round-robin load-balancing algorithm, servers are selected sequentially, and the distribution is based solely on which servers are available and working. When a request arrives, the next available server in line is selected and the connection request is forwarded to it. Weighted round-robin extends Round-robin with a weight system that alters the load distribution. Servers with a higher weight receive more connections, which helps balance the load across the server farm when some servers process requests faster or slower than others.
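
As a rough illustration of both variants, the following Python sketch rotates through a list and skips servers marked unavailable. The class name, server names, and the way weights are applied (repeating a server in the rotation once per unit of weight) are assumptions made for this example; real load balancers may interleave weighted servers differently.

```python
class RoundRobin:
    """Pick servers in a fixed rotation, skipping unavailable ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.available = set(self.servers)
        self.index = 0

    def pick(self):
        # Try each slot at most once, starting from the current position.
        for _ in range(len(self.servers)):
            server = self.servers[self.index]
            self.index = (self.index + 1) % len(self.servers)
            if server in self.available:
                return server
        raise RuntimeError("no server available")

def weighted_round_robin(weights):
    """Build a rotation in which each server appears `weight` times."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return RoundRobin(expanded)

rr = RoundRobin(["s1", "s2", "s3"])
first_four = [rr.pick() for _ in range(4)]   # s1, s2, s3, s1

wrr = weighted_round_robin({"big": 3, "small": 1})
eight = [wrr.pick() for _ in range(8)]       # "big" is chosen 6 times, "small" twice
```

With weights of 3 and 1, the heavier server receives three of every four connections.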

The Least connections load-balancing algorithm uses the number of concurrent connections on each real server to determine which server receives the next request. The load balancer searches the connection table, selects the server with the fewest connections, forwards the request, and updates the connection table once the connection has been established. The Weighted Least Connections predictor is similar to Least connections, except that each server is also assigned a weight. The purpose of the weights is to direct more load to servers that are capable of handling more connections.
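
The selection step might be sketched as follows in Python. The connection table, server names, and weights are hypothetical, and dividing the connection count by the weight is only one plausible way to combine the two; the text above does not specify the exact formula.

```python
def least_connections(table):
    """Return the server with the fewest concurrent connections."""
    return min(table, key=table.get)

def weighted_least_connections(table, weights):
    """Return the server with the lowest connections-to-weight ratio
    (an assumed way of applying weights)."""
    return min(table, key=lambda s: table[s] / weights[s])

def forward(table, server):
    """Update the connection table after the connection is established."""
    table[server] += 1

table = {"s1": 4, "s2": 2, "s3": 7}
weights = {"s1": 4, "s2": 1, "s3": 2}

plain = least_connections(table)                       # s2 has the fewest connections
weighted = weighted_least_connections(table, weights)  # s1: 4/4 beats 2/1 and 7/2
forward(table, plain)
```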

The Faster predictor load-balancing algorithm selects the next real server based on the fastest response time, which is the time difference between the SYN segment and the SYN/ACK segment, measured over all successfully established connections. Therefore, unlike Round-robin, whose server selection follows a fixed sequence, the Faster predictor follows no fixed order: its choice depends on measured response times and can appear random.
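
A minimal Python sketch of this idea records the SYN to SYN/ACK delay per server and picks the server with the lowest average. The class name, the timestamps, and the use of a simple mean over all samples are assumptions for illustration; an actual device may weight or age its measurements differently.

```python
from collections import defaultdict

class FastestResponse:
    """Track SYN to SYN/ACK delays and pick the quickest server."""

    def __init__(self):
        self.samples = defaultdict(list)

    def record(self, server, syn_time, syn_ack_time):
        # Store the handshake delay for one established connection.
        self.samples[server].append(syn_ack_time - syn_time)

    def pick(self):
        # Assumed policy: lowest mean delay over all recorded samples.
        averages = {s: sum(d) / len(d) for s, d in self.samples.items()}
        return min(averages, key=averages.get)

predictor = FastestResponse()
predictor.record("s1", 0.0, 0.030)
predictor.record("s1", 1.0, 1.050)
predictor.record("s2", 0.0, 0.020)
predictor.record("s2", 1.0, 1.040)
fastest = predictor.pick()   # s2 averages about 30 ms, s1 about 40 ms
```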

The Source IP predictor load-balancing algorithm selects the next real server by using either the source IP address or a hash value based on the source IP address. It can be used for both server farms and cache farms, where it increases the chances of a cache hit. The Hash Address predictor, by contrast, selects the next real server by using a hash value calculated from both the source and destination IP addresses. Both predictors can be used to achieve session persistence when no proxy servers are in front of the clients.
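
Both predictors can be sketched in Python as a hash of the address bytes reduced modulo the number of servers. The function names, server names, and the choice of SHA-256 are assumptions for this example; load-balancing hardware typically uses its own, simpler hash function.

```python
import hashlib
import ipaddress

def _hash_to_index(data, n):
    """Map bytes to an index in range(n) using a stable hash."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % n

def source_ip_pick(servers, src_ip):
    """Source IP predictor: hash the source address only."""
    packed = ipaddress.ip_address(src_ip).packed
    return servers[_hash_to_index(packed, len(servers))]

def hash_address_pick(servers, src_ip, dst_ip):
    """Hash Address predictor: hash source and destination addresses together."""
    packed = (ipaddress.ip_address(src_ip).packed
              + ipaddress.ip_address(dst_ip).packed)
    return servers[_hash_to_index(packed, len(servers))]

servers = ["cache1", "cache2", "cache3"]
a = source_ip_pick(servers, "10.0.0.7")
b = source_ip_pick(servers, "10.0.0.7")   # same client, same server
c = hash_address_pick(servers, "10.0.0.7", "192.0.2.10")
```

As long as the server list does not change, a given client address always maps to the same server, which is what provides persistence. Clients behind a shared proxy present the same source IP and would therefore all land on one server.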


References:
Arregoces, M. (2006). Data Center Fundamentals. Cisco Press

Benson, A. (1996). Client/Server Architecture. 2nd Ed. McGraw-Hill

Buyya, R. (1999). High Performance Cluster Computing, Volume 1. Prentice Hall PTR

Brown, K. et al. (2003). Enterprise Java Programming with IBM WebSphere. Addison-Wesley

Cisco. Network Management: Implementing and Operating High Availability Solutions.
http://www.cisco.com/en/US/technologies/tk869/tk769/white_paper_c11-449655.html

Holden, G. (2003). Guide to Network Defense and Countermeasures. Thomson Course Technology

Marcus, E. & Stern, H. (2003). Blueprints for High Availability. 2nd Ed. Wiley

Mohan, C. Caching Technologies for Web Applications, 12 September 2001.
http://www.almaden.ibm.com/u/mohan/Caching_VLDB2001.pdf