Load balancing is the efficient distribution of traffic among the servers in a computer network so that no individual server or device is overworked. It provides the infrastructure for reliable Internet sites and large corporate networks. In the past, for example, websites crashed under unexpected surges in traffic or user requests. Clustering, that is, using multiple servers to provide both distribution and high availability, addresses this problem: when a single server begins to receive more requests than usual, some of those requests are directed to another server with spare capacity.
Server load balancing is the balancing of traffic among servers. Server traffic can be dispersed using load-balancing algorithms such as round robin, least connections, and shortest response time. This can be supported by connection aggregation, which optimizes the utilization of server-side resources, and by server health checking, which ensures that connections are sent only to servers that are up and responding. This type of load balancer acts as a virtual server: it assumes the IP address used by the service while hiding the actual server IP addresses, so the load balancer receives all requests intended for the associated servers.
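As a rough illustration, the sketch below shows how a balancer might pick a backend using round robin or least connections while skipping servers that have failed a health check. The `Backend` and `LoadBalancer` names are hypothetical, and a production load balancer would additionally handle the virtual-IP forwarding itself.

```python
import itertools

class Backend:
    """A real server hidden behind the load balancer's virtual IP."""
    def __init__(self, ip):
        self.ip = ip
        self.active_connections = 0
        self.healthy = True  # assumed to be updated by periodic health checks

class LoadBalancer:
    def __init__(self, backends):
        self.backends = backends
        self._rotation = itertools.cycle(backends)

    def pick_round_robin(self):
        # Walk the rotation until a healthy backend turns up.
        for _ in range(len(self.backends)):
            backend = next(self._rotation)
            if backend.healthy:
                return backend
        raise RuntimeError("no healthy backends available")

    def pick_least_connections(self):
        # Prefer the healthy backend currently serving the fewest connections.
        healthy = [b for b in self.backends if b.healthy]
        if not healthy:
            raise RuntimeError("no healthy backends available")
        return min(healthy, key=lambda b: b.active_connections)

servers = [Backend("10.0.0.1"), Backend("10.0.0.2"), Backend("10.0.0.3")]
lb = LoadBalancer(servers)
servers[1].healthy = False        # a failed health check removes this server
print(lb.pick_round_robin().ip)   # rotation now alternates 10.0.0.1 / 10.0.0.3
```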

Cache load balancing is the distribution of traffic load across a group of available caches. It resembles server load balancing, except that it distributes load among caches rather than servers, and it differs in relying on the cache hit ratio: a request is a hit when the cache can serve it without retrieving the content from the origin server. This caching mechanism speeds the delivery of static content to users and offloads repetitive work from the servers. Integrated dynamic caching, such as a reverse proxy cache, enables content to be reused for subsequent requests.
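As one illustrative approach, the sketch below hashes the requested URL so that repeated requests for the same object land on the same cache, which is a common way to keep the hit ratio high; each cache also tracks its own hit ratio. The `Cache`, `pick_cache`, and `fetch_from_origin` names are hypothetical, not part of any particular product.

```python
import hashlib

class Cache:
    def __init__(self, name):
        self.name = name
        self.store = {}   # url -> cached response
        self.hits = 0
        self.misses = 0

    def get(self, url, fetch_from_origin):
        if url in self.store:
            self.hits += 1        # served without contacting the origin server
        else:
            self.misses += 1      # miss: fetch from the origin, then cache it
            self.store[url] = fetch_from_origin(url)
        return self.store[url]

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

def pick_cache(url, caches):
    # Hashing the URL maps every request for the same object to the
    # same cache, so subsequent requests are likely to be hits.
    digest = int(hashlib.md5(url.encode()).hexdigest(), 16)
    return caches[digest % len(caches)]

caches = [Cache("cache-a"), Cache("cache-b")]

def fetch_from_origin(url):
    return f"<content of {url}>"   # stand-in for a real origin fetch

for _ in range(3):
    cache = pick_cache("/logo.png", caches)
    cache.get("/logo.png", fetch_from_origin)
print(cache.name, cache.hit_ratio())   # first request misses, the rest hit
```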