Hats off to one of our competitors here for a brilliant explanation of the difference between network load balancing (NLB) and application load balancing. No need to make any bones about it, this is a straight-up excellent piece of writing from Lottie (even though she’s on the other side 😉).
Application load balancing (which has also been given other fancy names over the years, like content switching, content routing, application switching, and application or page routing) is really focused on distributing load across applications intelligently. While it can use ingress variables like IP address and port, it generally doesn’t, because those don’t offer insight into which server (application, web, virtual, whatever) has the capacity to respond within a time frame acceptable to the business for a specific application (or a piece of the application, like images).
The difference between the two lies primarily in the variables used to distribute load. Network load balancing relies solely on network variables, while application load balancing relies mainly on application variables.
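To make that contrast concrete, here is a minimal Python sketch of the two decision styles. It is an illustration only, not how any particular product works: the `Server` fields, the hashing scheme, and the `max_response_ms` threshold are all assumptions made up for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Server:
    # All fields are hypothetical; a real balancer would gather these from health checks.
    name: str
    active_requests: int    # requests currently in flight
    capacity: int           # most concurrent requests we want this server to hold
    avg_response_ms: float  # recently observed response time


def pick_network(servers, src_ip, src_port):
    """Network-style choice: only the client's IP address and port are consulted."""
    key = f"{src_ip}:{src_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def pick_application(servers, max_response_ms):
    """Application-style choice: spare capacity and response time are consulted."""
    with_capacity = [s for s in servers if s.active_requests < s.capacity]
    fast_enough = [s for s in with_capacity if s.avg_response_ms <= max_response_ms]
    # Fall back progressively so a server is always returned.
    candidates = fast_enough or with_capacity or servers
    return min(candidates, key=lambda s: s.avg_response_ms)
```

Note that `pick_network` will happily keep sending a client to a server that is already saturated, because nothing in the IP address or port says anything about how that server is coping. `pick_application` skips such a server, which is the kind of insight the paragraph above is describing.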
This change in load balancing techniques opened up all sorts of new efficiencies and scalability options because it allowed architectures to specialize: route requests for images to servers focused on serving images, requests for static content to servers focused on serving static content, and so on. It also enabled persistence (sticky sessions), which greatly accelerated the ability to scale out stateful applications in a web format.
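As a rough sketch of both ideas, the Python below routes by URL path to a specialized pool and pins stateful application traffic to one server with a cookie. The pool names, the path prefixes, and the `lb_server` cookie name are hypothetical, and round robin is just one simple way to pick within a pool.

```python
import itertools

# Hypothetical pools of specialized servers.
POOLS = {
    "images": ["img-1", "img-2"],
    "static": ["static-1"],
    "app": ["app-1", "app-2", "app-3"],
}
_round_robin = {name: itertools.cycle(members) for name, members in POOLS.items()}

STICKY_COOKIE = "lb_server"  # hypothetical cookie name used for persistence


def choose_pool(path):
    """Content routing: the request path decides which pool serves it."""
    if path.startswith("/images/"):
        return "images"
    if path.startswith("/static/"):
        return "static"
    return "app"


def route(path, cookies):
    """Return (server, cookies_to_set) for one request."""
    pool = choose_pool(path)
    pinned = cookies.get(STICKY_COOKIE)
    # Persistence: a client that already holds a valid cookie stays on its server.
    if pool == "app" and pinned in POOLS["app"]:
        return pinned, {}
    server = next(_round_robin[pool])
    # Only stateful application traffic is pinned; images and static files are not.
    cookies_to_set = {STICKY_COOKIE: server} if pool == "app" else {}
    return server, cookies_to_set
```

A first request to something like `/cart` with no cookie gets whichever application server is next in rotation, plus a cookie naming it. Every later request carrying that cookie lands on the same server, so session state held there stays reachable, while image and static requests keep rotating freely across their own pools.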