Understanding Cloud Load Balancers
At the heart of it, a load balancer is a piece of hardware (or software) designed to help a service remain online by spreading traffic out so that it is less likely to overwhelm the service. Load balancers can do other, much more sophisticated things with traffic too, some of which I will get to later.
For example, suppose I run a website that serves pictures of cats. Because this lives on the Internet, where cat pictures are extremely popular, the service potentially draws massive traffic.
If I host that site on a single server with no help from any other technology, it is likely that traffic would overwhelm the server and the site would go offline - much like a denial of service attack, but with real visitors (still a denial of service, by the way). To solve this I add a second server running the same content and the problem should be solved, right? Well, not exactly - if I simply add server 2, visitors would have no way to reach the new server, and server 1 would be overwhelmed again.
I need a load balancer to ensure that traffic is handled evenly across nodes. With two servers, I can ensure that 50% of the traffic landing on my cat pictures site goes to server 1 and 50% goes to server 2. Doing this serves twice as many visitors, but also makes further scaling possible - if I need to add more servers to accommodate a growing visitor population, I simply include the new servers as backend targets for the load balancer, and it will route traffic evenly across all of the nodes in its backend pool.
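The even split described above can be sketched as a simple round-robin scheduler. This is a minimal Python sketch of the idea; the server names are illustrative, not real Azure resources:

```python
from collections import Counter
from itertools import cycle

# Hypothetical backend pool for the cat pictures site.
backends = ["server-1", "server-2"]

# Round-robin: each incoming request is handed to the next server in
# the pool, so traffic splits evenly across all backend targets.
pool = cycle(backends)

def route_request():
    return next(pool)

# Simulate 10 requests and count where each one lands.
hits = Counter(route_request() for _ in range(10))
print(hits)  # 5 requests to each server
```

Adding a third server to `backends` immediately changes the split to one third each, which is exactly the "add a node to the backend pool" scaling story above.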
In the image below, an Application Gateway is placed in front of two instances of the cat pictures service. Because these are web services, I used an Application Gateway rather than a Load Balancer, but that is the only reason. This is a high-level, very general view.
The Azure Load Balancer service does not include web application firewall options, as it is not intended to sit in front of web service endpoints - using it to load balance across IaaS VMs is a more suitable use. Load Balancer is also a regional solution, meaning that it balances traffic across endpoints within a single Azure region.
Azure Application Gateway
Application Gateway is a layer 7 load balancer that can handle SSL offload and web application firewall (WAF) duty as well. Application Gateway is a regional service, meaning that you will need to implement one in each region where your services are deployed, and use another service in front of it to fail inbound traffic over across regions.
If you are only running services within North Central US, for example, the Azure Application Gateway can be configured as a load balancer for multiple instances of your service.
Multi-Regional Load Balancing
Azure is a global platform that serves customers, companies, and services around the world, so how does load balancing work between regions?
This depends on the services chosen - there are two multi-region load balancing options in Azure, one that is non-HTTPS and one that is HTTPS capable.
Note: The distinction between HTTPS and non-HTTPS refers to a load balancer's ability to offload SSL traffic, not its ability to sit in front of an HTTPS URL. For this reason the non-HTTPS option is recommended for workloads other than web services - IaaS VMs, databases, etc.
Azure Traffic Manager
Traffic Manager is a DNS-based, multi-regional load balancer. You can point a URL at Traffic Manager and it will send traffic to the appropriate endpoint(s). The routing occurs only at the domain level for URLs pointing to Traffic Manager.
Traffic manager supports the following load balancing options:
Weighted: use this routing method when you want a certain percentage of traffic to route to each endpoint (60% to A, 40% to B)
Geographic: use this routing method when you want users sent to a specific endpoint based on where their DNS queries originate - this can be used to keep access contained to services within specific countries
Priority: use this routing method when you want one of the service endpoints to be the primary endpoint - essentially a failover scenario
Performance: use this routing method when you want the user making the request to be sent to the closest endpoint, by network latency
Multivalue: use this method when endpoints are IPv4 or IPv6 addresses - when this is used, all available endpoint addresses are returned
Subnet: use this method to ensure that requests from specific source IP ranges map to specific endpoints for a service
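To make the first few methods concrete, here is a minimal Python sketch of how weighted and priority selection behave. The endpoint names, weights, and priorities are illustrative assumptions - this is not how Traffic Manager is implemented, just the decision logic the list describes:

```python
import random

def pick_weighted(endpoints, rng=None):
    """Weighted: choose each endpoint in proportion to its weight."""
    rng = rng or random.Random()
    names = list(endpoints)
    return rng.choices(names, weights=[endpoints[n] for n in names], k=1)[0]

def pick_priority(priorities, healthy):
    """Priority: always the lowest-numbered healthy endpoint."""
    for name in sorted(priorities, key=priorities.get):
        if healthy.get(name):
            return name
    return None

# Over many queries, roughly 60% of answers point at A, 40% at B.
print(pick_weighted({"A": 60, "B": 40}))

# The primary is down, so the failover endpoint is returned.
print(pick_priority({"primary": 1, "backup": 2},
                    {"primary": False, "backup": True}))
```

The performance and geographic methods follow the same pattern, with the selection keyed on measured latency or the origin of the DNS query instead of a static weight.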
Azure Front Door
Azure Front Door has Traffic Manager capabilities built into the service, allowing it to direct traffic across regions and handle failover of a service. It fails over faster than Traffic Manager because it is a true layer 7 load balancer and isn't purely reliant on DNS. DNS answers get cached by servers all over the Internet, which can delay failover due to how DNS operates.
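The DNS caching delay is easy to illustrate: a resolver keeps serving a cached answer until the record's TTL expires, so clients follow the old endpoint for a while after a failover. This is a simplified Python sketch with made-up names and TTL values, not real DNS code:

```python
# A toy caching resolver: answers are cached for `ttl` seconds, so a
# change at the authoritative server is invisible until the TTL expires.
class CachingResolver:
    def __init__(self, ttl):
        self.ttl = ttl
        self.cache = {}  # name -> (answer, time it was cached)

    def resolve(self, name, authoritative, now):
        answer, cached_at = self.cache.get(name, (None, None))
        if answer is not None and now - cached_at < self.ttl:
            return answer  # stale answer served from cache
        answer = authoritative[name]
        self.cache[name] = (answer, now)
        return answer

resolver = CachingResolver(ttl=30)
dns = {"cats.example.com": "eastus-ip"}
print(resolver.resolve("cats.example.com", dns, now=0))   # eastus-ip
dns["cats.example.com"] = "westus-ip"                     # failover flips the record
print(resolver.resolve("cats.example.com", dns, now=10))  # still eastus-ip (cached)
print(resolver.resolve("cats.example.com", dns, now=40))  # westus-ip after TTL expiry
```

Front Door sidesteps this window because it proxies each request at layer 7 and can steer traffic per request, instead of waiting for caches across the Internet to age out.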
In addition to multi-regional failover, Front Door brings content delivery network (CDN) capabilities as well, which can help by caching resources at the edge of the network. This is helpful because the CDN keeps the latest copy of a set of content (think a website) so that requests coming to the URL can be served from the point of presence closest to the user. The ins and outs of Content Delivery Networks may arrive in another post down the road.
The other main thing Front Door brings is WAF capability. Similar to the WAF found in Application Gateway, Front Door handles this at the network edge - meaning requests that trigger a WAF rule are stopped before anything reaches your applications or services.
Because Front Door is a global service, it may also save you some money by not requiring multiple instances to cover multiple regions. A service running in East US and West US for regional redundancy can be handled by a single instance of Front Door, which might remove the need for an Application Gateway configuration in each region - please consider your environment and the needs of your organization before deciding what is right for you.
Mix and Match as needed
As your cloud environments and application services evolve, there is a chance you may wish to use several of Azure's load balancing services together, and that is absolutely possible - load balancing can be nested should more complex solutions be required.
Just one more thing…
This discussion of load balancing has not included scale-out. App Services can be configured to scale out and deploy a configured number of nodes to deliver a given service. While this is a type of load balancing, since the nodes in the service split up traffic to keep things running smoothly, that is not the intent of scale-out. Scale-out is designed to keep a single service online in the event one node of that service has failed.
Doing this requires an endpoint to monitor for health checking, which allows the Azure platform to keep an eye on the nodes of a service. If one of those nodes does not respond to health checks for a predetermined number of minutes, the unresponsive node is removed and a new one is created in its place - by Azure, just for your service.
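The monitor-and-replace loop described above might look something like this. It is a simplified Python sketch; the node names, the probe, and the failure threshold are all illustrative assumptions, not Azure's actual behavior:

```python
def reconcile(nodes, probe, failures, max_failures=3):
    """Probe every node; replace any node that fails too many probes in a row."""
    result = []
    for node in nodes:
        if probe(node):
            failures[node] = 0          # healthy: reset its failure count
            result.append(node)
        else:
            failures[node] = failures.get(node, 0) + 1
            if failures[node] >= max_failures:
                # Threshold reached: stand up a fresh node in its place.
                result.append(f"{node}-replacement")
            else:
                result.append(node)     # keep it, but remember the miss
    return result

# Simulate node-2 going dark: after three failed probes it is replaced.
failures = {}
nodes = ["node-1", "node-2"]
probe = lambda n: n != "node-2"  # pretend only node-2 is unresponsive
for _ in range(3):
    nodes = reconcile(nodes, probe, failures)
print(nodes)  # ['node-1', 'node-2-replacement']
```

The key design point is the consecutive-failure threshold: a single missed probe should not trigger a replacement, or transient network blips would churn healthy nodes.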
This post is not intended to cover all of the features and configuration options of Azure load balancing. I might look into that more in future posts, simply because Azure is interesting. From a high level, load balancing is important for high availability and can possibly save you from a regional outage. In addition, some of Azure's SLA requirements at one time included a requirement for multiple instances, though this is changing some as Azure moves forward.
For more information about any of these services, please use the links included above for each service or visit the Azure Documentation.