Azure Load Balancer: High-Availability and Layer 4 Traffic Distribution

To build resilient cloud workloads, high availability must be engineered directly into the network architecture. While Azure Application Gateway and Azure Front Door manage application-aware traffic at Layer 7, core network protocols require high-throughput, ultra-low-latency traffic distribution deeper down the networking stack.

Azure Load Balancer is a high-performance, ultra-low-latency Layer 4 (Transport Layer) load balancer that delivers inbound and outbound traffic distribution for all UDP and TCP protocols. Operating at the foundational levels of the network, it scales to handle millions of requests per second while ensuring your backend services remain highly available and resilient.

Key capabilities include:

- High availability across Availability Zones.
- Scalability to millions of flows for TCP/UDP apps.
- Health probes to route traffic only to healthy instances.
- Multiple frontend IPs and port configurations.
- IPv6 support and integration with Gateway Load Balancer.


Introduction

What is Azure Load Balancer?

Azure Load Balancer is a software-defined, multi-tenant network load balancer that manages incoming and outgoing network traffic. It distributes IP traffic across a pool of healthy backend instances (such as Virtual Machines or Virtual Machine Scale Sets) based on a configured 5-tuple hash (Source IP, Source Port, Destination IP, Destination Port, and Protocol Type).

Because it operates at Layer 4, it is entirely agnostic to the application content, payload, or URL path. It processes raw data packets, allowing it to handle massive data volumes with sub-millisecond latencies.
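The 5-tuple distribution described above can be sketched in a few lines. This is only an illustration: Azure's real hashing algorithm is not public, so SHA-256 is used here as a stand-in, and the `FiveTuple` type, the `pick_backend` function, and every IP address are hypothetical names chosen for the example.

```python
import hashlib
from typing import NamedTuple, Sequence


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "TCP" or "UDP"


def pick_backend(flow: FiveTuple, healthy_backends: Sequence[str]) -> str:
    """Map a flow to one healthy backend using a stable hash of its 5-tuple.

    SHA-256 is an assumption for illustration, not Azure's actual hash.
    """
    if not healthy_backends:
        raise ValueError("no healthy backends available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer and map it onto the pool.
    index = int.from_bytes(digest[:8], "big") % len(healthy_backends)
    return healthy_backends[index]


backends = ["10.0.0.4", "10.0.0.5", "10.0.0.6"]  # hypothetical private IPs
flow = FiveTuple("203.0.113.7", 50123, "20.1.2.3", 443, "TCP")
chosen = pick_backend(flow, backends)
```

Because the hash depends only on the five fields, every packet of the same flow lands on the same backend, while a new source port from the same client may land elsewhere.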

Create a Public Load Balancer

A Public Load Balancer maps a public IP address and port number of incoming internet traffic to the private IP address and port number of a backend instance. This allows you to expose specific services safely to the public internet while distributing the load across multiple underlying servers.

Create an Internal Load Balancer

An Internal (or Private) Load Balancer is restricted strictly to traffic originating inside your private Virtual Network (VNet) or over a hybrid cross-premises connection (VPN/ExpressRoute). It relies on a private IP address from your subnet's allocation pool to load balance traffic between internal application tiers, such as routing traffic safely from a web frontend tier down to a private middleware or database tier.


Overview of the OSI Model

To master Azure load balancing options, you must understand exactly how data moves across the Open Systems Interconnection (OSI) model. The OSI model isolates network operations into seven distinct abstraction layers.


Understanding the 7 Layers

  • Layer 1: Physical Layer: The physical hardware medium, electrical signals, fiber optic cables, or radio waves that transmit raw, unstructured binary data bitstreams across the network.
  • Layer 2: Data Link Layer: Responsible for node-to-node data transfer. It packages raw bits into structured Frames and utilizes Physical Addresses (MAC Addresses) to manage error detection and flow control across a single local physical segment.
  • Layer 3: Network Layer: Manages host addressing and routing across multi-node networks. It packages data into Packets and reads logical addresses (IP Addresses) to determine the most efficient physical path for data travel. Devices: Traditional routers.
  • Layer 4: Transport Layer: Ensures end-to-end communication, session flow control, and data reliability. It segments data into Segments (TCP) or Datagrams (UDP), tracking connection streams via Port Numbers (e.g., Port 80, 443, 22). Devices: Azure Load Balancer.
  • Layer 5: Session Layer: Establishes, manages, maintains, and terminates authentication sessions and continuous dialogues between local and remote applications.
  • Layer 6: Presentation Layer: Acts as the data translator. It handles data formatting, syntax conversion, encryption/decryption (like SSL/TLS), and data compression to ensure the application layer can read the incoming stream.
  • Layer 7: Application Layer: The layer closest to the end user. It interacts directly with software applications to provide network application services, parsing protocols like HTTP, HTTPS, FTP, SMTP, and DNS. Devices: Azure Application Gateway, Azure Front Door.


End-to-End Traffic Flow Example

Imagine an end user types https://www.otechy.com into their browser:


  1. Application Layer (L7): The browser generates an HTTP GET request payload.
  2. Presentation Layer (L6): The data is formatted and encrypted via TLS.
  3. Session Layer (L5): A continuous application session stream is established.
  4. Transport Layer (L4): The OS attaches a TCP header, defining a random ephemeral source port and specifying target destination Port 443. This is where Azure Load Balancer intercepts, evaluates headers, and makes routing selections.
  5. Network Layer (L3): An IP header is attached containing the client’s source IP and the server's destination IP Address.
  6. Data Link Layer (L2): The packet is wrapped into a frame containing local hardware MAC addresses.
  7. Physical Layer (L1): The frame is converted into electrical or optical signals and transmitted across the internet.
  8. The Destination: The packet is received at the target node, navigating in reverse up the stack to render the web page.
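To make step 4 concrete, here is a minimal sketch of what "evaluating headers" means at Layer 4: only fixed-position fields such as the ports are read, and the encrypted payload is never inspected. The helper names `build_tcp_header` and `read_ports` are hypothetical, and the header is a bare 20-byte TCP header with the checksum left at zero for simplicity.

```python
import struct


def build_tcp_header(src_port: int, dst_port: int, seq: int = 0, ack: int = 0,
                     flags: int = 0x02, window: int = 65535) -> bytes:
    """Pack a minimal 20-byte TCP header (no options, checksum left as 0)."""
    # Data offset of 5 words (20 bytes) in the top 4 bits, flags in the low bits.
    offset_and_flags = (5 << 12) | flags
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, 0, 0)


def read_ports(tcp_header: bytes) -> tuple:
    """Read source and destination ports from the first 4 bytes of the header."""
    src_port, dst_port = struct.unpack("!HH", tcp_header[:4])
    return src_port, dst_port


header = build_tcp_header(src_port=50123, dst_port=443)  # SYN to HTTPS
ports = read_ports(header)
```

A Layer 7 device would have to terminate TLS and parse the HTTP request to make a decision; a Layer 4 device needs only these few header bytes.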


⚠️ Important Note!

  • Standard SKU Requirement: Always choose the Standard SKU for production workloads. The Basic SKU lacks availability zone support and outbound rules, offers more limited health probing and diagnostics, and is open to inbound traffic by default instead of being secured through Network Security Groups.
  • Network Security Groups (NSGs): Azure Standard Load Balancer is secure by default. Unless a Network Security Group (NSG) attached to the backend subnet or to the backend VM network interfaces (NICs) contains rules that permit the inbound traffic, all traffic arriving at the frontend IP is blocked.
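The "secure by default" behaviour can be sketched as a default-deny rule check. This is a simplified model, not the NSG engine itself: the `NsgRule` type and `inbound_allowed` function are hypothetical, and the built-in default rules that real NSGs carry are left out. The one property it relies on is that NSG rules are evaluated in priority order, lowest number first, and the first match decides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NsgRule:
    priority: int            # lower number is evaluated first
    protocol: str            # "Tcp", "Udp", or "*" for any
    dst_port: Optional[int]  # None means any port
    allow: bool


def inbound_allowed(rules: List[NsgRule], protocol: str, dst_port: int) -> bool:
    """Return the decision of the first matching rule; deny when nothing matches."""
    for rule in sorted(rules, key=lambda r: r.priority):
        protocol_ok = rule.protocol in ("*", protocol)
        port_ok = rule.dst_port in (None, dst_port)
        if protocol_ok and port_ok:
            return rule.allow
    return False  # no matching rule: traffic stays blocked


rules = [NsgRule(priority=100, protocol="Tcp", dst_port=443, allow=True)]
https_ok = inbound_allowed(rules, "Tcp", 443)
ssh_ok = inbound_allowed(rules, "Tcp", 22)
```

An empty rule list returns `False` for everything, which mirrors what happens to a Standard Load Balancer frontend when no permitting rule exists.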


Load Balancer Core Configuration & Advanced Topologies


Create a Multiple VMs Inbound NAT Rule

Inbound Network Address Translation (NAT) rules allow you to route external traffic hitting a single public frontend IP address on specific high-range ports straight to a designated port on a unique Virtual Machine inside your backend pool. This allows you to expose distinct management interfaces (like RDP on Port 3389 or SSH on Port 22) across multiple internal virtual machines using only a single shared public IP asset.
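As a rough sketch, a set of inbound NAT rules behaves like a lookup table keyed by frontend port. The port numbers, private IPs, and the `translate_inbound` helper below are all hypothetical values chosen for illustration.

```python
from typing import Dict, Optional, Tuple

# frontend port on the shared public IP -> (backend private IP, backend port)
NAT_RULES: Dict[int, Tuple[str, int]] = {
    50001: ("10.0.0.4", 22),    # SSH to the first VM
    50002: ("10.0.0.5", 22),    # SSH to the second VM
    50003: ("10.0.0.6", 3389),  # RDP to a Windows VM
}


def translate_inbound(frontend_port: int) -> Optional[Tuple[str, int]]:
    """Return the single backend target for a frontend port, or None if unmapped."""
    return NAT_RULES.get(frontend_port)


target = translate_inbound(50002)
```

Unlike a load balancing rule, each frontend port resolves to exactly one VM, so an administrator connecting to port 50002 always reaches the same machine.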


Load Balance Within Specific Availability Zones

For zone-specific regional setups, you can bind your Load Balancer frontend to a single Availability Zone (e.g., Zone 1). In this zonal configuration, the public or internal frontend IP is served from that zone only and is paired with compute resources deployed in the same zone.


Load Balance Across Multiple Availability Sets

For legacy architectures or workloads that cannot utilize modern availability zones, the Standard Load Balancer can cleanly span its backend pool boundaries across multiple distinct Availability Sets. This guarantees that backend compute nodes are separated across distinct underlying hardware update and fault domains.


Create a Cross-Region Load Balancer

To achieve geo-resilient high availability, you can deploy a Cross-Region Load Balancer. Operating as a globally distributed frontend layer, a cross-region load balancer exposes a static anycast IP address and distributes incoming traffic across regionally deployed Standard Load Balancers. If an entire Azure region suffers an outage, new traffic is redirected to healthy regional load balancers in other geographies without any change to the frontend IP, although connections already in flight to the failed region may be interrupted.


[ Cross-Region Load Balancer (Global Anycast IP) ]
  > Regional LB (East US) > Backend VMs
  > Regional LB (West US) > Backend VMs
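The failover behaviour in this topology can be approximated with a small selection function. This is a sketch under stated assumptions: it assumes the global tier prefers the lowest-latency healthy region, and the `RegionalLB` type, `choose_region` function, and latency figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegionalLB:
    region: str
    healthy: bool
    latency_ms: float  # hypothetical client-to-region latency


def choose_region(regions: List[RegionalLB]) -> RegionalLB:
    """Pick the lowest-latency regional load balancer that is still healthy."""
    candidates = [r for r in regions if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy regional load balancer")
    return min(candidates, key=lambda r: r.latency_ms)


regions = [
    RegionalLB("East US", healthy=True, latency_ms=20.0),
    RegionalLB("West US", healthy=True, latency_ms=70.0),
]
normal = choose_region(regions).region

regions[0].healthy = False  # simulate a regional outage
failover = choose_region(regions).region
```

The client keeps using the same global IP in both cases; only the regional target behind it changes.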


Create a Gateway Load Balancer


Gateway Load Balancer (GWLB) is engineered to scale high-performance third-party Network Virtual Appliances (NVAs). It inserts transparent inline inspection points into your data path. Using VXLAN encapsulation, a GWLB steers inbound and outbound traffic through a pool of NVAs (such as firewalls) and then returns the inspected traffic to its original path toward the destination.


Integrate NAT Gateway with a Load Balancer

While a Standard Load Balancer can manage outbound connectivity, integrating an Azure NAT Gateway with your backend subnet is generally the more scalable egress design. The NAT Gateway takes precedence over the load balancer's egress path, providing dynamically allocated Source Network Address Translation (SNAT) ports and dedicated outbound IP addresses, which greatly reduces the risk of SNAT port exhaustion.


Using DDoS Protection for Load Balancer

To shield your public services from malicious volumetric floods, you can enable Azure DDoS Protection directly on the virtual network hosting your public load balancer. This provides continuous machine-learning-driven traffic scrubbing, isolating and mitigating malicious Layer 3 and Layer 4 denial-of-service attempts before they impact backend compute capacities.



Core Load Balancer Components

To configure and maintain an Azure Load Balancer, you must understand its core architectural building blocks:


[ Frontend IP Configuration ] > [ Load Balancing Rules ] > [ Backend Pools ]
                                                                   │
                                                           [ Health Probes ]


Basics & SKUs

Azure Load Balancer offers two primary structural SKUs:

  • Basic SKU: Retained for legacy applications. It limits backend pools to 300 instances, lacks zone capabilities, is open by default, and offers only limited diagnostics and metrics.
  • Standard SKU: The recommended choice for production. Supports up to 1,000 backend instances, offers zone redundancy, is secure by default, and includes real-time telemetry pipelines.


Frontend IP Configuration

The network entry point of your load balancer. It holds either a public IP address or a private IP address from a virtual network subnet, and it is the address to which clients send their traffic.


Backend Pools

The group of computing resources serving the incoming traffic. Members can be referenced using their specific Network Interface Cards (NICs), virtual machine identifiers, or assigned directly via a dynamic pool of private IP addresses spanning an entire virtual network subnet.


Inbound Rules

Rule Type | Technical Definition | Operational Purpose
---------------------|----------------------|--------------------
Load Balancing Rules | Maps a frontend IP and port combination to a specific backend pool, protocol, and target port. | Uniformly distributes client traffic requests across a group of healthy backend servers.
Inbound NAT Rules | Maps an explicit incoming external port on the frontend IP directly to a single server NIC. | Provides direct management lines (SSH/RDP) to individual instances without exposing the whole pool.



Outbound Rules

Explicitly configures Source Network Address Translation (SNAT) allocations for instances inside your backend pool. It defines exactly which public frontend IP assets are used to process outbound internet queries, giving you explicit control over outbound connectivity.


Zone Redundancy

With the Standard SKU, your frontend IP can be configured as Zone-Redundant. This means the frontend IP is served from multiple availability zones simultaneously. If a zone fails, the IP remains reachable through the remaining healthy zones, and traffic continues to reach the surviving backend instances.



Inbound Connectivity Mechanics


Inbound NAT Rules

Inbound NAT rules process traffic based on explicit destination mapping. You can configure individual rules manually, or define Inbound NAT Pools that automatically calculate port ranges for whole groups of virtual machines inside a scale set, abstracting administrative overhead.
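The automatic port calculation for a pool can be sketched as a simple offset from the start of a frontend port range. The assignment scheme (start port plus instance index) is an assumption made for illustration, and `allocate_nat_ports` and the instance IDs are hypothetical names.

```python
from typing import Dict, List, Tuple


def allocate_nat_ports(instance_ids: List[str], frontend_port_start: int,
                       frontend_port_end: int,
                       backend_port: int) -> Dict[str, Tuple[int, int]]:
    """Give each instance one frontend port from the range, all mapped to backend_port."""
    capacity = frontend_port_end - frontend_port_start + 1
    if len(instance_ids) > capacity:
        raise ValueError("frontend port range is too small for the pool")
    return {
        instance_id: (frontend_port_start + offset, backend_port)
        for offset, instance_id in enumerate(instance_ids)
    }


mapping = allocate_nat_ports(["vm0", "vm1", "vm2"], 50000, 50099, backend_port=22)
```

Sizing the range larger than the current instance count leaves room for a scale set to grow without editing the rule.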


High Availability (HA) Ports

Configurable only on internal Standard Load Balancers, an HA Ports rule instructs the load balancer to distribute all TCP and UDP flows arriving on every port at once (protocol: All, port: 0). This is a critical prerequisite when building backend pools composed of third-party security firewalls or routing appliances that must inspect all inbound application protocols concurrently.
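One way to picture an HA Ports rule is as a wildcard in rule matching. The sketch below treats protocol "All" and port 0 as match-everything values; the `Rule` type and `rule_matches` function are hypothetical and only model the matching logic, not the load balancer itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str       # "Tcp", "Udp", or "All"
    frontend_port: int  # 0 means every port (HA Ports)


def rule_matches(rule: Rule, protocol: str, dst_port: int) -> bool:
    """Return True when a flow's protocol and destination port fall under the rule."""
    protocol_ok = rule.protocol == "All" or rule.protocol.lower() == protocol.lower()
    port_ok = rule.frontend_port == 0 or rule.frontend_port == dst_port
    return protocol_ok and port_ok


ha_ports = Rule(protocol="All", frontend_port=0)
https_only = Rule(protocol="Tcp", frontend_port=443)
```

With the `ha_ports` rule, a DNS query on UDP 53 and an SSH session on TCP 22 both match, which is what a firewall appliance pool needs in order to see all traffic.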


Multiple Frontends & Floating IPs (Direct Server Return)

Azure Load Balancer supports the orchestration of Multiple Frontend IP Configurations. This lets you host multi-tenant services or run SQL Server AlwaysOn Availability Groups on a shared load-balancing asset.


By enabling Floating IP (Direct Server Return / DSR) on your load-balancing rule, the load balancer stops rewriting the destination address: packets reach the backend virtual machine still addressed to the frontend IP. The VM must therefore have the frontend IP configured on a loopback network interface in its guest OS. The backend VM can then reply directly to the client, with the return traffic bypassing the load balancer.
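The difference Floating IP makes to the packet a backend receives can be sketched as follows. The `Packet` type, the `deliver_to_backend` function, and the addresses are hypothetical; the sketch models only the destination-address handling.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


def deliver_to_backend(packet: Packet, backend_ip: str, floating_ip: bool) -> Packet:
    """Return the packet as the backend sees it.

    Without Floating IP the destination is rewritten to the backend's private IP;
    with Floating IP the original frontend destination is preserved.
    """
    if floating_ip:
        return packet
    return replace(packet, dst_ip=backend_ip)


inbound = Packet(src_ip="203.0.113.7", dst_ip="20.1.2.3", dst_port=1433)
rewritten = deliver_to_backend(inbound, "10.0.0.4", floating_ip=False)
preserved = deliver_to_backend(inbound, "10.0.0.4", floating_ip=True)
```

In the `preserved` case the guest OS only accepts the packet if it owns 20.1.2.3 locally, which is why the loopback configuration is required.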


TCP Reset

Standard Load Balancer rules support a TCP Reset on Idle option. When it is enabled and an established TCP connection remains inactive past the defined idle timeout (configurable between 4 and 30 minutes), the load balancer sends a TCP Reset packet (RST) to both the client and the backend host. This closes the dead socket immediately and frees connection state on the server.
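The idle-timeout decision can be sketched as a small function. The `Flow` type and `idle_action` function are hypothetical, timestamps are plain seconds, and the 4-minute default mirrors the lower bound quoted above.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    last_activity_s: float  # time of the last packet seen, in seconds


def idle_action(flow: Flow, now_s: float, idle_timeout_min: int = 4,
                tcp_reset_enabled: bool = True) -> str:
    """Decide what happens to a flow given how long it has been idle."""
    idle_for_s = now_s - flow.last_activity_s
    if idle_for_s < idle_timeout_min * 60:
        return "keep"
    # Past the timeout: either tell both ends explicitly or drop the flow silently.
    return "send_rst_both_sides" if tcp_reset_enabled else "drop_silently"


flow = Flow(last_activity_s=0.0)
still_active = idle_action(flow, now_s=120.0)
timed_out = idle_action(flow, now_s=300.0)
silent = idle_action(flow, now_s=300.0, tcp_reset_enabled=False)
```

The silent case is the one that hurts applications: both endpoints believe the connection is alive until the next send fails, so TCP keepalives shorter than the idle timeout are a common companion setting.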



Outbound Connectivity Mechanics


Outbound Connections

When a backend VM without an explicit public IP address initiates an outbound connection to the internet, Azure must perform Source Network Address Translation (SNAT) to map the VM's private IP to a public routing asset. Standard Load Balancers manage this via outbound rules, pre-allocating chunks of SNAT ports to each individual instance inside the backend pool.
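The pre-allocation arithmetic is worth seeing with numbers. The sketch below assumes Azure's documented figure of 64,000 SNAT ports per frontend public IP and rounds the per-instance share down to a multiple of 8, as manual port allocation requires; the function name `snat_ports_per_instance` is hypothetical.

```python
SNAT_PORTS_PER_FRONTEND_IP = 64_000  # documented per-IP figure, assumed here


def snat_ports_per_instance(frontend_ip_count: int, backend_instance_count: int) -> int:
    """Split the total SNAT port budget evenly, rounded down to a multiple of 8."""
    if frontend_ip_count < 1 or backend_instance_count < 1:
        raise ValueError("counts must be at least 1")
    total_ports = frontend_ip_count * SNAT_PORTS_PER_FRONTEND_IP
    per_instance = total_ports // backend_instance_count
    return per_instance - (per_instance % 8)


ten_vms = snat_ports_per_instance(frontend_ip_count=1, backend_instance_count=10)
hundred_vms = snat_ports_per_instance(frontend_ip_count=2, backend_instance_count=100)
```

With one frontend IP and 10 instances, each VM gets 6,400 ports; growing to 100 instances on two IPs drops that to 1,280 each, which shows why large pools need more frontend IPs or a NAT Gateway.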


🛑 Default Outbound Access Retirement Notice

Historically, Azure provided implicit outbound connectivity for VMs without explicitly assigned egress configurations. Microsoft has announced the retirement of this Default Outbound Access, with September 30, 2025 as the announced date, and the change targets new deployments instead of cutting off existing VMs. Going forward, plan for every backend compute instance to have a defined Outbound Rule, an explicit Public IP on its NIC, or an integrated Azure NAT Gateway, because instances without one of these should not be expected to reach the internet.


Outbound-Only Load Balancer (Egress Only)

To secure highly confidential internal systems (such as financial database records), you can construct an Outbound-Only Public Load Balancer. By configuring a public frontend IP and binding it exclusively to an Outbound Rule without establishing any inbound load balancing or NAT rules, backend instances can securely download patches, fetch updates, or execute third-party API calls while remaining completely invisible and unreachable to inbound internet scans.



Monitoring & Observability


Insights

Azure Load Balancer Insights provides a pre-configured, interactive diagnostic dashboard built directly into the Azure Portal. It displays functional health topology maps, visualizes backend pool status, tracks data path availability, and highlights configuration anomalies automatically.


Diagnostic Settings

To capture long-term telemetry for compliance, you must configure Diagnostic Settings. A diagnostic setting exports the load balancer's metrics and health-related logs to a Storage Account, an Event Hub, or a Log Analytics workspace.


Metrics & Alerts


  • Metrics: Standard Load Balancer emits multi-dimensional metrics to Azure Monitor. Critical ones to track include:
    • Data Path Availability, also called VIP Availability (availability of the data path to your frontend IP).
    • Health Probe Status, also called DIP Availability (health status of individual backend instances).
    • SNAT Connection Count, together with Allocated and Used SNAT Ports (to catch port exhaustion).
    • Byte Count & Packet Count (raw traffic volume).
  • Alerts: You can build automated alerting logic over these metrics. For instance, you can trigger PagerDuty alerts or execute webhooks to spin up more VMs if DIP Availability drops below a specific threshold or if used SNAT ports exceed 80% of the allocated capacity.
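The alert conditions above reduce to two threshold checks. This sketch evaluates them locally on sample values and does not call Azure Monitor; the `evaluate_alerts` function, the alert names, and the 90% health threshold are hypothetical, while the 80% SNAT threshold comes from the example in the text.

```python
from typing import List


def evaluate_alerts(dip_availability_pct: float, used_snat_ports: int,
                    allocated_snat_ports: int, min_health_pct: float = 90.0,
                    max_snat_usage: float = 0.80) -> List[str]:
    """Return the names of the alert conditions that the sample values trigger."""
    alerts = []
    if dip_availability_pct < min_health_pct:
        alerts.append("backend_health_low")
    # Guard against division by zero when no ports are allocated.
    if allocated_snat_ports > 0 and used_snat_ports / allocated_snat_ports > max_snat_usage:
        alerts.append("snat_usage_high")
    return alerts


quiet = evaluate_alerts(100.0, used_snat_ports=100, allocated_snat_ports=1024)
noisy = evaluate_alerts(50.0, used_snat_ports=900, allocated_snat_ports=1024)
```

In a real deployment the same thresholds would live in Azure Monitor alert rules, with the action group pointing at the webhook or paging service.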


Log Analytics Workspace


By streaming logs into a Log Analytics Workspace, network engineers can utilize the Kusto Query Language (KQL) to query the load balancer's network diagnostic tables. You can write custom queries to analyze health probe failures, audit exact timestamps when a backend server was marked unhealthy, or review SNAT connection lifecycles to maintain full visibility over your network infrastructure.

