Load balancing is a way to manage which of your servers receive traffic. Load balancing helps distribute incoming client connections over various endpoints (for example, Client Access servers) to ensure that no one endpoint takes on a disproportionate share of the load. Load balancing can also provide failover redundancy in case one or more endpoints fail. By using load balancing with Exchange Server 2013, you ensure that your users continue to receive Exchange service in case of a computer failure. Load balancing also enables your deployment to handle more traffic than one server can process while offering a single host name for your clients.
Load balancing serves two primary purposes. It reduces the impact of a single Client Access server failure within one of your Active Directory sites. Also, load balancing ensures that the load on each of your Client Access servers is evenly distributed.
In Exchange Server 2010, client connections and processing were handled by the Client Access server role. This design required that external and internal Outlook connections, as well as mobile device and third-party client connections, be load balanced across the array of Client Access servers in a deployment to achieve fault tolerance and efficient utilization of servers. Many Exchange 2010 Client Access protocols required affinity: a relationship between the client and a particular Client Access server. In particular, Outlook Web App, the Exchange Control Panel, Exchange Web Services, Outlook Anywhere, Outlook TCP/IP MAPI connections, Exchange ActiveSync, the Exchange Address Book Service, and remote PowerShell either required or benefited from client-to-Client Access server affinity. Load balancing options in Exchange 2010 included the following features:
Because of the differing needs of client protocols in Exchange 2010, we recommended using a Layer 7 load balancing solution. Layer 7, also known as application-level load balancing, allowed the load balancing solution to use complex rules to determine how to balance each request entering the system, given that the entire conversation between client and server would be available to the load balancer logic. These complex rules ensured all requests from a specific client went to the same Client Access server endpoint. In Exchange 2010, if all requests from a specific client didn't go to the same endpoint for protocols that required affinity, the user experience would be negatively affected. For more information about Exchange 2010 load balancing options, see Understanding Load Balancing in Exchange 2010.
With session affinity and Layer 7 load balancing, all requests between the client and the server are sent to the same endpoint, as required by various protocols. Requests are distributed at the application layer. With Layer 4 load balancing, the requests are distributed at the transport layer. The load balancing solution distributes requests from the client, which is aware of a single IP address (sometimes called the virtual IP address or VIP), to a set of servers that perform the work. The connection between the client and server must be established before the content of the request is determined, so the load balancer selects a server to receive the request before examining the content of the request. The selection of the target server can be made in various ways such as "round-robin," in which each inbound connection goes to the next target server in a circular list, or "least connections," in which the load balancer sends each new connection to the server that has the fewest established connections at that time. Now that session affinity isn't required, you have more flexibility, choice, and simplicity with respect to the load balancing architecture you deploy. Load balancing without session affinity lets you increase the capacity and utilization of the load balancer, because processing isn't used to maintain more involved affinity options such as cookie-based load balancing or Secure Sockets Layer (SSL) session ID.
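The two selection methods described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular load balancer implements them: the class names and the server names `cas1` to `cas3` are hypothetical, and a real device would also track server health.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each new connection to the next server in a circular list."""

    def __init__(self, servers):
        self._ring = cycle(list(servers))

    def pick(self):
        return next(self._ring)


class LeastConnectionsBalancer:
    """Hands each new connection to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # min() returns the first server with the lowest count,
        # so ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to this server closes.
        self.active[server] -= 1


servers = ["cas1", "cas2", "cas3"]  # hypothetical server names

rr = RoundRobinBalancer(servers)
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(servers)
first, second = lc.pick(), lc.pick()
lc.release(first)   # the first connection closes
third = lc.pick()   # so its server is again among the least loaded
```

In both cases the choice is made from connection-level information only, before any request content is seen, which is what distinguishes Layer 4 selection from the affinity rules described earlier.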
In Exchange 2010, we introduced the concept of a Client Access array. After a Client Access array was configured for an Active Directory site, all Client Access servers in the site automatically became members of the array. In current builds of Exchange 2013, no configuration of a Client Access array is required, because the deployment of a load balanced and highly available service is much simpler.
The use of hardware load balancers is still supported for Exchange 2013. For information about the hardware load balancing solutions that have completed solution testing with Exchange 2010 and will likely work as well with Exchange 2013, see Exchange Server 2010 load balancer deployment. Keep in mind that this page shows the more complex Layer-7 configuration of hardware load balancers with Exchange 2010. Load balancing Exchange 2013 traffic can be much simpler, given the architectural changes discussed earlier in this topic. Rather than configuring session affinity for each of the Exchange protocols, inbound connections to Exchange 2013 Client Access Servers can be directed to an available server by the load balancer with no further affinity processing necessary. The hardware load balancer still has an important role in providing high availability of the Exchange service because it can detect when a specific Client Access server has become unavailable and remove it from the set of servers that will handle inbound connections.
Windows Network Load Balancing (WNLB) doesn't detect service outages. WNLB detects only server outages, by IP address. This means that if a particular web service, such as Outlook Web App, fails but the server is still functioning, WNLB won't detect the failure and will still route requests to that Client Access server. Manual intervention is required to remove the Client Access server experiencing the outage from the load balancing pool.
Load balancing in Exchange 2016 and later builds on the Microsoft high availability and network resiliency platform delivered in Exchange 2013. When this is combined with the availability of third-party load-balancing solutions (both hardware and software), there are multiple options for implementing load balancing in your Exchange organization.
All native clients connect to Exchange Server using HTTP and HTTPS. This standard protocol removes the need for affinity, which was previously required to avoid a new prompt for user credentials whenever load balancing redirected the connection to a different server.
HTTP makes possible the use of service or application health checks in your Exchange network. Depending on your load balancer solution, you can implement health probes to check different components of your system.
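As a rough sketch of what a per-service health probe might look like, the following Python checks each server once per service path and records which servers answered. Everything specific here is an assumption for illustration: the function names are hypothetical, the server names are placeholders, and the probe paths (modelled on a per-protocol `healthcheck.htm` page) should be confirmed against your own deployment and load balancer documentation.

```python
from urllib.error import URLError
from urllib.request import urlopen


def http_probe(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError, ValueError):
        # Unreachable server, HTTP error status, or malformed URL.
        return False


def healthy_servers(servers, paths, probe=http_probe):
    """Map each service path to the servers whose probe succeeded."""
    return {
        path: [s for s in servers if probe(f"https://{s}{path}")]
        for path in paths
    }


# Assumed probe paths and placeholder server names.
PROBE_PATHS = ["/owa/healthcheck.htm", "/mapi/healthcheck.htm"]
SERVERS = ["mbx1.contoso.com", "mbx2.contoso.com"]
```

Calling `healthy_servers(SERVERS, PROBE_PATHS)` would return one list of responsive servers per path. The `probe` parameter is injectable so the logic can be exercised without network access.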
The effect of HTTP-only access for clients is that load balancing is simpler, too. If you wanted, you could use DNS to load balance your Exchange traffic. You would provide the client with the IP address of every Mailbox server, and the HTTP client would handle the rest. If an Exchange server fails, the client attempts to connect to another server. However, there are drawbacks to load balancing with DNS, discussed in the following section, Load balancing options in Exchange Server.
In the example shown here, multiple servers configured in a database availability group (DAG) host the Mailbox servers running Client Access services. This provides high availability with a small Exchange server footprint. The client connects to the load balancer rather than directly to the Exchange servers. There's no requirement for load balancer pairs; however, we recommend deploying in clusters to improve network resilience.
Using DNS is the simplest option for load balancing your Exchange traffic. With DNS load balancing, you only have to provide your clients with the IP address of every Mailbox server. After that, DNS round robin distributes that traffic to your Mailbox servers. The HTTP client is smart enough to connect to another server should one Exchange server fail completely.
Simplicity comes at a price, however. In this case, DNS round robin isn't truly load balancing the traffic, because there's no programmatic way to make sure that each server gets a fair share of the traffic. Also, there's no service-level monitoring, so when a single service fails, clients aren't automatically redirected to an available service. For example, if Outlook on the web is in failure mode, the clients see an error page.
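The client-side behavior with DNS round robin, including the uneven distribution, can be illustrated with a small Python simulation. This is a sketch under assumptions rather than a model of any real DNS server or HTTP client: the function names are hypothetical, the addresses come from the documentation range 192.0.2.0/24, and the connection attempt is simulated.

```python
def rotate(records, n):
    """Rotate the record list, as a round-robin DNS server might between queries."""
    n %= len(records)
    return records[n:] + records[:n]


def connect_with_failover(addresses, try_connect):
    """Try each address in the order given; return the first that accepts."""
    for address in addresses:
        try:
            try_connect(address)
            return address
        except OSError:
            continue  # server unreachable: move on to the next record
    raise ConnectionError("no server accepted the connection")


records = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]  # placeholder addresses
down = {"192.0.2.11"}  # simulate one server that has failed completely


def fake_connect(address):
    # Stand-in for opening a TCP connection to the server.
    if address in down:
        raise OSError("connection refused")


# Three clients resolve the name in turn and each gets a rotated list.
chosen = [connect_with_failover(rotate(records, i), fake_connect) for i in range(3)]
```

With one server down, two of the three simulated clients end up on `192.0.2.12`, which shows why rotation alone doesn't guarantee a fair share. The failover also triggers only when the connection itself fails; a server that accepts connections but has a broken service would still be chosen.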
There are more elegant solutions to load balancing your traffic, such as hardware that uses Transport Layer 4 or Application Layer 7 to help distribute client traffic. Load balancers monitor each Exchange client-facing service, and if there is a service failure, load balancers can direct traffic to another server and take the problem server offline. Additionally, some level of load distribution makes sure that no single Mailbox server is proxying the majority of client access.
Because they don't examine the traffic contents, Layer 4 load balancers save time in transit. However, this comes with trade-offs. Layer 4 load balancers know only the IP address, protocol, and TCP port. Knowing only a single IP address, the load balancer can monitor only a single service.
Layer 7 load balancers forgo the raw performance benefits of Layer 4 load balancing for the simplicity of a single namespace, for example, mail.contoso.com, and per-service monitoring. Layer 7 load balancers understand the HTTP path, such as /owa, /Microsoft-Server-ActiveSync, or /mapi, and can direct traffic to working servers based on monitoring data.
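Path-aware routing of this kind might be sketched as follows. The code is illustrative only: the `route` function, the pool layout, and the server names are hypothetical, the health table is hard-coded where a real load balancer would fill it from its probes, and selection among healthy servers is reduced to "first healthy" for brevity.

```python
def route(path, pools, health):
    """Return a healthy server for the longest path prefix that matches."""
    lowered = path.lower()
    matches = [
        prefix for prefix in pools
        if lowered == prefix or lowered.startswith(prefix + "/")
    ]
    if not matches:
        return None  # no pool is configured for this path
    prefix = max(matches, key=len)
    for server in pools[prefix]:
        # Health is tracked per (server, service), not per server.
        if health.get((server, prefix), False):
            return server
    return None  # every server in the pool is failing this service


# Hypothetical pools keyed by lowercase path prefix.
pools = {
    "/owa": ["mbx1", "mbx2"],
    "/mapi": ["mbx1", "mbx2"],
    "/microsoft-server-activesync": ["mbx1", "mbx2"],
}

# Assumed monitoring results: /owa has failed on mbx1 only.
health = {
    ("mbx1", "/owa"): False,
    ("mbx2", "/owa"): True,
    ("mbx1", "/mapi"): True,
    ("mbx2", "/mapi"): True,
}

owa_target = route("/owa/auth/logon.aspx", pools, health)
mapi_target = route("/mapi/emsmdb", pools, health)
```

Because health is keyed by both server and service, `mbx1` keeps serving `/mapi` while its `/owa` traffic goes to `mbx2`. A Layer 4 device, which sees only IP address, protocol, and port, would have to treat the whole server as either up or down.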
Exchange 2016 introduced significant flexibility for your namespace and load-balancing architecture. With many options for deploying load balancing in your Exchange organization, from simple DNS to sophisticated third-party Layer 4 and Layer 7 solutions, we recommend that you review them all in light of your organization's needs.