A server cluster is a group of independent servers (usually in close proximity to one another) interconnected through a dedicated network to work as one centralized data processing resource. Clusters are capable of performing multiple complex instructions by distributing workload across all connected servers. Clustering improves the system’s availability to users, its aggregate performance, and overall tolerance to faults and component failure. A failed server is automatically shut down and its users are switched instantly to the other servers.
Categories of Clusters:
- Asymmetric Clusters. In asymmetric clusters, a standby server exists only to take over for another server in the event of failure. This type of cluster is usually used to provide high availability and scalability for read/write stores such as databases, messaging systems, and file and print services. If one of the nodes in a cluster becomes unavailable, due to either planned downtime for maintenance or unplanned downtime due to failure, another node takes over the function of the failed node.
- Symmetric Clusters. In symmetric clusters, every server in the cluster performs useful work. Typically, each server is the primary server for a particular set of applications. If one server fails, the remaining servers continue to process their assigned applications as well as the applications from the failed server. Symmetric clusters are more cost-effective because they use more of the cluster’s resources more often; however, in the event of a failure, the additional load on the remaining servers could cause them to fail as well. One common type of symmetric cluster is a load-balanced cluster.
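The failover behavior of a symmetric cluster can be illustrated with a minimal sketch. The node and application names are hypothetical, and handing each orphaned application to the least-loaded survivor is an assumption made for illustration; real cluster managers apply their own placement rules.

```python
def fail_over(assignments, failed):
    """Return a new node -> apps mapping in which the failed node's
    applications are spread across the surviving nodes."""
    survivors = sorted(n for n in assignments if n != failed)
    if not survivors:
        raise RuntimeError("no surviving node to take over")
    # Survivors keep the applications they already run.
    result = {n: list(assignments[n]) for n in survivors}
    for app in assignments.get(failed, []):
        # Hand each orphaned app to the survivor running the fewest apps.
        target = min(survivors, key=lambda n: len(result[n]))
        result[target].append(app)
    return result


# Hypothetical three-node symmetric cluster.
cluster = {
    "node-a": ["orders", "billing"],
    "node-b": ["search"],
    "node-c": ["mail"],
}
after = fail_over(cluster, "node-a")
```

After the call, every application still has a home, but each survivor carries more work than before, which is the extra load the symmetric design has to be sized for.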
High Availability
High availability means that your application will be available, without interruption. In the context of application clustering, it means that any given node (or combination of nodes) can be shut down, blown up, or simply disconnected from the network unexpectedly, and the rest of the cluster will continue operating cleanly as long as at least one node remains. It requires that nodes can be upgraded individually while the rest of the cluster operates, and that no disruption will result when a node rejoins the cluster. It typically also requires that nodes be installed in geographically separate locations. This type of clustering avoids loss of service to the users or applications that access the cluster and can occur transparently, without the users’ knowledge.
Not every application can run in a high-availability cluster environment, and the necessary design decisions need to be made early in the software design phase. In order to run in a high-availability cluster environment, an application must satisfy at least the following technical requirements, the last two of which are critical to its reliable function in a cluster, and are the most difficult to satisfy fully:
- There must be a relatively easy way to start, stop, force-stop, and check the status of the application. In practical terms, this means the application must have a command line interface or scripts to control the application, including support for multiple instances of the application.
- The application must be able to use shared storage (NAS/SAN).
- Most importantly, the application must store as much of its state on non-volatile shared storage as possible. Equally important is the ability to restart on another node at the last state before failure using the saved state from the shared storage.
- The application must not corrupt data if it crashes, or restarts from the saved state.
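The last two requirements can be sketched as follows: state is written to a temporary file and then renamed over the previous checkpoint, so a crash leaves either the old state or the new one, never a half-written file. The function names, the JSON format, and the file name are assumptions for illustration, and whether a rename is atomic on a particular NAS/SAN file system should be verified for that system.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Checkpoint state so that a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to stable storage
        os.replace(tmp_path, path)  # swap the new checkpoint into place
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_state(path, default=None):
    """Return the last checkpoint, or default if none was ever written."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


# A temp directory stands in for the shared NAS/SAN mount.
shared_dir = tempfile.mkdtemp()
state_file = os.path.join(shared_dir, "app_state.json")
save_state(state_file, {"last_order_id": 41})
save_state(state_file, {"last_order_id": 42})
restored = load_state(state_file)  # what a node would read on takeover
```

A node that takes over after a failure would call `load_state` on the shared path and resume from the most recent complete checkpoint.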
Scalability
Clustering is also used to enhance scalability. Server clusters can support more users at the current level of performance or improve application performance for the current number of users by sharing the workload across multiple servers. A byproduct of clustering servers for scalability is that the additional redundancy of the multiple servers helps increase system availability.
Server Affinity
Clustering uses server affinity to ensure that applications requiring the user to interact with the same server throughout a session reach the right server. This is most often used in applications that execute a multi-step process, for example order entry, in which the session stores information between requests (pages) that will be used to conclude a transaction, for example a shopping cart.
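One simple way to obtain affinity is to derive the server from the session identifier, so the same session always maps to the same server. This is a minimal sketch with hypothetical server names; hashing is only one possible technique, and real products may instead rely on cookies or a session table.

```python
import hashlib

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical server names


def server_for(session_id, servers=SERVERS):
    """Map a session id to one server, the same one on every call."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into an index into the list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


cart_server = server_for("session-1234")
checkout_server = server_for("session-1234")  # same session, same server
```

The mapping needs no stored table, but it shifts for many sessions whenever the server list changes, which is one reason a lookup table is often preferred.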
Benefits and liabilities of Server Clustering
Benefits:
Improved scalability. Server Clustering enables applications to handle more load.
Higher availability. Server Clustering helps applications avoid interruptions in service.
Greater flexibility. The ability of clustering to present a virtual unified computing resource provides IT personnel with more options for configuring the infrastructure to support application performance, availability, and scalability requirements.
Liabilities:
Increased infrastructure complexity. Some clustering designs significantly increase the complexity of your solution, which may affect operational and support requirements. For example, clustering can increase the numbers of servers to manage, storage devices to maintain, and network connections to configure and monitor.
Additional design and code requirements. Applications may require specific design and coding changes to function properly when used in an infrastructure that uses clustering. For example, the need to manage session state can become more difficult across multiple servers and could require coding changes to accommodate maintaining state so that session information is not lost if a server fails.
Incompatibility. An existing application or application component may not be able to support clustering technologies. For example, a limitation in the technology used to develop the application or component may not support clustering even through code changes.
Load Balancing Overview
Definition
Load balancing is a computer networking method for distributing workloads across multiple computing resources, such as computers, a computer cluster, network links, central processing units or disk drives. Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid overload of any one of the resources. Using multiple components with load balancing instead of a single component may increase reliability through redundancy.
Load balancing can happen without clustering when we have multiple independent servers that have the same setup but are otherwise unaware of each other. We can then use a load balancer to forward each request to one server or another, but one server does not use another server’s resources, and servers do not share state with one another.
Each load balancer performs the following tasks:
- Continuously check which servers are up.
- When a new request is received, send it to one of the servers as per the load balancing policy.
- When a request is received from a user who already has a session, send it to the same server. This is important because otherwise the user would keep bouncing between servers without being able to do any real work. This step is not required for serving static pages, since in that case there is no user session.
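The three tasks above can be sketched in a few lines. The class name, the `is_up` probe, and the server names are hypothetical, and the health check is run on each request here rather than continuously in the background, purely to keep the sketch short.

```python
import itertools


class LoadBalancer:
    def __init__(self, servers, is_up):
        self.servers = list(servers)
        self.is_up = is_up                 # health probe: server -> bool
        self.sessions = {}                 # session id -> pinned server
        self._counter = itertools.count()  # drives the round-robin policy

    def healthy(self):
        """Task 1: find out which servers are currently up."""
        return [s for s in self.servers if self.is_up(s)]

    def route(self, session_id=None):
        up = self.healthy()
        if not up:
            raise RuntimeError("no healthy servers")
        # Task 3: an existing session goes back to its server if it is up.
        pinned = self.sessions.get(session_id)
        if pinned in up:
            return pinned
        # Task 2: otherwise pick a server according to the policy.
        server = up[next(self._counter) % len(up)]
        if session_id is not None:
            self.sessions[session_id] = server
        return server


down = set()  # servers the hypothetical probe reports as failed
lb = LoadBalancer(["s1", "s2"], is_up=lambda s: s not in down)
first = lb.route("alice")
again = lb.route("alice")  # same session, so same server as `first`
down.add(first)            # simulate that server failing
moved = lb.route("alice")  # the session is re-pinned to a healthy server
```

When the pinned server fails, the request is rerouted, but nothing in this sketch carries the session's data over to the new server.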
Algorithms
Load balancers use different algorithms to control traffic. The goal of these algorithms is to intelligently distribute load and/or maximize the utilization of all servers within the cluster. Some examples of these algorithms include:
Round-robin. A round-robin algorithm distributes the load equally to each server, regardless of the current number of connections or the response time. Round-robin is suitable when the servers in the cluster have equal processing capabilities; otherwise, some servers may receive more requests than they can process while others are using only part of their resources.
Weighted round-robin. A weighted round-robin algorithm accounts for the different processing capabilities of each server. Administrators manually assign a performance weight to each server, and a scheduling sequence is automatically generated according to the server weight. Requests are then directed to the different servers according to a round-robin scheduling sequence.
Least-connection. A least-connection algorithm sends requests to servers in a cluster, based on which server is currently serving the fewest connections.
Load-based. A load-based algorithm sends requests to servers in a cluster, based on which server currently has the lowest load.
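The four policies can be compared with a minimal sketch. Server names, weights, connection counts, and load figures are made up for illustration, and the weighted sequence is the most naive one possible (each server simply repeated according to its weight); production balancers may interleave servers differently.

```python
import itertools


def round_robin(servers):
    """Yield servers in a fixed rotation, ignoring their current state."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """weights maps server -> positive integer weight.
    Each server appears in the rotation as many times as its weight."""
    sequence = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(sequence)


def least_connection(active):
    """active maps server -> open connection count; pick the smallest."""
    return min(active, key=active.get)


def load_based(load):
    """load maps server -> load metric (for example CPU use); pick the lowest."""
    return min(load, key=load.get)


rr = round_robin(["s1", "s2", "s3"])
rr_picks = [next(rr) for _ in range(4)]

wrr = weighted_round_robin({"big": 2, "small": 1})
wrr_picks = [next(wrr) for _ in range(6)]

lc_pick = least_connection({"s1": 12, "s2": 3, "s3": 7})
lb_pick = load_based({"s1": 0.9, "s2": 0.4, "s3": 0.2})
```

Least-connection and load-based selection share the same shape and differ only in the metric they minimize, whereas the two round-robin variants never look at server state at all.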
Benefits and liabilities of Load balancing
Benefits:
Improved scalability. Scalable load-balanced tiers enable the system to maintain acceptable performance levels while enhancing availability.
Higher availability. Load balancing enables you to take a server offline for maintenance without loss of application availability.
Potential cost savings. Multiple low-cost servers often provide cost savings over higher-cost multiprocessor systems.
Liabilities:
Development complexity. A load-balanced solution can be difficult to develop if the solution must maintain state for individual transactions or users.
Doesn’t account for network failure. If a server or network failure occurs during a client session, a new logon may be required to re-authenticate the client and to reestablish session state.
Conclusion
Clustering saves the user’s state and is more transparent to the user, but it is harder to set up and is very specific to the resource being clustered. Different application servers have different clustering protocols, which don’t necessarily work out of the box. Load balancing is comparatively painless and relatively independent of the application server.
From a user’s perspective, this means that if the server handling the user’s work goes down, the user observes different behavior depending on whether the system is clustered or only load balanced. If the system is clustered, the user may be able to continue the transaction and may not even realize that the server has gone down. If the system is load balanced without clustering, the user’s state will likely be lost, and the user will simply be sent to another server to restart the transaction.