Network Clustering with RedHat's Piranha
Clustering usually brings to mind dozens of servers working together to crack the latest unbreakable encryption, or to break the world's record for calculating the most digits of pi. Although clustering plays a very important role in academia and research, it can also be important in enterprise computing, and especially in enterprise network computing, where guaranteed bandwidth does not guarantee the quickest responses for network applications.
Network clustering distributes a load among servers using a predetermined method. The load distribution is unknown to the user's application and, as far as the user is concerned, he is communicating with one server. With the cluster, however, the user will be sharing each server with fewer users, thereby improving the user's experience with the site. This technology has become very popular for Web sites, where Web developers and marketing departments try to push the bottleneck of the Web experience as close to the user as possible while also avoiding congestion.
In its simplest form, a network cluster is a load balancer, as shown in the Figure "Load Balancer," which directs each packet in a round-robin fashion to each of the servers in the cluster. This type of clustering is easy to implement and easy to support. However, a load balancer relies on lower-level protocols for state information. With protocols like HTTP, where a user session actually spans multiple HTTP requests, such a load balancer cannot support dynamic, session-based content. Additionally, the Secure Sockets Layer (SSL), which is used to encrypt traffic (such as credit card transactions), requires a key exchange with the destination that cannot be shared with other servers.
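The round-robin idea can be sketched in a few lines of shell. This is only an illustration (the server addresses match the example network used later in this article, and rr_schedule is a made-up name); a real balancer does this per packet in the kernel:

```shell
# Hypothetical sketch: round-robin dispatch over three real servers.
# rr_schedule prints the server each of N successive requests would reach.
servers="10.1.1.10 10.1.1.11 10.1.1.12"

rr_schedule() {
  n=$1
  i=0
  while [ "$i" -lt "$n" ]; do
    # Select server number (i mod count) from the list.
    set -- $servers
    shift $(( i % $# ))
    echo "$1"
    i=$(( i + 1 ))
  done
}

rr_schedule 4   # cycles through all three servers, then wraps around
```

Note that the rotation carries no state at all: request four lands back on the first server regardless of what happened to requests one through three, which is exactly why sessions and SSL key exchanges break.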
To support this requirement, the load balancer needs to maintain a state table to ensure that a specific user's session is always sent to the same server. This is often called "sticky" session support by cluster software developers, since each session sticks to the same server. (Support for sticky sessions brings its own design considerations, such as further skewing round robin's less-than-ideal balancing.)
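A state table of this kind is conceptually just a client-to-server map consulted before the balancing algorithm runs. The sketch below is hypothetical (the function names and the flat-file table are inventions for illustration; real balancers keep this in memory with timeouts):

```shell
# Hypothetical sketch of a "sticky" session table: record the first server
# chosen for a client and reuse it for that client's later requests.
TABLE=$(mktemp)

sticky_lookup() { awk -v c="$1" '$1 == c { print $2; exit }' "$TABLE"; }
sticky_record() { printf '%s %s\n' "$1" "$2" >> "$TABLE"; }

# pick_server CLIENT_IP RR_CHOICE -> server to use
pick_server() {
  srv=$(sticky_lookup "$1")
  if [ -z "$srv" ]; then
    srv=$2                  # no entry yet: accept the round-robin choice
    sticky_record "$1" "$srv"
  fi
  echo "$srv"
}

pick_server 203.0.113.9 10.1.1.10   # first request: takes the rr choice
pick_server 203.0.113.9 10.1.1.11   # later request: still 10.1.1.10
```

The skew mentioned above is visible here: once a busy client sticks to one server, round robin keeps handing that server new clients at the same rate as the idle ones.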
Additionally, a simple load balancer does not provide the resiliency expected of a cluster solution. Although an algorithm (protocol) can be supported to determine when one of the servers fails, the load balancer itself is a single point of failure within the cluster.
Many current network clustering solutions are based upon Ethernet transport. Ethernet has proven itself as a LAN protocol and most enterprises have in house know-how and extensive experience with the technology.
Early clustering and some proprietary clustering solutions require a proprietary physical connection between the load balancer and the servers. The typical advantage of a proprietary connection is a quick recovery time in the case of failure; however, this is often overkill in intranet and Internet network solutions, except in those cases where systems are only allowed minutes per year of down time. Systems with proprietary physical connections also often use proprietary protocols that are closely integrated with their choice of physical networking.
The load balancer need not have a dedicated physical connection to the servers in the cluster; it can also be a logical system as opposed to a physical one. For example, all the nodes in the cluster may themselves act as the balancer, using proprietary protocols or creative designs such as virtual layer-2 or layer-3 addresses built on open protocols.
Network clusters, unlike larger application clusters, often do not include additional distributed system services, such as a distributed database. This often means database support, when needed, must be designed specifically for the solution. For example, updates to a database could be made to a central server via a network connection, or a local database could be updated with a batch process that replicates changes at a scheduled time. (Security is beyond the scope of this article, but security design should play a large role in determining how the data should be replicated, especially on any systems connected to the Internet.)
To ensure reliability, the nodes in the cluster must communicate their statuses to each other. The method used can either be a simple ping-like command to ensure network connectivity (a very weak method), or it can actually connect with the servers to ensure the particular application is answering. Finally, the load balancer could remotely connect to the nodes much like an administrator does, to ensure certain processes are running.
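The weaker two of those monitoring levels can be sketched as shell functions. This is illustrative only: check_tcp relies on bash's /dev/tcp redirection (it will not work in other shells), and the function names are inventions:

```shell
# Hypothetical monitoring checks, from weakest to strongest.

# Level 1: network reachability only. The host may answer pings while the
# service daemon is dead.
check_ping() { ping -c 1 -w 2 "$1" >/dev/null 2>&1; }

# Level 2: something is accepting connections on the service port.
# Uses bash's built-in /dev/tcp; returns nonzero if the connect fails.
check_tcp() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Level 3 (strongest) would speak the application protocol itself, sending
# a request and requiring a sensible reply, much as Piranha's send and
# expect strings do, or log in remotely and verify processes are running.

# Example usage: check_tcp 10.1.1.10 80 && echo "web server answering"
```

The ping check is the "very weak method" noted above; the connect check catches a dead daemon but not a wedged one, which is why the application-level check matters.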
Nodes participating in a cluster will have a means of restarting processes and notifying the administrator whenever there is a failure (something all production systems should have). Unlike a normal production server, a node that fails repeatedly should ideally be removed from the cluster, since such "flapping" can cause additional interruptions for the user as the client's TCP stack tries to compensate with retransmissions for the missing acknowledgements. To the user this looks like a slow-responding server, whereas an early failover is less likely to be noticed.
RedHat's Piranha (http://www.redhat.com/apps/community/cs_piranha.html) is a network clustering solution based upon the Linux Virtual Server (LVS) project (http://www.LinuxVirtualServer.org/) with an added GUI configuration tool. The GUI configuration tool is being phased out in favor of an HTML front end, which is covered in this article. At a minimum, Piranha is the load balancer portion of a cluster solution, and the servers providing content can be based upon any network operating system. Of course, high availability implies the use of UNIX systems. Piranha will monitor specified TCP ports with a connection to determine whether the service is available. A simple connection may not accurately determine that the service is running, since the server port may not have been closed properly if the daemon crashed. Therefore, Piranha supports simple protocol interaction with send and expect strings to ensure the service is responding. (Piranha uses this feature on Web services automatically.)
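As a rough illustration, send and expect strings appear in a virtual-server stanza of /etc/lvs.cf along these lines. Treat the exact field names and syntax as illustrative and verify them against the sample.cf shipped with Piranha:

```
virtual www {
     address = 192.168.1.15 eth1:1
     port = 80
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP"
     scheduler = rr
     protocol = tcp
}
```

The idea is that the monitor opens the port, writes the send string, and only considers the service healthy if the reply contains the expect string.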
The Figure "Piranha Cluster Example" shows the topology of a Piranha cluster.
The LVS terminology uses LVS routers to refer to the two servers that make up the load balancer. The servers providing content to the client are referred to as real servers. The LVS routers share a public IP address referred to as the virtual IP address (VIP) and a private IP address referred to as the NAT router address. Piranha will actually support a single LVS router configuration, which still provides the performance benefit of load balancing among multiple real servers, but it does not provide the redundancy and high availability often needed in a network cluster.
Of the two LVS routers, only one is active at a time. The one accepting requests is referred to as the primary LVS router and the non-active one as the backup LVS router. Each LVS router monitors the other by listening for the heartbeat sent out by the pulse daemon running on the other system. Should the backup LVS router not receive a heartbeat message within its configured time frame (the "dead after" interval specified in /etc/lvs.cf), it will bring up the VIP on its own interface, start answering client requests, and direct them to the correct real server. Additionally, multiple ARP replies will be sent at this time.
Note that sending out ARP replies to force a router to expire a cached ARP entry and to use the new one will not work with all routers and hosts; therefore, with the current implementation of Piranha, tuning the ARP cache on the LVS servers' gateway to a small value may be necessary to ensure a quicker cut over to the new layer-2 MAC address. A cleaner solution is to provide a virtual MAC address that could be shared between the LVS servers; thus a failover could be immediate, since the gateway would always have the proper MAC address (virtual or shared address) and the current primary server would immediately be able to see traffic destined for the virtual MAC address.
As mentioned, the pulse daemon is responsible for the heartbeat between the two LVS routers. The nanny daemon runs on the primary LVS router and is responsible for monitoring the real servers to ensure they can accept requests. Should the primary server determine a real server is down, it will stop sending requests to that server.
Network Address Translation and Distribution Methods
As mentioned, the client is not aware that it is communicating with a cluster server, and functions as though it were communicating with the VIP shared between the LVS routers. The LVS routers then replace the destination VIP address with the IP address of the real server and they ship the packet out of the appropriate interface. This translation is referred to as Network Address Translation (NAT).
The LVS routers determine which real server to route the packet to based upon the methodology the administrator has chosen. Piranha currently supports four methods: round robin, least-connections, weighted least-connections, and weighted round robin.
Round robin, as mentioned, simply distributes the jobs equally among the real servers. With sticky session support, an even distribution becomes less likely.
With least-connections, the LVS router consults its IP virtual server (IPVS) table to determine which server has the fewest active connections and sends the request to that particular real server.
Weighted least-connections also sends new connections to the server with the fewest active connections, but it factors in an administrator-assigned weight, so servers with higher capacity (higher weight) are chosen proportionally more often.
Weighted round robin will distribute more jobs to servers with higher capacity based upon an administrator-assigned weight.
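The weighted least-connections decision reduces to picking the server with the lowest connections-to-weight ratio. The sketch below is a simplification (real IPVS also accounts for inactive connections and avoids division); the input format "server connections weight" is an invention for illustration:

```shell
# Hypothetical sketch of weighted least-connections. Each input line is
# "server active_connections weight"; the lowest connections/weight wins.
pick_wlc() {
  awk '{ s = $2 / $3
         if (best == "" || s < min) { best = $1; min = s } }
       END { print best }'
}

printf '%s\n' \
  "10.1.1.10 12 1" \
  "10.1.1.11  8 1" \
  "10.1.1.12 30 3" | pick_wlc
```

Here 10.1.1.12 carries 30 connections but, at weight 3, its ratio of 10 still loses to 10.1.1.11's ratio of 8, so 10.1.1.11 takes the next request. Plain least-connections is the same calculation with every weight set to 1.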
Piranha requires RedHat version 6.1 or higher, which meets the LVS requirement of kernel version 2.2.13 or higher. LVS does not require the RedHat version of Linux, but Piranha has been developed for and by RedHat, so this article will focus on its installation and configuration.
The system will require enough disk space for RedHat Linux, as well as some common UNIX daemons like Sendmail and Apache. Although it is not required, ideally the two systems that make up the LVS routers should be identical. The two or more systems that are real servers should also be identical. With the decrease in costs for Intel systems, many organizations will tend to make all the systems (both LVS routers and real servers) the same hardware. Although the large disk drives typical of Internet servers are overkill for the LVS routers, a standard server across the cluster makes upgrades, patches, and failed hardware replacement much easier.
The specific requirements for the LVS routers are basically any processor that can run RedHat with large enough storage for the operating system and typical UNIX daemons, including Apache, and enough RAM to allow the system to run without too much swapping.
Before the hardware closet is opened and those 386s are turned into a Piranha cluster, keep in mind that the hardware used should provide high performance and availability. In many cases, a 386- or 486-class machine cannot keep pace with a single server-class machine on network requests, even though the real server has to access its hard drive to serve content. The likelihood that a 386 or 486 can keep up with multiple real servers is even lower. Server-class machines are designed to avoid congestion, especially network and disk drive congestion. A good cluster will include at least two high-performance network servers designed for quick network access and routing. Therefore, in addition to a quick processor, quality Linux-supported network cards and plenty of RAM are necessary to route packets as quickly as possible without touching the hard disks to do so. Quality servers will also provide upgrade paths for additional RAM and disk space.
Each LVS router will require two network interface cards. One will be connected to the network that "points" towards the Internet (the network egress), and the other to the network pointing towards the real servers. The cards will typically be Ethernet, ideally 100-Mbit full duplex. The best industry Ethernet practices for optimized, low-latency, low-density networks should be used to connect the LVS routers to both networks. (Both sides of the LVS routers will typically have very few nodes and high-performance Ethernet switches, to avoid creating a performance bottleneck.)
Depending on the security model in place, a redesign or upgrade may be necessary to avoid additional delay at the firewall. The basic idea is to ensure that everything is designed so that the Internet edge router is the bottleneck between the Internet and the real servers.
The installation procedures covered herein are based upon a new install of RedHat 6.2. The configuration will include setting up Piranha on the two LVS routers and configuring for round robin. Three additional servers running Web services on port 80 are used, but their configuration is not covered. We will be using round robin load balancing, so the operating system and Web servers are not relevant for the configuration examples, as long as the Web servers are configured and running.
When installing RedHat, choose the method you prefer, but be sure to include the following packages. A custom install will ensure you get all these packages on the first install.
X Window System
The X Window System and KDE are not actually required, but installing them will allow you to easily manage Piranha locally, which is more important when learning the system than when running a production system.
By installing the clustering package, you are installing the piranha-0.4.12-1.i386.rpm, piranha-docs-0.4.12-1.i386.rpm, and piranha-gui-0.4.12-1.i386.rpm packages, which make up the complete Piranha distribution.
Be sure to use the same installation options on both LVS routers to ease administration and troubleshooting, should problems arise.
While the files are being copied from the CDROM or FTP server, you can investigate and determine the appropriate IP addresses, netmasks, and gateways. The 10.1.1.0/24 network will be used for our internal network and 192.168.1.0/24 for the external network, as shown in the Piranha Cluster Example. The instructions in this article will use the addresses within these networks, so I'll briefly cover them:
Network 192.168.1.0/24 = Public / external network.
Network 10.1.1.0/24 = Server Network.
192.168.1.1 = IP address of the Ethernet interface eth1 on "lefty."
192.168.1.2 = IP address of the Ethernet interface eth1 on "righty."
192.168.1.3 = IP address of Internet router.
192.168.1.15 = Virtual IP Address (VIP).
This address is shared among the LVS routers and is the destination address of all external traffic that wishes to communicate with our public services.
10.1.1.1 = IP address of the Ethernet interface eth0 on "lefty."
10.1.1.2 = IP address of the Ethernet interface eth0 on "righty."
10.1.1.10 = IP address of "number1" real server.
10.1.1.11 = IP address of "number2" real server.
10.1.1.12 = IP address of "number3" real server.
10.1.1.15 = NAT Router IP.
This virtual address is shared among the LVS routers on the server network and is the destination address for replies from the real servers.
During the installation routine, configure the network interface cards on both LVS routers and complete the installation. Assuming all is up and running properly, all interfaces should respond to pings. Assuming routing is properly configured, and if the security policy allows it, you should be able to ping the external interface. (If the 192.168.1.0/24 addresses are used, as in our example, they will not be reachable via the Internet because they are addresses defined in RFC 1918, and therefore not routed on the Internet.)
After the installation is complete and the system has rebooted, the system needs to be updated to the current version of Piranha in order to fix a security bug in version 0.4.12-1 distributed with RedHat 6.2 (see http://www.redhat.com/support/errata/RHSA-2000014-16.html).
Download the following files from ftp://updates.redhat.com/6.2/i386/ or one of the RedHat mirror sites:
With version 0.4.14-1 of Piranha downloaded, remove the old version as user root:
rpm -e piranha-docs-0.4.12-1
rpm -e piranha-gui-0.4.12-1
rpm -e piranha-0.4.12-1
and install the new version:
rpm -ivh piranha-docs-0.4.14-1.i386.rpm
rpm -ivh piranha-0.4.14-1.i386.rpm
rpm -ivh piranha-gui-0.4.14-1.i386.rpm
Activate routing on both hosts by editing the /etc/sysconfig/network file and changing the FORWARD_IPV4="no" to FORWARD_IPV4="yes".
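The edit can be sketched as follows. The sed command here operates on a scratch copy so the change can be verified; on the real routers, edit /etc/sysconfig/network itself. The /proc line is how the same setting is toggled at runtime on a 2.2 kernel:

```shell
# Sketch: flip FORWARD_IPV4 from "no" to "yes" in a copy of the file.
f=$(mktemp)
echo 'FORWARD_IPV4="no"' > "$f"
sed 's/^FORWARD_IPV4="no"/FORWARD_IPV4="yes"/' "$f" > "$f.new" && mv "$f.new" "$f"
cat "$f"    # FORWARD_IPV4="yes"

# The sysconfig setting is read at boot; to activate forwarding immediately
# without a reboot, as root:
#   echo 1 > /proc/sys/net/ipv4/ip_forward
```

Without forwarding enabled, the LVS routers will accept packets for the VIP but never pass the translated packets on to the real servers.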
Once the install is complete, take a look at the Piranha documentation located in /usr/docs/piranha-docs-0.4.14. Pay special attention to the README for any changes. There is also a howto.html and a sample.cf configuration file worth looking over.
Before administering the LVS routers, you need to set the Piranha administrator password on both of the LVS routers with:
where password is the password assigned to the Web user piranha that will be used to manage the Piranha cluster via a Web interface.
You can now manage the LVS routers with a Web browser. To configure lefty, point the Web browser to http://10.1.1.1/piranha/. If name service is configured, the host name can be used instead.
After entering the user name piranha and the password you provided above, you will see the screen as shown in Figure "Piranha Configuration Tool."
The Configuration Tool is made up of four areas: control/monitoring (which is also the welcome screen), global settings, redundancy, and virtual services. Each of the four screens will be covered as we go through the configuration.
The control/monitoring section shows the status of the LVS daemon and whether the screen should refresh and, if so, how often in seconds. While learning the system, auto update is usually unnecessary because you'll typically hit reload when you wish to refresh. If automatic refresh is wanted, click the auto update button and type in the refresh interval in seconds. The control/monitoring screen also shows the LVS routing table and the LVS processes running.
Now click on Global Settings in the Control/Monitoring screen. Ensure that the Primary LVS server IP is set to 192.168.1.1, which is the public address of lefty. Ensure the LVS type is lvs, not fos.
Change the network type to NAT. Using NAT, Piranha will change the destination address to that of the appropriate real server and route the packets to the server. By activating NAT, Piranha will prompt you to enter the NAT Router IP, which is the virtual IP address shared between the LVS routers on the private network. Enter the NAT Router IP address: 10.1.1.15. Piranha also asks for the interface on which to listen for requests to the NAT router IP address; enter eth0:1.
Next, choose the sync tool to be used by Piranha. This determines the method the Piranha LVS servers use to synchronize their lvs.cf configuration files. rsh is supported directly by Linux and is fine for testing Piranha; however, rshd has had its share of security problems in the past. (To be fair, many of those problems stem from misconfigured systems.) Therefore, ssh is a better choice in many environments. For simplicity, choose rsh.
The Global Settings with the appropriate entries are shown in the Figure "Global Settings."
Click on Accept to activate the changes.
At this point, activate redundancy by clicking on the REDUNDANCY link. Click on Enable, which will cause Piranha to prompt for the Redundant server. Enter the external address of righty: 192.168.1.2. The heartbeat, dead after, and heartbeat port can be left at 6 seconds, 18 seconds, and 539. The REDUNDANCY settings are shown in the Figure "Redundancy Settings."
Click on Accept to activate the changes.
Finally, the virtual servers can be configured using the VIRTUAL SERVERS link. Virtual servers are the services the LVS routers should listen for and route to an appropriate real server. Without the appropriate virtual server, the connection will never be translated to the real server's address and forwarded.
There will be an existing entry that can be edited using the edit button. Enter the name of the server, which can be the host name or another internal name for Piranha's use. The application port will be the port on which the service runs; for Web access we'll use the standard port 80. Enter the VIP: 192.168.1.15. The device must also be entered; this will be an Ethernet alias, so we'll use eth1:1. Change the scheduling to round robin. We do not need persistence for this tutorial, but for "sticky" connections, enter the number of seconds a connection should "stick" to the real server it is assigned to, for example, 360 seconds.
The settings for the VIRTUAL SERVERS => VIRTUAL SERVER are shown in the "Virtual Server Settings" Figure. Click Accept.
After the LVS router lefty is configured to be a virtual server, the real servers that will accept these requests must be entered. Click on the REAL SERVER link within the VIRTUAL SERVERS screen. Click Edit to edit the default first entry.
In the name field, type number1. In the address field, type 10.1.1.10. The weight can be left at one, since round robin is being used (Figure "Real Server Entry"). Click on Accept. Create entries for number2 (10.1.1.11) and number3 (10.1.1.12). Click on REAL SERVER and activate each of the real servers by selecting its radio button and clicking the activate button.
Once Piranha recognizes the real servers by indicating "up" in the status field, the virtual server can be activated. Click on the Virtual Servers (not Virtual Server), and click on the (DE)ACTIVATE button. The Virtual Server status will change to "up."
At this point righty needs to be configured. Many new Piranha users are tempted to reenter the information on righty; however, this is prone to user error. Simply copy lefty's /etc/lvs.cf file to righty. Cut and paste is the easiest way the first time through.
We selected rsh as the method Piranha uses to synchronize the files; however, the systems are not yet set up to allow rsh, and should probably be left that way. The files can always be synchronized manually when changes are made, until ssh is installed.
Now the LVS routers can be started properly. On both lefty and righty, enter:
From outside the network or on the external network, generate traffic to the virtual server by pointing a browser at http://192.168.1.15/. You should get the index.html page. Monitor the log files of the real servers to ensure that the round robin is working. (On a Linux server with Apache, the log files are under /var/log/httpd/.)
When configuring Piranha, it may appear that the box has simply crashed; however, this is often caused by forgetting to include the virtual interface in an interface field, for example, entering eth0 when eth0:1 is the correct entry. When the lvs daemon kicks in, the virtual interface will suddenly appear on the interface, and you can often then telnet to the new IP address, assuming the correct address was entered.
Use the ifconfig command when in doubt. A sniffer will also come in handy to see the traffic and to ensure that NAT is working properly. If getting multiple servers to work together seems too difficult, start off with one LVS router and one real server. Ensure that both systems function in a normal environment: the systems should respond to pings and the network daemons should be answering requests. Everything must be in order before placing the systems in the cluster, where troubleshooting can be tough, since you are dealing with at least two IP addresses, two virtual addresses (the VIP and the NAT router IP), NAT itself, and a new product all at once.
This article has given an overview of RedHat's Piranha network clustering solution. Although we've only covered a NAT-based, round robin solution, the information provided will get you started in the right direction with Linux clustering and provide the basis for getting hands-on experience with clustering. Piranha's Web-based front end makes the experience much easier than working with the configuration files. Enjoy.
Ronald McCarty received his bachelor's degree in Computer and Information Systems at the University of Maryland's international campus at Schwaebisch Gmuend, Germany. After completing his degree, Ronald McCarty started his network career as network administrator at the Schwaebisch Gmuend campus. Ronald McCarty works for Lucent Technologies as a senior systems engineer on a customer team responsible for a major telecommunications carrier. He spends his free time with his two best friends in the world: his daughter, Janice, and his wife, Claudia. Ron can be reached at email@example.com