Analysis of VRRPv2 - Issues and Solutions

By Larry Pingree

Personal Website: http://www.geek-guy.com

Consulting Services: http://www.siegeworks.com
 
 

Please send any technical corrections to geekguy@justgeek.com


 

VRRP, the "Virtual Router Redundancy Protocol", is a fairly new technology developed by members of the Internet Engineering Task Force in 1998. This paper describes the shortcomings of VRRPv2 as implemented on the Nokia platform, and how VRRP Monitored Circuit fixes these issues. I'll also talk about how VRRP works so that you can better understand the technology.

The Request for Comments (RFC) that defines VRRPv2 is RFC 2338.
 

The Goal of VRRPv2

    The goal of VRRP is essentially to slide an IP address from one gateway machine to another in the event of a hardware failure. This lets us achieve much better uptime than we would have without this technology. Imagine that you have well over 150 PCs on a network, all using one router as their gateway, and that router fails. In the past, the only things you could do were replace the router or the failed network interface, or change every machine's gateway to a different router. NOT! Let's get serious now: even if you have a repairman who can come onsite within two hours, are the replacement parts available right then? Not always, so in comes VRRP.
 

A VRRP Packet

Let's start with the basics. Essentially, a VRRP router announces its existence to the multicast address 224.0.0.18, saying "Hello" to the other routers within its broadcast domain. By default this announcement occurs every 1 second. If we look at one of these packets we see the following:

11:38:59.000196 192.168.1.1 > 224.0.0.18: VRRPv2-adver 20: vrid 2 pri 255

Notice that the 192.168.1.1 box is announcing that it is running VRRPv2 and that this is an advertisement packet. The next thing we see within the packet is that it advertises a VRID. A VRID (Virtual Router Identifier) identifies the participating nodes within a VRRP configuration. We'll talk more about this later, as it changes a bit from VRRPv2 to Monitored Circuits.
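To make the fields in that line easier to pick out, here is a minimal Python sketch that splits a tcpdump summary line of exactly the shape shown above into named fields. The regular expression and the function name parse_advert are my own illustration, not part of tcpdump, and other tcpdump versions may print VRRP differently. Treating the "20" as a packet length is also an assumption.

```python
import re

# Matches only the one-line summary format shown in this paper (assumption:
# other tcpdump versions or verbosity levels will not match this pattern).
ADVERT_RE = re.compile(
    r"^(?P<time>\S+) "
    r"(?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"VRRPv(?P<version>\d+)-adver (?P<length>\d+): "
    r"vrid (?P<vrid>\d+) pri (?P<priority>\d+)$"
)


def parse_advert(line):
    """Return a dict of fields for one advertisement line, or None if it does not match."""
    match = ADVERT_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from text to integers.
    for key in ("version", "length", "vrid", "priority"):
        fields[key] = int(fields[key])
    return fields


advert = parse_advert(
    "11:38:59.000196 192.168.1.1 > 224.0.0.18: VRRPv2-adver 20: vrid 2 pri 255"
)
```

Running this on the line above yields the sender 192.168.1.1, the multicast destination 224.0.0.18, VRID 2 and priority 255 as separate values.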
 

Priority

The next thing we see is that this machine's priority is set to 255. In VRRP, the priority functions as an election mechanism. The priority value for the VRRP router that owns the IP address(es) associated with the virtual router MUST be 255 (decimal). VRRPv2 routers backing up a virtual router MUST use priority values between 1-254 (decimal). The default priority value for VRRP routers backing up a virtual router is 100 (decimal). The priority value zero (0) has special meaning indicating that the current Master has stopped participating in VRRP. This is used to trigger Backup routers to quickly transition to Master without having to wait for the current Master to time-out.
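The priority rules above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the tie-break on the higher IP address is taken from RFC 2338 rather than from anything observed on the Nokia platform.

```python
import ipaddress


def classify_priority(priority):
    """Name the role implied by a VRRPv2 priority value."""
    if priority == 255:
        return "owner"
    if priority == 0:
        return "master resigning"
    if 1 <= priority <= 254:
        return "backup"
    raise ValueError("priority must be in the range 0-255")


def elect_master(candidates):
    """Pick the master from a list of (ip_address, priority) pairs.

    The highest priority wins. On equal priority the higher IP address
    wins (per RFC 2338). Priority 0 means the router has stopped
    participating, so it is never elected.
    """
    live = [(prio, ipaddress.IPv4Address(ip)) for ip, prio in candidates if prio > 0]
    if not live:
        return None
    _, winner = max(live)
    return str(winner)
```

For example, elect_master([("192.168.254.1", 255), ("192.168.254.50", 100)]) picks 192.168.254.1, the address owner. Remove that entry and 192.168.254.50 wins.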
 

Time to Live

The TTL of the packet MUST be set to 255. A VRRP router receiving a packet with the TTL not equal to 255 MUST discard the packet.
 

The Protocol Number

The IP protocol number assigned by the IANA for VRRP is 112 (decimal).
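Putting the last few sections together, the following Python sketch builds the VRRP portion of an advertisement using the field layout from RFC 2338 and applies the TTL and protocol number checks described above. It is not how the Nokia implementation does it. The function names are hypothetical, no authentication is used (Auth Type 0), and nothing is actually sent on the wire.

```python
import socket
import struct

VRRP_PROTOCOL = 112  # IP protocol number assigned by the IANA
VRRP_TTL = 255       # required TTL on every VRRP packet


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_advert(vrid, priority, addresses, adver_int=1):
    """Build a VRRPv2 advertisement (version 2, type 1) without an IP header."""
    ver_type = (2 << 4) | 1
    # Header fields: version/type, VRID, priority, address count,
    # auth type (0 = none), advertisement interval, checksum placeholder.
    header = struct.pack("!BBBBBBH", ver_type, vrid, priority,
                         len(addresses), 0, adver_int, 0)
    # The IP addresses are followed by 8 bytes of authentication data (zeroed).
    body = b"".join(socket.inet_aton(a) for a in addresses) + bytes(8)
    checksum = internet_checksum(header + body)
    return header[:6] + struct.pack("!H", checksum) + body


def accept_packet(ip_ttl, ip_protocol, payload):
    """Apply the receive checks: protocol 112, TTL 255, valid checksum."""
    return (ip_protocol == VRRP_PROTOCOL
            and ip_ttl == VRRP_TTL
            and internet_checksum(payload) == 0)


packet = build_advert(vrid=2, priority=255, addresses=["192.168.1.1"])
```

With a single IP address the message comes out at 20 bytes, which appears to be the "20" printed in the tcpdump lines in this paper.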
 

My Configuration is as follows:



HOST A

IP address 192.168.254.1

VRID 1

Priority (Default)

Backing Up VRID 2 with address 192.168.254.50
 

HOST B

IP address 192.168.254.50

VRID 2

Priority (Default)

Backing Up VRID 1 with address 192.168.254.1
 
 

Let's say that you've completed a VRRP setup using the above configuration and you want to know how it looks from the perspective of the packets transmitted on the wire.
 
 

Essentially you will have two machines, each announcing the IP addresses it owns, and the packets will look like this:

Host A's Communication will look like this:

12:16:20.040201 192.168.254.1 > 224.0.0.18: VRRPv2-adver 20: vrid 1 pri 255
12:16:20.420209 192.168.254.1 > 224.0.0.18: VRRPv2-adver 20: vrid 2 pri 100

Host B's Communication will look like this:

12:16:20.040201 192.168.254.50 > 224.0.0.18: VRRPv2-adver 20: vrid 2 pri 255
12:16:20.420209 192.168.254.50 > 224.0.0.18: VRRPv2-adver 20: vrid 1 pri 100

As you can see above, the primary address for VRID 1 is 192.168.254.1 and the primary address for VRID 2 is 192.168.254.50, and each host is backing up the other's address. How can you tell, you ask? Well, the router with address 192.168.254.1 says he is the master of his address by sending a priority of 255 for his own VRID. Please keep in mind that HOST B is watching this traffic and making sure that HOST A keeps sending this priority 255.
 
 

Failure on Host A Occurs - OUCH!

While HOST B watches the wire, he sees that HOST A is no longer announcing the priority 255. Since there are no other backup hosts for HOST A with a priority higher than 100, HOST B takes over the address 192.168.254.1 and sends out a gratuitous ARP that essentially tells all hosts to update their ARP cache entry for the IP address 192.168.254.1. Thus, all the hosts on the subnet will now send their packets to HOST B. The way this looks on the Ethernet:

12:16:20.040201 192.168.254.50 > 224.0.0.18: VRRPv2-adver 20: vrid 2 pri 255
12:16:20.420209 192.168.254.50 > 224.0.0.18: VRRPv2-adver 20: vrid 1 pri 100

Well, it looks quite the same as before, but we are now missing the advertisement from HOST A.
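How long does HOST B wait before deciding that HOST A is gone? RFC 2338 defines a Master_Down_Interval for this, and the short Python sketch below works through that formula. The function names are my own, and the Nokia implementation may use different timers, so treat the numbers as the RFC defaults only.

```python
def skew_time(priority):
    """Skew_Time in seconds, per RFC 2338: (256 - Priority) / 256."""
    return (256 - priority) / 256.0


def master_down_interval(priority, advertisement_interval=1):
    """Seconds a Backup waits without hearing the Master before taking over.

    Per RFC 2338: (3 * Advertisement_Interval) + Skew_Time.
    """
    return 3 * advertisement_interval + skew_time(priority)


# HOST B backs up VRID 1 with the default priority of 100.
host_b_wait = master_down_interval(priority=100)
```

With the default 1 second advertisement interval and a priority of 100, this comes out to a little over 3.6 seconds. A backup with a higher priority gets a smaller skew, so it times out first and takes over ahead of lower-priority backups.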
 

Host A comes back up! - YAY!

Host A then begins sending out his packets again, just as before:

12:16:20.040201 192.168.254.1 > 224.0.0.18: VRRPv2-adver 20: vrid 1 pri 255
12:16:20.420209 192.168.254.1 > 224.0.0.18: VRRPv2-adver 20: vrid 2 pri 100
 

That's it, VRRPv2. Now on to the issues with VRRPv2.
 
 

Issues with VRRPv2

OK, so here is the scenario: we have HOST A and HOST B, and let's say that both the internal and external interfaces on each host are running VRRP, with the goal of providing fail-over for both incoming and outgoing packets. Whoa, never thought it could be this complex, eh? Just watch!

So now we have the following configuration:

HOST A

External Interface

IP address 10.1.1.1

VRID 1

Priority (Default)

Backing Up VRID 2 with address 10.1.1.50
 
 

Internal Interface

IP address 192.168.254.1

VRID 3

Priority (Default)

Backing Up VRID 4 with address 192.168.254.50
 
 

HOST B

External Interface

IP address 10.1.1.50

VRID 2

Priority (Default)

Backing Up VRID 1 with address 10.1.1.1
 
 

Internal Interface

IP address 192.168.254.50

VRID 4

Priority (Default)

Backing Up VRID 3 with address 192.168.254.1
 
 

Let's say that we have VRRP fully configured on our internal and external interfaces. Now let's try failing HOST A's external interface.
 
 

External Interface Failure HOST A


 
 

What happens now is that the external VRRP address gets switched over to HOST B, while the internal address stays on HOST A. Machines on our internal wire that use HOST A as their gateway can no longer route out through HOST A's external interface. In comes OSPF.
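Before moving on to OSPF, the split state described above can be sketched as a toy model in Python. Each VRID elects its master on its own, with no knowledge of the host's other interfaces. The data layout and function name are hypothetical, and the priorities (255 for the owner, 100 for the backup) are taken from the earlier packet captures.

```python
def elect_masters(virtual_routers, failed_interfaces):
    """Return {vrid: master host} with each VRID elected independently."""
    masters = {}
    for vrid, members in virtual_routers.items():
        # Only members whose interface is still up can advertise.
        alive = [(priority, host) for host, interface, priority in members
                 if (host, interface) not in failed_interfaces]
        masters[vrid] = max(alive)[1] if alive else None
    return masters


# Only the two addresses owned by HOST A are modeled here.
virtual_routers = {
    1: [("HOST A", "external", 255), ("HOST B", "external", 100)],  # 10.1.1.1
    3: [("HOST A", "internal", 255), ("HOST B", "internal", 100)],  # 192.168.254.1
}

before = elect_masters(virtual_routers, set())
after = elect_masters(virtual_routers, {("HOST A", "external")})
```

After the failure, VRID 1 has moved to HOST B but VRID 3 is still mastered by HOST A. Internal machines keep sending their traffic to a box that has lost its way out.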
 
 

Now the only solution is to implement OSPF, so that the external route is withdrawn on HOST A when the interface fails. Cost-based route selection will then send traffic over the link between HOST A and HOST B, and this works fine when the Nokia is used as a router. EXCEPT WHILE RUNNING A NAT FIREWALL!
 
 

The Problem in NAT Environments:

The primary issue here is that a packet comes in through HOST A's internal interface, routes over the so-called "SYNC LINK", traverses HOST B, and leaves for the Internet gateway. Well, let's take a closer look. Typically we would NAT our internal machines behind the firewall's external interface, right? Well, if you do this, packets get routed through the SYNC link, their source address becomes the external IP address of HOST A, and they are then routed through the second firewall and sent out to the Internet.
 
 

By the time the reply packet comes back (20 milliseconds on average), HOST A will probably not have sent a SYNC update (which occurs every 50 milliseconds) to HOST B, so HOST B will not have the NAT state table entry needed to translate the packet back to the original requester. As a result, some connections cease to function and serious network delays occur.
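The timing race can be illustrated with a small Python model using the two figures above (a 20 millisecond reply time and a 50 millisecond sync interval). This is a deliberate simplification, not a description of the firewall's actual sync mechanism. It assumes sync updates go out on a fixed 50 millisecond tick and arrive instantly, and the function name is hypothetical.

```python
def reply_translatable(created_at_ms, reply_rtt_ms=20, sync_interval_ms=50):
    """True if the peer learns the NAT entry before the reply packet arrives.

    Assumes sync updates are sent at fixed ticks (50, 100, 150 ms, ...)
    and are delivered to the peer instantly.
    """
    next_sync_ms = (created_at_ms // sync_interval_ms + 1) * sync_interval_ms
    return next_sync_ms <= created_at_ms + reply_rtt_ms


# Try a connection opened at every millisecond within one sync interval.
failed = [t for t in range(50) if not reply_translatable(t)]
failure_fraction = len(failed) / 50.0
```

Under these assumptions a connection opened right after a sync tick loses its reply, while one opened shortly before the next tick survives. Across one interval, 30 of the 50 start times fail, which fits the observation that some connections break rather than all of them.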
 
 

THE FIX! - Monitored Circuits



Next came a new generation of VRRP called "VRRP MONITORED CIRCUITS". The three primary differences between it and VRRPv2 are:

1. Support for Completely Virtual Addresses

2. Support for one VRRP interface to monitor another interface.

3. Support for a single VRID for all participating NODES.
 
 

Essentially, we can now configure our network like this:



HOST A

Interface (External) 10.1.1.1/24
Virtual Router: 1
Priority: 100
Hello Interval: 1
Backup IP: 10.1.1.25
Monitoring Interfaces: (Internal) Priority Delta: 10

Interface eth-s1p2c0 (Internal) 192.168.254.1/24
Virtual Router: 2
Priority: 100
Hello Interval: 1
Backup IP: 192.168.254.25
Monitoring Interfaces: (External) Priority Delta: 10
 

HOST B

Interface (External) 10.1.1.50/24
Virtual Router: 1
Priority: 95
Hello Interval: 1
Backup IP: 10.1.1.25
Monitor Interfaces: (Internal) Priority Delta: 10

Interface (Internal) 192.168.254.50/24
Virtual Router: 2
Priority: 95
Hello Interval: 1
Backup IP: 192.168.254.25
Monitor Interfaces: (External) Priority Delta: 10
 
 

Notice that on HOST A, we are saying that our actual internal address is 192.168.254.1 and our virtual address is 192.168.254.25. This means that 192.168.254.25 is our VRRP address, the address that will fail over from one machine to the other. We are also saying that the external interface monitors the internal interface. If that internal interface dies, we decrement the external interface's priority by the delta configured for the monitored interface, so our effective priority becomes 100 minus 10, which is 90. Since HOST B's priority is 95, he now has the higher priority and takes master for the external address; he also takes master for the internal address, since HOST A's internal interface has failed.
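The arithmetic can be checked with a short Python sketch that uses the Priority and Priority Delta values from the configuration listing above (100 for HOST A, 95 for HOST B, and a delta of 10). The function name is hypothetical, and the sketch does not model any lower bound that a real implementation might place on the result.

```python
def effective_priority(base_priority, monitored):
    """Subtract the delta of every monitored interface that is down.

    monitored is a list of (is_up, priority_delta) pairs.
    """
    penalty = sum(delta for is_up, delta in monitored if not is_up)
    return base_priority - penalty


# HOST A's external interface monitors its internal interface (delta 10).
host_a_normal = effective_priority(100, [(True, 10)])
host_a_failed = effective_priority(100, [(False, 10)])
host_b = effective_priority(95, [(True, 10)])

# HOST B becomes master once its priority is the higher of the two.
host_b_is_master = host_b > host_a_failed
```

While everything is up, HOST A advertises 100 against HOST B's 95 and stays master. Once the monitored interface fails, HOST A drops to 90 and HOST B takes over the external virtual address as well.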
 

This added functionality of VRRP Monitored Circuit means that we no longer need a routing protocol to fail over between machines and route correctly. It also eliminates the asymmetric routing caused by a single interface failure in VRRPv2, since all virtual addresses swap over to HOST B when a failure occurs.
 

Welp, that's it, and I hope this has helped some of you out there in never never land. Look for more of my publications in the future. Thanks!

Written By- Larry Pingree