Thursday, September 29, 2011

BFD - Funny business on Nexus

Imagine an OSPF broadcast network with adjacencies built between 3 routers.  Logically, it looks like this:
Logical Topology

Now, I'm no big fan of OSPF broadcast networks, and this is why:
  • On a point-to-point OSPF interface, neighbor failure can often be detected based on interface state.  If the interface goes down, then you know the neighbor is dead.
  • If there are only 2 neighbors on a broadcast segment, you might as well use point-to-point mode.  Adjacencies build faster (no DR/BDR election) and the LSDB doesn't get needlessly cluttered with type 2 LSAs.  Also, the node map used by Dijkstra's algorithm is simpler, meaning you're less likely to run into silly nonsense like this.
  • With more than 2 neighbors on a broadcast segment, you can never rely on interface state as an indicator of the availability of an individual neighbor.
So, in the network I've described above, a node failure will go undetected for 30-40 seconds with default timers (10-second hellos and a 40-second dead interval).
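The 30-40 second figure falls out of simple timer arithmetic. Here's a rough sketch, assuming the usual broadcast-network defaults of a 10-second hello interval and a 40-second dead interval; the function name is made up for illustration.

```python
HELLO_INTERVAL = 10  # seconds, assumed default on a broadcast network
DEAD_INTERVAL = 40   # seconds, assumed default (4 x hello)


def detection_window(hello, dead):
    """Return (best, worst) seconds between a neighbor dying and the dead timer expiring."""
    # The dead timer restarts whenever a hello arrives.
    # Best case: the neighbor dies just before its next hello was due,
    # so 'hello' seconds of the dead interval have already elapsed.
    best = dead - hello
    # Worst case: the neighbor dies right after sending a hello,
    # so the full dead interval has to run out.
    worst = dead
    return best, worst


print(detection_window(HELLO_INTERVAL, DEAD_INTERVAL))  # (30, 40)
```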

I thought that I'd use Bidirectional Forwarding Detection (BFD) to speed up convergence in this network.  BFD is a protocol-agnostic tool that tests the forwarding capability/availability of neighboring routers.  Routing protocols (like OSPF) request that BFD test the availability of their neighbors.  If a neighbor fails, BFD tells the client (OSPF) about the problem and the client reacts accordingly.  OSPF will remove the adjacency and reconverge around the failed router.
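The client relationship described above can be sketched as a toy model. This is a heavy simplification of real BFD, not how any router implements it: the class and method names are invented, the 50 ms interval and multiplier of 3 are assumed values for illustration, and the neighbor address is a documentation address.

```python
class BfdSession:
    """Toy BFD session: declares the neighbor down after a silent detection time."""

    def __init__(self, neighbor, interval_ms=50, multiplier=3):
        self.neighbor = neighbor
        self.interval_ms = interval_ms  # assumed value, for illustration only
        self.multiplier = multiplier    # assumed value, for illustration only
        self.up = True
        self.clients = []

    @property
    def detection_time_ms(self):
        # Simplified: silence for interval x multiplier means the neighbor is down.
        return self.interval_ms * self.multiplier

    def register(self, callback):
        """A client (OSPF, for example) asks to be told when the neighbor fails."""
        self.clients.append(callback)

    def poll(self, ms_since_last_packet):
        """Check the silence timer and notify every client once on failure."""
        if self.up and ms_since_last_packet >= self.detection_time_ms:
            self.up = False
            for notify in self.clients:
                notify(self.neighbor)


ospf_down_events = []                      # stands in for OSPF tearing down an adjacency
session = BfdSession("192.0.2.2")          # hypothetical neighbor address
session.register(ospf_down_events.append)
session.poll(100)                          # still inside the detection time
session.poll(150)                          # detection time reached, client is notified
print(session.detection_time_ms, ospf_down_events)
```

With these assumed values the client hears about the failure after 150 ms of silence, compared with the 30-40 seconds OSPF needs on its own.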

I arranged a test using 3 Nexus 7000s.  The logical configuration looked like the picture above, and the physical configuration looked like this:
Physical Topology for BFD Test
Everything worked great.  Each Nexus established a BFD session to monitor the availability of each neighbor.

Then I reconfigured the topology to look like this:
BFD session not working through L2/L3 switch?
All three switches are connected in a line.  VLAN 10 is forwarding on both links, and all three switches are OSPF neighbors.  BFD on each switch knows it's supposed to have 2 neighbor sessions up and running, but only B sees two BFD neighbors.  A can't see C, and C can't see A.  Weird.

I connected the three switches in a triangle with STP blocking one of the links.  As I moved the STP root (and thus the blocking link) around, the switches that failed to establish BFD sessions moved as well.
But with the SVI disabled on switch B, A and C can suddenly see each other!
SVI shutdown - Missing BFD Session Lives!


This is strange stuff.  The BFD session between A and C should be getting L2 transit from switch B, but it doesn't work if switch B has an L3 interface active on this VLAN.

Monday, September 5, 2011

The solution to the OSPF misconfiguration challenge

The OSPF challenge in the previous post received much more traffic than most, but disappointingly few comments.

The solution follows below, in white-on-white text.  Highlight below if you've read the last post and want to see the answer.

I included lots of distracting configuration elements that have nothing to do with the problem:
  • Multiple OSPF areas
  • Different OSPF area types
  • ABR disagreements about stubbiness
  • 'subnets' keyword on redistribution
  • Mismatched wildcard bits in OSPF network statements
  • OSPF process IDs
  • HSRP preemption stuff
  • Passive OSPF interfaces
The answer lies with the static routes and redistribution on R1 and R2.
When the Ethernet interface on either of these routers fails, the static route to the server segment behind the firewall will be withdrawn, and the OSPF redistribution will go with it.  R3 and R4 will immediately converge on the remaining path to the server segment.
Shortly after, the static route will re-appear on the router with the failed interface, but now it will have recursed through R3 or R4.  Because the static route has re-appeared, the redistribution will re-appear as well.
R3 and R4 will use the path via the failed router without knowing that the path loops right back to them.
The fix here is to configure the static routes like this:
ip route 172.16.0.0 255.255.0.0 Ethernet 0/0 10.0.0.20
Using the interface keyword here locks the route to the Eth0/0 interface, preventing recursion.
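To see why pinning the interface matters, here's a rough model of next-hop resolution. It is only a sketch: the 10.0.0.0/24 connected subnet, the OSPF-learned 10.0.0.0/8 covering route, the Serial0/0 interface name, and both function names are hypothetical stand-ins, not the actual addressing from the challenge.

```python
import ipaddress


def build_rib(eth_up):
    """Build a toy routing table for the router, with Ethernet0/0 up or down."""
    rib = [
        # Hypothetical OSPF-learned covering route that points back toward R3.
        {"prefix": ipaddress.ip_network("10.0.0.0/8"), "interface": "Serial0/0"},
    ]
    if eth_up:
        # Hypothetical connected subnet; it disappears when the interface fails.
        rib.append({"prefix": ipaddress.ip_network("10.0.0.0/24"), "interface": "Ethernet0/0"})
    return rib


def resolve_static(rib, next_hop, pinned=None, up=()):
    """Return the exit interface for a static route, or None if it is withdrawn."""
    if pinned is not None:
        # Interface plus next hop: the route is only valid while that interface is up.
        return pinned if pinned in up else None
    # Next hop only: recurse through the longest matching route in the table.
    addr = ipaddress.ip_address(next_hop)
    matches = [r for r in rib if addr in r["prefix"]]
    best = max(matches, key=lambda r: r["prefix"].prefixlen, default=None)
    return best["interface"] if best else None


# Recursive static route, interface healthy: resolves via the connected subnet.
print(resolve_static(build_rib(True), "10.0.0.20"))
# Recursive static route, interface failed: resolves via the OSPF route (the loop).
print(resolve_static(build_rib(False), "10.0.0.20"))
# Pinned static route, interface failed: the route is withdrawn instead.
print(resolve_static(build_rib(False), "10.0.0.20", pinned="Ethernet0/0", up={"Serial0/0"}))
```

In the second case the static route comes back pointing out the serial link, which is exactly the condition that lets the redistribution re-appear and draw traffic into the loop.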

The IOS documentation says this:
Specifying the next hop without specifying an interface when configuring a static route can
cause traffic to pass through an unintended interface if the default interface goes down.

In fact, while the static route as I originally presented it might seem like the "normal" way to do things, the IOS documentation explicitly refers to this sort of configuration as a "recursive static route":

In a recursive static route, only the next hop is specified. The output interface is derived from the next hop.