Thursday, September 29, 2011

BFD - Funny business on Nexus

Imagine an OSPF broadcast network with adjacencies built between 3 routers.  Logically, it looks like this:
Logical Topology

Now, I'm no big fan of OSPF broadcast networks, and this is why:
  • On a point-to-point OSPF interface, neighbor failure can often be detected based on interface state.  If the interface goes down, then you know the neighbor is dead.
  • If there are only 2 neighbors on a broadcast segment, you might as well use point-to-point mode.  Adjacencies build faster (no DR/BDR election) and the LSDB doesn't get needlessly cluttered with type 2 LSAs.  Also, the node map used by Dijkstra's algorithm is simpler, meaning you're less likely to run into silly nonsense like this.
  • With more than 2 neighbors on a broadcast segment, you can never rely on interface state as an indicator of the availability of an individual neighbor.
So, in the network I've described above, a node failure will go undetected for 30-40 seconds with default timers.
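The 30-40 second figure falls straight out of the default broadcast timers (hello every 10 seconds, dead interval of 40). Here's a minimal sketch of the arithmetic; the function name is mine, purely for illustration, and it ignores things like fast hellos or tuned timers:

```python
def detection_window(hello_interval, dead_interval):
    """Return (best, worst) seconds until a silent neighbor is declared dead."""
    # The dead timer restarts each time a hello arrives. If the neighbor dies
    # just before its next hello was due, nearly a full hello interval of the
    # dead timer has already elapsed, so detection comes sooner (best case).
    best = dead_interval - hello_interval
    # If the neighbor dies right after sending a hello, the full dead
    # interval must expire before OSPF notices (worst case).
    worst = dead_interval
    return best, worst


best, worst = detection_window(hello_interval=10, dead_interval=40)
print(f"Failure detected after {best}-{worst} seconds")
```

Either way, that's a long time to be forwarding traffic into a dead router.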

I thought that I'd use Bidirectional Forwarding Detection (BFD) to speed up convergence in this network.  BFD is a protocol-agnostic tool that tests the forwarding capability/availability of neighboring routers.  Routing protocols (like OSPF) ask BFD to monitor the availability of their neighbors.  If a neighbor fails, BFD tells the client (OSPF) about the problem and the client reacts accordingly: OSPF removes the adjacency and reconverges around the failed router.
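To put a number on "speed up": in BFD asynchronous mode (RFC 5880), the local detection time is the remote system's detect multiplier times the agreed transmit interval, which is the larger of the local required minimum RX interval and the remote desired minimum TX interval. The sketch below is only that arithmetic; the 50 ms / x3 values are illustrative assumptions on my part, not settings taken from this test, and the function name is made up:

```python
def bfd_detection_time_ms(remote_detect_mult,
                          local_required_min_rx_ms,
                          remote_desired_min_tx_ms):
    """Asynchronous-mode BFD detection time, in milliseconds."""
    # The remote side may not transmit faster than we are willing to receive,
    # so the agreed interval is the slower (larger) of the two values.
    agreed_tx_interval_ms = max(local_required_min_rx_ms,
                                remote_desired_min_tx_ms)
    # The session is declared down after this many consecutive missed packets.
    return remote_detect_mult * agreed_tx_interval_ms


# Assumed example timers: 50 ms in each direction, multiplier of 3.
print(bfd_detection_time_ms(3, 50, 50), "ms")
```

With timers like those, a dead neighbor is noticed in a fraction of a second rather than after the OSPF dead interval expires.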

I arranged a test using 3 Nexus 7000s.  The logical configuration looked like the picture above, and the physical configuration looked like this:
Physical Topology for BFD Test
Everything worked great.  Each Nexus established a BFD session to monitor the availability of each neighbor.

Then I reconfigured the topology to look like this:
BFD session not working through L2/L3 switch?
All three switches are connected in a line.  VLAN 10 is forwarding on both links, and all three switches are OSPF neighbors.  BFD on each switch knows it's supposed to have 2 neighbor sessions up and running, but only B sees two BFD neighbors.  A can't see C, and C can't see A.  Weird.

I connected the three switches in a triangle with STP blocking one of the links.  As I moved the STP root (and thus the blocking link) around, the switches that failed to establish BFD sessions moved as well.
But with the SVI disabled on switch B, A and C can suddenly see each other!
SVI shutdown - Missing BFD Session Lives!

This is strange stuff.  The BFD session between A and C should be getting L2 transit from switch B, but it doesn't work if switch B has an L3 interface active on this VLAN.

1 comment:

  1. We understand why this is happening, and we're working on it. If you want to contact me, email icox at cisco