Tuesday, November 23, 2010

BGP Adjacency - Spot The Error

A couple of years ago I configured a topology for a business partner extranet much like the one sketched below.

No dynamic routing was allowed on the firewall.  Layer 9 didn't trust it to run an IGP, so the firewall was configured with static routes:
 - Known internal nets (registered and RFC 1918 space) pointed in
 - Default route pointed out

Two eBGP sessions were configured to learn business partner prefixes (not shown) from the external switch, and redistribute them into the IGP.  It was a small number of prefixes, and they were thoroughly filtered and quantity-limited, making things safe for the IGP.

But it didn't work correctly:  only one BGP session would come up at a time, never both at once.

The cause of the error took me more hours of head-scratching than I care to admit.  In my defense, the topology was actually quite a bit more complicated than depicted here.  Presented here is the bare minimum required to recreate the problem.

The problem was neither a firewall policy issue, nor a typo.  Any typos here are just typos.

Can you spot my mistake?  Which session comes up, and what's wrong with the other one?

13 comments:

  1. Remote AS number is incorrect.

  2. This comment has been removed by the author.

  3. Remote AS number fixed. The problem lies elsewhere.

  4. Running HSRP? Which router is active and which is standby? You can't create a BGP session with the standby router?

  5. ebgp multihop should be 3

  6. I'm guessing it has something to do with your HSRP setup on vlan20 on both internal routers. The default route on the firewall for internal subnets points to ~20.1, which is the HSRP virtual IP, and that IP only maps to one of the routers at a time. Since BGP is using loopbacks for src/dest IPs, the path to both internal routers will head through default route ~20.1. I bet the firewall drops the traffic to one of the routers because the incoming and outgoing traffic to it are using different interfaces (asymmetric). Removing HSRP should fix it.

  7. Removing HSRP isn't an option: The firewall doesn't run a routing protocol, so some FHRP is required.

    The firewall isn't dropping any packets.

  8. Is it not the ebgp multihop either? Very curious about the answer to this.

  9. The problem was a combination of HSRP and eBGP multihop.

    Because of the firewall's routes pointing at the HSRP address, one of the internal routers was 3 hops away (in the inbound direction only):

    A -> external: 2 hops
    B -> external: 2 hops
    external -> HSRP primary: 2 hops
    external -> HSRP secondary: 3 hops

    I fixed this by adding 32-bit (host) routes on the firewall:
    192.168.255.1 -> 192.168.20.2
    192.168.255.2 -> 192.168.20.3

    Routing traffic from the VLAN 20 (or 30) interface to the Lo0 interface doesn't count as a hop, so "multihop 2" is sufficient so long as we're not taking the *extra* hop across VLAN 10.

  10. I would also have been concerned about BGP through the firewall. Certain firewalls may randomise the TCP Sequence number which breaks BGP. (e.g. Cisco ASA).

  11. Hey Greg, thanks for your comment.

    These were ScreenOS boxes. I don't know if they play games with the TCP ISN, but I didn't have any problems in that regard.

    Either way, I /think/ that the ISN randomization is only an issue if MD5 authentication is configured between the neighbors.

    I'm not completely sure about that, but can't see how else the BGP session would notice ISN games played by an intermediate device.

  12. Chris,

    Great blog; I've enjoyed reading through your archives. The problem you describe here is also present when you use 'vpc peer-gateway' on the firewall-facing VLAN. Not that this has much to do with your issue here; just an FYI.

    Jeremy Filliben

    Replies
    1. Hey Jeremy, thank you for the compliment.

      Does 'vpc peer-gateway' decrement TTL when bridging through the "wrong" Nexus 7K?

      Maybe that's not what you meant.

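The hop counting from comment 9 can be sketched as a toy model in Python. This is only an illustration, not the configuration from the original topology. It assumes that router A is the HSRP active router (the post doesn't say which one was), that the sender sets the IP TTL to the eBGP multihop value, and that every forwarding device decrements the TTL and drops the packet when it reaches zero. The names `hops`, `delivered`, `report`, `before_fix` and `after_fix` are made up for this sketch.

```python
# Toy model of TTL handling for the eBGP sessions discussed in comment 9.
# Assumptions (not from the original post): router A is the HSRP active
# router, the sender sets the IP TTL to the ebgp-multihop value, and every
# forwarding device decrements the TTL and drops the packet at zero.

MULTIHOP = 2  # the "multihop 2" setting mentioned in comment 9


def hops(path):
    """Count hops the way comment 9 does: every device after the sender."""
    return len(path) - 1


def delivered(path, ttl):
    """Return True if a packet sent with this TTL reaches the last device.

    Only the devices between sender and destination forward the packet,
    so only they decrement the TTL.
    """
    for _forwarder in path[1:-1]:
        ttl -= 1
        if ttl <= 0:
            return False  # TTL expired in transit
    return True


# Firewall routes point at the HSRP address, so traffic for B's loopback
# lands on A first and has to be forwarded once more.
before_fix = {
    "A": ["external", "firewall", "A"],
    "B": ["external", "firewall", "A", "B"],
}

# With a host route per loopback, the firewall hands each packet
# directly to the router that owns the destination loopback.
after_fix = {
    "A": ["external", "firewall", "A"],
    "B": ["external", "firewall", "B"],
}


def report(paths, ttl=MULTIHOP):
    """Map each destination to (hop count, delivered?) for the given TTL."""
    return {dest: (hops(p), delivered(p, ttl)) for dest, p in paths.items()}


if __name__ == "__main__":
    print("before fix:", report(before_fix))
    print("after fix: ", report(after_fix))
```

In this model, only the packet headed for the standby router's loopback expires before the fix, which matches the "one session at a time" symptom. Raising the TTL to 3, as suggested in comment 5, would also let it through under these assumptions; the host routes instead remove the extra hop.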