Thursday, January 26, 2012

Building Nexus vPC Keepalive Links

There's some contradictory and unhelpful information out there on vPC peer keepalive configuration. This post is a bit of a how-to, loaded with my opinions about what makes sense.

What Is It?
While often referred to as a link, vPC peer keepalive is really an application data flow between two switches. It's the mechanism by which the switches keep track of each other and coordinate their actions in a failure scenario.

Configuration can be as simple as a one-liner in vpc domain context:
vpc domain <domain-id>
  peer-keepalive destination <peer-ip-addr>
Cisco's documentation recommends that you use a separate VRF for peer keepalive flows, but this isn't strictly necessary. What's important is that the keepalive traffic neither traverses the vPC peer-link nor uses any vPC VLANs.
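The full syntax also accepts a source address and a VRF name. A sketch, assuming a dedicated VRF named 'keepalive' (the VRF name is arbitrary, and the VRF must already exist):
vpc domain <domain-id>
  peer-keepalive destination <peer-ip-addr> source <local-ip-addr> vrf keepalive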

The keepalive path can be a simple L2 interconnect directly between the switches, or it can traverse a large routed infrastructure. The only requirement is that the switches have IP connectivity to one another via non-vPC infrastructure. There may also be a latency requirement - vPC keepalive traffic maintains a pretty tightly wound schedule. Because the switches in a vPC pair are generally quite near to one another, I've never encountered any concerns in this regard.

What If It Fails?
This isn't a huge deal. A vPC switch pair will continue to operate correctly if the vPC keepalive traffic is interrupted. You'll want to get it fixed because an interruption to the vPC peer-link without vPC keepalive data would be a split-brain disaster.
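When testing (or troubleshooting), the state of the flow is visible on either peer:
show vpc peer-keepalive
show vpc brief
The first command reports the keepalive status, the source and destination addresses, the VRF in use, and the timers; the second summarizes overall vPC health, including the keepalive state.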

Bringing a vPC domain up without the keepalive flow is complicated. This is the main reason I worry about redundancy in the keepalive traffic path. Early software releases wouldn't bring the domain up at all. In later releases, configuration levers were added (and renamed!?) to control the behavior. See Matt's comments here.

The best bet is to minimize the probability of an interruption by planning carefully, thinking about the impact of a power outage, and testing the solution. Running the vPC keepalive over gear that takes 10 minutes to boot up might not be the best idea. Try booting up the environment with the keepalive path down. Then try booting up just half of the environment.

vPC Keepalive on L2 Nexus 5xxx
The L2 Nexus 5000 and 5500 series boxes don't give you much flexibility. Basically, there are two options:
  1. Use the single mgmt0 interface in the 'management' VRF. If you use a crossover cable between chassis, then you'll never have true out-of-band IP access to the device, because all other IP interfaces exist only in the default VRF, and you've just burned up the only 'management' interface. Conversely, if you run the mgmt0 interface to a management switch, you need to weigh the failure scenarios and boot-up times of your management network. Either way, the keepalive traffic has a single point of failure, because you've only got one mgmt0 interface to work with.
  2. Use an SVI and VLAN. If I've got 10Gb/s interfaces to burn, this is my preferred configuration: Run two twinax cables between the switches (parallel to the vPC peer-link), EtherChannel them, and allow only non-vPC VLANs onto this link. Then configure an SVI for keepalive traffic in one of those VLANs.
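For reference, option 1 comes out looking something like this (addresses invented for illustration; mgmt0 lands in the 'management' VRF automatically):
interface mgmt0
  ip address 192.168.100.11/24
vpc domain 25
  peer-keepalive destination 192.168.100.12 source 192.168.100.11 vrf management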

vPC Keepalive on L3 Nexus 55xx
A Nexus 5500 with the L3 card allows more flexibility: VRFs can be created and interfaces assigned to them, allowing you to put keepalive traffic on a redundant point-to-point link while keeping it in a dedicated VRF, as Cisco recommends.
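A sketch of that arrangement, with the VRF name, port-channel number, and addressing all invented for illustration:
vrf context keepalive
interface port-channel10
  no switchport
  vrf member keepalive
  ip address 169.254.25.1/16
vpc domain 25
  peer-keepalive destination 169.254.25.2 source 169.254.25.1 vrf keepalive
Two physical interfaces would join port-channel 10 via 'channel-group 10 mode active' to keep the link redundant.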

vPC Keepalive on Nexus 7000
The N7K allows the greatest flexibility: use management or transit interfaces, create VRFs, etc... The key thing to know about the N7K is that if you choose to use the mgmt0 interfaces, you must connect them through an L2 switch. This is because there's an mgmt0 interface on each supervisor, but only one of them is active at any moment. The only way to ensure that both mgmt0 interfaces on switch "A" can talk to both mgmt0 interfaces on switch "B" is to connect them all to an L2 topology.

The two mgmt0 interfaces don't back each other up. It's not a "teaming" scheme. Rather, the active interface is the one on the active supervisor.

IP Addressing
Lots of options here, and it probably doesn't matter what you do. I like to configure my vPC keepalive interfaces at 169.254.<domain-id>.1 and 169.254.<domain-id>.2 with a 16-bit netmask.

My rationale here is:
  • The vPC keepalive traffic is between two systems only, and I configure them to share a subnet. Nothing else in the network needs to know how to reach these interfaces, so why use a slice of routable address space?
  • 169.254.0.0/16 is defined by RFC 3330 as the "link local" block, and that's how I'm using it. By definition, this block is not routable, and may be re-used on many broadcast domains. You've probably seen these numbers when there was a problem reaching a DHCP server. The switches won't be using RFC 3927-style autoconfiguration, but that's fine.
  • vPC domain-IDs are required to be unique, so by embedding the domain ID in the keepalive interface address, I ensure that any mistakes (cabling, etc...) won't cause unrelated switches to mistakenly identify each other as vPC peers, have overlapping IP addresses, etc...
The result looks something like this:
vpc domain 25
  peer-keepalive destination 169.254.25.2 source 169.254.25.1 vrf default
vlan 2
  name vPC_peer_keepalive_169.254.25.0/16
interface Vlan2
  description vPC Peer Keepalive to 5548-25-B
  no shutdown
  ip address 169.254.25.1/16
interface port-channel1
  description vPC Peer Link to 5548-25-B
  switchport mode trunk
  switchport trunk allowed vlan except 1-2
  vpc peer-link
  spanning-tree port type network
  spanning-tree guard loop
interface port-channel2
  description vPC keepalive link to 5548-25-B
  switchport mode trunk
  switchport trunk allowed vlan 2
  spanning-tree port type network
  spanning-tree guard loop
interface Ethernet1/2
  description 5548-25-B:1/2
  switchport mode trunk
  switchport trunk allowed vlan 2
  channel-group 2 mode active
interface Ethernet1/10
  description 5548-25-B:1/10
  switchport mode trunk
  switchport trunk allowed vlan 2
  channel-group 2 mode active
The configuration here is for switch "A" in the 25th pair of Nexus 5548s. Port-channel 1 on all switch pairs is the vPC peer link, and port-channel 2 (shown here) carries the peer keepalive traffic on VLAN 2.

29 comments:

  1. This comment has been removed by the author.

  2. Great post!

    Have you thought about using "dual-active exclude interface-vlan 2"?

  3. Hey!

    I'm quite new to this Nexus world and your nice posts are exactly what I need!
    Thanks a lot!

    Best Regards,
    Johan

  4. @Johan - You're welcome, thanks for letting me know that the post was helpful!

    @Markku - Vlan 2 isn't a "vPC VLAN" because it's not allowed onto Po1. I expect it to be immune from the sort of shutdown that 'dual-active exclude' would protect it from (says chris without testing).

  5. Oops, that's right, I missed the peer link "except" configuration.

  6. Probably a dumb question (get used to it from me), but why are you running Loopguard? Doesn't Bridge Assurance take its place (and do everything "better")?

  7. Hey Colby, that's a really *good* question. ...And not your first one, I might add :-)

    I've been pondering it for a couple of days, and I think you're probably right.

    I'd been thinking of Bridge Assurance mainly as a mechanism for upstream switches to monitor downstream switches, because that's the new functionality introduced by BA. But as you point out, it works in both directions. Upstream and downstream switches monitor *each other* with BA.

    Loop Guard OTOH only specifies the response of the downstream switch in the face of loss of the upstream.

    I'm thinking you're probably right. Loopguard doesn't seem to be adding anything to this configuration.

    Thanks for pointing that out!

  8. if you are to use rfc3330 for PeerKeepAlive, then you just found your reason why you want it in a specific VRF other than default: just to make sure your routing table in default will never have these routes.

    In the future, if you wish to start another protocol in the default, such as LDP or an IGP, it'll be cleaner.


    dan

  9. "if you are to use rfc3330 for PeerKeepAlive, then you just found your reason why you want it in a specific VRF"

    On Nexus 7K there's no reason not to use a "keepalive" VRF. ...Though I don't see much downside to having these special addresses in the default VRF, I definitely wouldn't allow them to propagate via a routing protocol.

    On Nexus 5K, it's more of a problem. Whatever address you use, it'll land in the default VRF (unless you use the mgmt0 interface), and it won't be in the IGP (unless you're using an L3 5K -- a box for which I have yet to see a need).

  10. The problem with using an SVI for keepalive is the inability to upgrade using ISSU. As I understand it, the VLAN is local to the 5Ks, and one of them will be root and will have a non-edge designated port. This will cause ISSU to be disruptive.

    1. I know it's old, but you can set the interface that the PK VLAN is running over to spanning-tree port type edge and enable BPDU filtering to allow for an ISSU upgrade with the SVI PK method.
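
      Against the article's example config, that tweak would look something like this (untested):
      interface port-channel2
        spanning-tree port type edge trunk
        spanning-tree bpdufilter enable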

  11. I'm definitely new to the 7Ks and had a question regarding VDCs and how the keepalive link works. In short, do I need to use a separate physical link for each VDC instance for a keepalive link? I know that each VDC requires physical vPC peer-links (e.g. 2 physical links for VDC 1 and 2 physical links for VDC 2), but can I use one keepalive link to manage both VDCs' vPCs? Hope that question makes sense. Thanks for the help

  12. VDCs own physical ports, so if you're using line card ports, then yes: physical port/cable per VDC.

    OTOH, if you run keepalive traffic over the mgmt0 links (through an L2 infrastructure!) then the VDCs can all share one (or two) cables.

  13. I ran into a weird bug when configuring our Nexus 5548's referenced here:
    https://supportforums.cisco.com/thread/2151975

    The reason why I bring it up is because I wasn't aware of the caveat of not having management traffic traverse the VPC Peer-Link and your VPC information saved me. You're right, your write-up is the only one that is actually accurate, there's a TON of incorrect information out there. Just wanted to say thanks!

  14. Hey Ethan,

    Thank you for taking the time to let me know you appreciate this post!

    /chris

  15. fyi: a dedicated vrf can be created and used on the vpc keepalive link svi on a layer 2 only nexus 5k

  16. "a dedicated vrf can be created and used on the vpc keepalive link svi on a layer 2 only nexus 5k"

    Something new then? This definitely wasn't possible back when I was working on these boxes.

  17. Chris, I was just listening to the Packet Pushers podcast episode you guested on for the Nexus deep-dive. I really enjoyed your input. In the episode you recommended using the management interfaces for building the peer keepalive; however, in this article you recommend using a 10-gig port cross-connect with a keepalive VLAN and SVI. Also, if I have a pair of Nexus 5648s: while I can now create a keepalive-specific VRF, should I waste a 40-gig port on each or just use the management ports?

    1. I changed my thinking on this point between this post and the PP recording. The big problem with a VLAN for keep-alive is the headache it creates for ISSU.

      Given the safety mechanisms now available at vPC start-up time, the big risks associated with split-brained-ness after a DC blackout can be mitigated without resorting to back-to-back for vPC keepalive traffic.

      These days, I'd just push the keepalive traffic through your L2 management infrastructure and not worry too much about it. Two tips:
      1) test bringing up both peers with the management ports offline
      2) try to land the management cables on the same management switch/card/asic

    2. Thanks Chris, I really appreciate your response and input to my question. Have you since done any other podcasts where you discuss the Nexus platform? If not could you steer me to any such podcast discussions? I'm trying to design a 5648 w/ 2248 Fex distribution layer. We don't have any 7K's so it's been interesting to say the least.

    3. I think you've found all of my Nexus stuff. Big decision for you is going to be whether you dual-home the FEXen. This decision will probably hinge on the equipment this network will be supporting, and its redundancy needs.

  18. Sorry, one more question: in the PP deep dive, you mention that another port-channel between the two Nexus chassis is required for non-vPC VLANs in an active/standby fail-over situation. Assuming then that aside from the peer-link, will I need to also make a cross-connect between the 5Ks and configure on it a port-channel that will accommodate the non-vPC VLANs (i.e. those VLANs not allowed over the peer-link)?

    1. *IF* you want to have non-vPC VLANs, then they'll need their own cross connect.

      Having non-vPC VLANs isn't a foregone conclusion, however. It depends on your requirements.

  19. Hi everyone

    can someone explain why we place the peer-keepalive in the management VRF, but not the peer-link?

    1. Peer keepalive is an application flow. It has source and destination IP addresses. IP addresses/interfaces/tables are things that make sense in the context of VRF.

      The peer link, on the other hand, is an L2-only construct. It knows nothing about IP addresses. Heck, the peer link and its associated vPCs might be transporting something other than IP traffic. VRF isn't a concept that really applies to a physical link (which might be doing transport for several VRFs.)

  20. Great article,
    Maybe someone can solve a problem for me, involving vPCs and private VLANs.
    At my company we use vPCs between Nexus 5K (vPC) and ASR9K (MLACP). The vPC port-channels are configured as trunks with VLANs allowed (LACP).
    Now we want to change it to "switchport mode private-vlan trunk promiscuous".
    I have done some testing; when removing the configuration and pasting the private-vlan config, there is an outage.

    All the techniques, like "Graceful Consistency Check" and "lacp suspend-individual", are in place.

    Is there a way to change this configuration without an outage?
    Is the solution in option 2, "Use an SVI and VLAN", of the above article?

    I really appreciate a reply.

  21. I just wanted to add a bit of clarification regarding your use of link-local addressing on the vPC keepalive link, since, while it works, I was not sure whether it is RFC-compliant. As you mentioned, you're not using RFC3927 auto-configuration. RFC3927 states the following:

    1.6. Alternate Use Prohibition

    Note that addresses in the 169.254/16 prefix SHOULD NOT be configured
    manually or by a DHCP server.
    ...
    Administrators wishing to configure their own local addresses (using
    manual configuration, a DHCP server, or any other mechanism not
    described in this document) should use one of the existing private
    address prefixes [RFC1918], not the 169.254/16 prefix.

    The rationale stated in the RFC is that this could cause the host not to follow special rules regarding duplicate detection and auto-configuration. However, this isn't relevant to our particular use case, because this is a direct link with no other devices on the same L2 segment. No other devices should ever exist on this segment, and therefore duplicate detection should not be required.

    RFC2119 states the following about the wording “SHOULD NOT”:

    SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED" mean that
    there may exist valid reasons in particular circumstances when the
    particular behavior is acceptable or even useful, but the full
    implications should be understood and the case carefully weighed
    before implementing any behavior described with this label.

    Therefore, since we understand the full implications of the link local address space, I believe using link local address space on vPC keepalive links (or other similar links) is a valid use of the address space as per the RFC.

  22. Hi Matt,

    Thanks for your thoughts on the matter. I did similar research, came to the same conclusion.

    Much of my decision in this regard hinges on the fact that 169.254/16 was designated for "link local" use long before RFC3927 came around. The document (dated 2005) mentions the fact that Win98 was already using this address block. RFC3927 didn't invent the block; it's merely trying to improve interoperability between devices auto-configuring themselves within the same broadcast domain.

    One other detail: I don't think the network devices in question actually know that this /16 is special. Of course, that may change :)

    1. All great points. Also, thanks for the great post! This post definitely came in handy on a Nexus project I was working on.
