Thursday, May 2, 2013

When CDP doesn't discover

Marko's myths of VLAN 1 post continues to drive a lot of traffic to my response about CDP, tagging and the magic properties of VLAN 1.

Today I was introduced to a related phenomenon that I found interesting.

Backstory
CDP messages sent by switches are always in VLAN 1. If something other than VLAN 1 is the native VLAN on a particular trunk, then the CDP frame will be tagged with "1".

Funny Business
According to the post linked above there are related conditions that break CDP altogether.

The required elements are:
  1. A switch sending tagged CDP frames, either because it's using something other than VLAN 1 as the native VLAN or it's been configured with vlan dot1q tag native
  2. A router-on-a-stick that does NOT have a subinterface configured with encapsulation dot1q 1
Apparently the router, having no subinterface configured to receive frames tagged with "1", will toss incoming CDP frames without bothering to look inside to find the CDP message, killing CDP operation altogether. Bummer. And kind of unexpected.
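
The two conditions can be written down as a small model. The Python sketch below only illustrates the behavior reported above and has not been checked against real hardware. The function names `cdp_tag` and `router_accepts` are hypothetical, and the router logic (untagged frames land on the main interface, tagged frames need a matching subinterface) is a simplifying assumption.

```python
from typing import Optional, Set

CDP_VLAN = 1  # per the backstory: switches always send CDP in VLAN 1


def cdp_tag(native_vlan: int, tag_native: bool = False) -> Optional[int]:
    """Return the 802.1Q tag on an outgoing CDP frame, or None if untagged.

    tag_native models the global 'vlan dot1q tag native' setting.
    """
    if tag_native or native_vlan != CDP_VLAN:
        return CDP_VLAN
    return None


def router_accepts(tag: Optional[int], subif_vlans: Set[int]) -> bool:
    """Assumed router-on-a-stick behavior: untagged frames are accepted,
    tagged frames are dropped unread unless a subinterface is configured
    with a matching 'encapsulation dot1q <tag>'."""
    return tag is None or tag in subif_vlans


# Native VLAN left at 1: CDP is untagged and gets through.
print(router_accepts(cdp_tag(native_vlan=1), {10, 20}))       # True
# Native VLAN moved to 999: CDP is tagged "1", no matching subinterface.
print(router_accepts(cdp_tag(native_vlan=999), {10, 20}))     # False
# Same trunk, but the router now has a dot1q 1 subinterface.
print(router_accepts(cdp_tag(native_vlan=999), {1, 10, 20}))  # True
```

In this model CDP only breaks when both conditions hold at once, which matches the list above.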

I haven't tested this behavior, nor do I have an IOS-XR box (where the problem was found) available. I suspect that IOS and IOS-XE systems might have a similar problem, because I recently discovered that IOS-XE automatically appends the native keyword to subinterfaces using VLAN 1, even if I didn't type it. Yes, I configured a shiny new ASR on VLAN 1. It wasn't my fault: I was integrating with an established L2 topology and required VLAN tagging.

Why did this even come up?
Frankly, I can't fathom why folks obsessively set their native VLANs to something other than the default. I've changed the native VLAN on trunks where I actually needed to pass a particular VLAN without a tag, but never as a matter of default configuration, as is so common in our industry.

Why is everyone typing switchport trunk native vlan X everywhere? Is there a good reason? If you're not using VLAN 1, and it's not allowed on the trunk, then why worry about which VLAN would be untagged on the link if it were allowed?
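
To make the question concrete, here is a rough Python sketch of which user VLAN, if any, crosses a trunk untagged. It assumes a simplified trunk model in which the native VLAN is only carried when it is also in the allowed list. `untagged_vlan` is a hypothetical helper, the VLAN numbers are examples, and control-plane frames such as CDP are deliberately ignored.

```python
from typing import Optional, Set


def untagged_vlan(allowed: Set[int], native: int = 1) -> Optional[int]:
    """Return the VLAN whose data frames cross the trunk untagged, if any.

    Simplified model: only the native VLAN is sent untagged, and only when
    it is also in the trunk's allowed list. Control-plane traffic is ignored.
    """
    return native if native in allowed else None


allowed = set(range(10, 20))  # models 'switchport trunk allowed vlan 10-19'

print(untagged_vlan(allowed))              # None: default native VLAN 1 is not allowed
print(untagged_vlan(allowed, native=999))  # None: a dummy native VLAN changes nothing
print(untagged_vlan(allowed, native=10))   # 10: a native VLAN that is actually carried
```

Under these assumptions the first two trunks behave identically for user traffic, so the native VLAN setting only matters once that VLAN is actually allowed on the link.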

If there's a good reason for obsessively setting the native VLAN, please let me know in the comments. Comments including the phrase "orfg cenpgvpr" (ROT13) will be deleted. 

13 comments:

  1. I've had to change it before because I've had firewall pairs that used the untagged VLAN for their heartbeat/keepalive mechanism. Works great until you have more than one pair of firewalls connected to a switch, at which point the "leave everything in VLAN1" concept gets blown out of the water. Consequently, for those instances, setting a non-default native VLAN was essential.

    Beyond that, so long as you don't have your user ports in VLAN1, who cares?

  2. Since VLAN 1 is suspect (i.e. you may find it configured on an unconfigured port), the 'old school' US DoD requirement below may still apply:

    VLAN hopping can be initiated by an attacker who has access to a switch port belonging to the same VLAN as the native VLAN of the trunk link connecting to another switch in which the victim is connected to. If the attacker knows the victim’s MAC address, it can forge a frame with two 802.1q tags and a layer 2 header with the destination address of the victim. Since the frame will ingress the switch from a port belonging to its native VLAN, the trunk port connecting to victim’s switch will simply remove the outer tag because native VLAN traffic is to be untagged. The switch will forward the frame unto the trunk link unaware of the inner tag with a VLAN ID for which the victim’s switchport is a member of.

  3. @John H: Makes sense - that's a case where you actually had a requirement to pass traffic on specific untagged VLANs.

    @Will: The old school VLAN hopping attack requires the access VLAN to be both 1) native and 2) allowed on the trunk. Also, it seems silly to suggest that the solution to sloppy configuration (or completely unconfigured switches) is to require additional configuration elements. The solution isn't dedicated "native" VLANs, it's pruning. Let's just get the basics right, y'know?

    I'm generally grumpy about "orfg cenpgvpr" nonsense, but seem to be especially grumpy about this detail lately. On a recent job I had a design rejected by management because my trunks were configured with 'mode trunk' and 'allowed vlan 10-19'. I didn't mention 'native'.

    They required me to create a nonsense vlan named "native" on every switch, assign it as the native VLAN on every trunk, and leave it disallowed. Because "orfg cenpgvpr".

    Replies
    1. i can see both sides

      btw what does "orfg cenpgvpr" mean? first google results show as a foreign language.....

    2. ROT13 = rotate 13. Apply it to the bad words.

  4. It's sometimes useful for interoperation.

    I have device A that for some reason tags every packet on a trunk link regardless of vlan membership (ie there is no way to configure a 'native' VLAN on the trunk)
    I have device B that does require one VLAN to be native, but doesn't allow you to force that VLAN to be tagged with 'vlan dot1q tag native' (I'm looking at you, Cisco 2960)
    To get the native VLAN to interoperate across the trunk, you use a bogus vlan in 'switchport trunk native vlan xxx' to force the 2960 to tag its native vlan on the trunk

    A bit of an edge case I agree, but the command has saved me on more than one occasion

  5. @Anon 4:19
    I'm not following you. My proposal is that we *should* be tagging every VLAN allowed on infrastructure links. Hand-in-hand with that is "don't allow VLAN 1 on the link".

    If we're not allowing '1', and we never type "native", then every VLAN is tagged on the link.

    That, and typing 'switchport trunk native vlan xxx' as you suggest doesn't force the 2960 to tag its native VLAN on the trunk. Rather, it forces the 2960 to tag VLAN 1 on the trunk, and it moves the native to some other value.

  6. I want to say (but it's been so long that I can't remember for sure) that there was something about early Cisco VoIP implementations and/or older switches that required the access VLAN to match the native VLAN when the "switchport voice vlan" command was used... maybe the early phones couldn't understand multiple tags, and the voice VLAN was already tagged? I can't remember now. However, I have noticed that a lot of places that "obsessively set the native VLAN" are those that have had long-running Cisco VoIP installations.

  7. This comment has been removed by a blog administrator.

    Replies
    1. Hi Marvin,

      You ran afoul of the forbidden phrase I mentioned in the last sentence of my post.

      Yes, I understand VLAN hopping attacks. Note that the attack mentioned in the document you cited requires a native VLAN (any one, really) to be up (allowed, active, forwarding, not pruned) on a trunk.

      My proposal is that the better way to mitigate this threat is to do away with untagged traffic altogether. By allowing only VLANs we intend to use (that is, NOT 1) onto each trunk, the threat is mitigated without resorting to merely obscuring the native VLAN.

      Note also that for an attacker on a tagging port (the scenario laid out in your link), obscuring the native VLAN only helps if you use a different value on each link because CDP (which they don't recommend that we disable) announces the native value, undermining the "attacker has specific knowledge of the 802.1Q native VLAN" angle promoted by that document.

      While I have seen it done (*sigh* - customers, right?), I don't think anyone here is suggesting that each trunk have a unique native VLAN number assigned to it...

  8. This comment has been removed by a blog administrator.

  9. Very interesting article. I'll admit, I've done this in the past because security docs told me to, but it always felt a bit redundant and excessive. Security docs suggested that I allocate a VLAN other than one as the "native" VLAN, prune it, create a "black hole" VLAN and assign unused ports to it, and finally shut unused ports. I agree with pruning and shutting off ports, but the other configs seem unnecessary when you're pruning and shutting the ports.

  10. One reason for using a native VLAN would be to allow a device to still use the native VLAN for communication. This only works if the switch is configured with a native VLAN, sees the trunk as up, and the device on the other side either doesn't support trunks or doesn't see it as up. In this case, the device can still communicate on the native VLAN.
