Fragmentation Needed: Native VLAN - Some Surprising Results

I did some fiddling around with router-on-a-stick configurations recently and found some native VLAN behavior that took me by surprise.

The topology for these experiments is quite simple, just one router, one switch, and a single 802.1Q link interconnecting them:

Dead Simple Router-On-A-Stick Configuration

The initial configuration of the switch looks like:

vlan 10,20,30
!
interface FastEthernet0/34
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 10,20,30
switchport mode trunk
spanning-tree portfast trunk
!
interface Vlan10
ip address 192.0.2.1 255.255.255.0
!
interface Vlan20
ip address 198.51.100.1 255.255.255.0
!
interface Vlan30
ip address 203.0.113.1 255.255.255.0

And the initial configuration of the router looks like:

interface FastEthernet0/0
no ip address
duplex auto
speed auto
!
interface FastEthernet0/0.10
encapsulation dot1Q 10
ip address 192.0.2.2 255.255.255.0
!
interface FastEthernet0/0.20
encapsulation dot1Q 20
ip address 198.51.100.2 255.255.255.0
!
interface FastEthernet0/0.30
encapsulation dot1Q 30
ip address 203.0.113.2 255.255.255.0

So, nothing too interesting going on here. The devices can ping each other on each of their three IP interfaces.

We can switch VLAN tagging off for one of those VLANs by making it the native VLAN for the link. On the switch we'll do:

S1(config-if)#switchport trunk native vlan 20

And on the router we have two choices:

1) Eliminate the FastEthernet0/0.20 subinterface altogether, and apply the IP to the physical interface:

no interface FastEthernet 0/0.20
interface FastEthernet 0/0
ip address 198.51.100.2 255.255.255.0

2) Use the native keyword on the subinterface encapsulation configuration:

interface FastEthernet0/0.20
encapsulation dot1Q 20 native
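
Either way, it's worth confirming which VLAN each device treats as native before going further. A quick sketch of how that might be checked (R1 is an assumed router hostname, and output is omitted because its format varies by platform and IOS release):

! Switch: the trunk's native VLAN appears in both of these
S1#show interfaces FastEthernet0/34 trunk
S1#show interfaces FastEthernet0/34 switchport
! Router: lists each 802.1Q VLAN and notes which one is configured as native
R1#show vlans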

Both configurations leave us with full connectivity on all three subnets, with the 198.51.100.0/24 network's packets running untagged on the link. So, what's the difference between them?

The Cisco 360 training materials have this to say about the encapsulation of 802.1Q router subinterfaces:

you can assign any VLAN to the native VLAN on the router and it will make no difference

But the truth is a little more nuanced than that.

First of all, option 1 makes it impossible to administratively disable just the 198.51.100.2 interface. The address lives on the physical interface, so shutting it down knocks all of the subinterfaces offline as well (thanks, mellowd!).

To see the next difference between these router configuration options, let's introduce a misconfiguration by omitting switchport trunk native vlan 20 from the switch interface.
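
Assuming the switch already carries the native VLAN command from earlier, the misconfiguration could be introduced like this (a sketch; once the command is removed, the trunk's native VLAN falls back to the default, VLAN 1):

! Switch: return the trunk to the default native VLAN (VLAN 1)
S1(config)#interface FastEthernet0/34
S1(config-if)#no switchport trunk native vlan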

If we've configured option 1 on the router, pretty much nothing works. Frames sent by the switch with a VLAN 20 tag are ignored when they arrive at the router, and untagged frames sent by the router are similarly ignored by the switch.

On the other hand, if we've configured option 2 on the router, things kind of start to work with the mismatched trunk configuration. A ping from the switch doesn't succeed, but elicits the following response from 'debug ip packet' on the router:

IP: tableid=0, s=198.51.100.1 (FastEthernet0/0.20), d=198.51.100.2 (FastEthernet0/0.20), routed via RIB
IP: s=198.51.100.1 (FastEthernet0/0.20), d=198.51.100.2 (FastEthernet0/0.20), len 100, rcvd 3
IP: tableid=0, s=198.51.100.2 (local), d=198.51.100.1 (FastEthernet0/0.20), routed via FIB
IP: s=198.51.100.2 (local), d=198.51.100.1 (FastEthernet0/0.20), len 100, sending

The mistakenly tagged packet was accepted by subinterface Fa0/0.20! The ping still fails because the router's untagged reply is ignored by the switch, just as in the option 1 case. So router configuration option 2 does something pretty interesting: it causes the subinterface to send untagged frames, just as the 'native' keyword suggests, but it allows the subinterface to receive either untagged frames or frames tagged with the VLAN number specified in the subinterface encapsulation directive. Clearly the number we use makes a difference after all.
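
To repeat the experiment without flooding the console, the debug can be limited with an access list. A rough sketch, assuming the hostname R1 and access list number 100 (both arbitrary choices, not taken from the setup above):

! Router: match only traffic between the two VLAN 20 addresses
R1(config)#access-list 100 permit ip host 198.51.100.1 host 198.51.100.2
R1(config)#access-list 100 permit ip host 198.51.100.2 host 198.51.100.1
R1(config)#end
R1#debug ip packet 100
! Switch: generate the test traffic
S1#ping 198.51.100.2
! Router: switch debugging off when finished
R1#undebug all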

Catalyst switches have a similarly interesting feature: vlan dot1q tag native

When this command is issued in global configuration mode, it causes the switch to tag all frames egressing a trunk interface, even those belonging to the native VLAN. The curious thing about this command is that the interface still accepts incoming frames into the native VLAN whether they're tagged or untagged. It's kind of the reverse of the router behavior, and leads to an interesting interoperability mode in which:

The router generates untagged frames, but will accept either tagged or untagged frames using:

interface FastEthernet0/0.20
encapsulation dot1Q 20 native

The switch generates tagged frames, but will accept either tagged or untagged frames using:

vlan dot1q tag native
int FastEthernet 0/34
switchport trunk native vlan 20

And the result is that we have full interoperability between a couple of devices that can't agree on whether an interface should be tagged or not, because they both have liberal policies regarding the type of traffic they'll accept.
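
The switch side of this arrangement can be checked from the CLI. A minimal sketch, with output omitted because it differs between Catalyst platforms and software releases:

! Switch: reports whether native VLAN tagging is enabled globally
S1#show vlan dot1q tag native
! Switch: confirms that VLAN 20 is the trunk's native VLAN
S1#show interfaces FastEthernet0/34 trunk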

Here's the result (three pings) according to my sniffer:

02:04:05.284217 00:0b:fd:67:bf:00 > 00:07:50:80:80:81, ethertype 802.1Q (0x8100), length 118: vlan 20, p 0, ethertype IPv4, 198.51.100.1 > 198.51.100.2: ICMP echo request, id 36, seq 0, length 80

02:04:05.287826 00:07:50:80:80:81 > 00:0b:fd:67:bf:00, ethertype IPv4 (0x0800), length 114: 198.51.100.2 > 198.51.100.1: ICMP echo reply, id 36, seq 0, length 80

02:04:05.288391 00:0b:fd:67:bf:00 > 00:07:50:80:80:81, ethertype 802.1Q (0x8100), length 118: vlan 20, p 0, ethertype IPv4, 198.51.100.1 > 198.51.100.2: ICMP echo request, id 36, seq 1, length 80

02:04:05.292322 00:07:50:80:80:81 > 00:0b:fd:67:bf:00, ethertype IPv4 (0x0800), length 114: 198.51.100.2 > 198.51.100.1: ICMP echo reply, id 36, seq 1, length 80

02:04:05.292904 00:0b:fd:67:bf:00 > 00:07:50:80:80:81, ethertype 802.1Q (0x8100), length 118: vlan 20, p 0, ethertype IPv4, 198.51.100.1 > 198.51.100.2: ICMP echo request, id 36, seq 2, length 80

02:04:05.296703 00:07:50:80:80:81 > 00:0b:fd:67:bf:00, ethertype IPv4 (0x0800), length 114: 198.51.100.2 > 198.51.100.1: ICMP echo reply, id 36, seq 2, length 80

The pings succeeded, so we know that we have bidirectional connectivity. Nifty.

9 comments:

Dmitri Kalintsev, July 27, 2012 at 11:12 PM
Hi Chris,

One more interesting thing to try out - have a look how your router and switch respectively send their L2 control protocol traffic (STP, LLDP, CDP and so on). Most of these protocols must travel untagged, irrespective of interface configuration.
Matt Oswalt, July 27, 2012 at 11:30 PM
This is similar to something I ran across in a security audit - turns out that access ports will still accept tagged frames as long as they're tagged with the VLAN that the access port is a member of. Seems like this same logic has been applied to what you've discovered here.

Great post!
Matt, July 28, 2012 at 6:50 PM
Interesting post Chris. I know from the dim and distant past that accepting untagged frames as belonging to a native VLAN was useful if you had a hub in between two switches (or switch and router as in your post), which would still allow a host connected to the hub to communicate on a particular VLAN despite the hub not having any idea about them.

What was more useful than that was to throw the hub in the bin and replace it with a switch. ;-)

Friday, July 27, 2012
