Monday, November 11, 2013

ACI Launch

Tech Field Day brought me to the Cisco Application Centric Infrastructure launch event last week in New York. I attended at someone else's expense, but that doesn't mean my opinions are for sale, etc...

If you're totally unfamiliar with ACI (formerly Insieme), I recommend listening to Episode 12 of the Class C Block podcast with guest Joe Onisick. This was far more informative than anything I encountered at the actual launch event, probably because the Tech Field Day crew went straight from the John Chambers presentation into a room where we recorded a roundtable discussion. There may have been some technical discussion going on next door, but I missed it.

There's no shortage of people expressing opinions about ACI and what it will or won't do for you, most of whom have beaten me to the punch by several days. I'm going to post instead about a few details of the launch that I found interesting.

Defining Policy Might Not Be Easy
ACI requires that applications (really application owners) express to it the relationships between nodes before any traffic is allowed to flow. There are countless ways this might happen, but they all boil down to figuring out which ports on virtual or physical switches need to communicate to make the application run. There are the obvious flows, like the ones between middleware and database, and then the less obvious ones like syslog, DHCP, DNS, RADIUS, etc... All communications will need to be identified to the ACI controller, and I worry that it will turn out to be nontrivial.

While commercial applications generally have their requirements spelled out pretty clearly for firewall purposes, I've found that in-house applications often do not. The application developers know their components, but often don't know what's really happening under the covers. Too often I have conversations about firewall policy that include statements like the following: "It's not UDP or TCP. I keep telling you, I'm using an MQ library!"

Canned Policies?
Somebody (I think it was Pete Welcher?) speculated that software vendors might decide to ship ACI policies with their products. In the same way that we deploy virtual appliances, large software packages could come with canned ACI policies. That would certainly make things easy for rollout of a multi-tier software package onto an ACI environment.

Ultimately, The Policy Expresses What Now?
If I understand things correctly, the policy exists to fit switchports into categories and then to express which categories of ports may talk to one another. The categorization can be based on all sorts of criteria including things that might be reminiscent of firewall policy. The Cisco folks even evoke firewall notions when they explain that the system defaults to no communication: "Hosts can't talk without a policy explicitly allowing it."

The security angle is compelling, but don't confuse it with a packet filter. Think of it more like a policy which assigns assets to VLANs in a router-free environment. Once the assignment is done, any and all communication is possible between assets which have been bucket-ized together. The "express communication requirements to the network in application terms" business isn't a packet filter.
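To make the bucket idea concrete, here's a toy model of group-based default-deny. This is my own sketch, not Cisco's API; all the names are made up. Endpoints get assigned to groups, explicit policies allow pairs of groups to talk, and everything else is dropped:

```python
class Fabric:
    """Toy model: categorize endpoints, then allow traffic only between
    categories joined by an explicit policy. Default is deny."""

    def __init__(self):
        self.groups = {}        # endpoint name -> group name
        self.contracts = set()  # allowed (group, group) pairs

    def assign(self, endpoint, group):
        self.groups[endpoint] = group

    def allow(self, group_a, group_b):
        # policies are bidirectional in this toy
        self.contracts.add((group_a, group_b))
        self.contracts.add((group_b, group_a))

    def may_talk(self, ep1, ep2):
        g1, g2 = self.groups.get(ep1), self.groups.get(ep2)
        if g1 is None or g2 is None:
            return False  # unclassified endpoints can't talk at all
        # members of the same bucket can always talk; otherwise a
        # contract between the two buckets is required
        return g1 == g2 or (g1, g2) in self.contracts
```

Note that once "web1" and "web2" land in the same bucket, they can exchange any and all traffic; the model is nothing like a per-flow packet filter.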

About ASICs
Cisco's "merchant silicon plus" strategy with the Nexus 9500 speaks volumes about the future of packet forwarding hardware. There's a proud tradition of using in-house ASICs at Cisco, and they're backing away from it. Sure, they're shipping ACI-specific hardware along with the Broadcom Trident II. It does things that the Broadcom Trident II can't do, like routing between VXLAN and NVGRE overlays. I'm betting that the ACI-specific hardware will sit completely idle at a large percentage of customers, meaning that the Nexus 9500 is a commodity switch running a stripped-down version of NX-OS (there was a slide about this, but I can't find a citation - edit: Gideon Tam shares this slide from the presentation).

Fabric Modules
Nexus 7000 fabric modules aren't much to look at. They're inexpensive (compared to the rest of the platform components), don't generate much heat, etc... Nexus 9500 represents a big departure from the arbiter-controlled fabric of the 7000 series, because they've got switching ASICs (two to four Trident IIs per fabric module) onboard, making the Nexus 95xx a "Clos-in-a-box" architecture. I'm anxious to see the packet walk for multicast and broadcast traffic from the Cisco Live presentation next year.

No Midplane
Apparently the 9500 is entirely open from front to back, which is new for Cisco. I guess the line cards and fabric modules connect directly to one another (I haven't seen it yet). This is nifty, because it allows front-to-back airflow without lots of crazy ducting like the Nexus 7010, which seems like it's half duct/half switch.

BiDi Optics
This is cool stuff. Is it unique to Cisco? There's no QSFP module with an LC connector listed on Finisar's site right now.

These modules run 40Gb/s Ethernet for up to 150m over just two strands of MMF. Prior to the introduction of this module, 40Gb/s Ethernet required a 12-strand MPO connector. I've been advising customers for years to populate their data centers with pre-terminated MPO stuff; I particularly like the high density offerings from Corning.

Customers who already have MPO don't have any worries, but for folks with minimal LC connectors at top of rack, Cisco claims their QSFP-40G-SR-BD module is a big cost saver because no new fiber needs to be installed for the upgrade from 10Gb/s to 40Gb/s. Yes, the idea of saving money with Cisco optics is hilarious. :) Pricing isn't out yet.

Monday, November 4, 2013

Stuff Spirent didn't cover at NFD6

Spirent presented their Avalanche NEXT product at Gestalt IT's Network Field Day 6 event in September of this year.

  1. I attended the event, somebody else picked up the tab, yada yada.
  2. I like Spirent. I've used their products, worked with their people and been to their offices several times, and have gradually become a fan. This fact might be coloring my opinions :)
What's Avalanche NEXT?
Avalanche NEXT is a software frontend for performing tests of security hardware (firewalls and whatnot). It's a modular system that allows simple creation of test components (clients, servers, subnets, protocols, etc...), mixing of modules to create various tests, scaling of the test, application fuzzing, etc... Check out 1:14 - 1:30 in the video below to get a flavor of how slick the interface is. I love that interactive pie chart.

This sort of testing is critical because security devices have some of the most misleading data sheets of anything we're likely to work with as network folks. Security box performance numbers depend heavily on what you're asking them to do. Will your current device survive if you enable SSL offload, application inspection, or similar features? You can try it out (fingers crossed!), or you can test.

If you're interested in the NFD presentation on Avalanche NEXT, check out the Tech Field Day page.

So, what didn't Spirent tell us at NFD6?
Two things jumped out at me as deserving more attention than they got during the presentation.

PCAP replay?
The Spirent folks explained that you could use the tool to run canned synthetic traffic through a firewall, or you could load PCAPs of your own applications into the tool and replay those.

It sounds straightforward, but I think they're underselling this feature, or at least glossing over how much is going on under the hood. This is nothing like dumping data onto the wire with TCPreplay, for example. Even in the simplest case of a generic TCP application, to play captured data through a firewall, they need to:

  • Extract application transactions from the captured data
  • Setup new transactions with unique 5-tuple identifiers
  • Speak independently on behalf of both the client and server
  • Process TCP and IP data independently, taking into account the mangling done to it by the device(s) under test: loss/fragmentation/segmentation/etc...
Factor in application-layer awareness, like rewriting SIP invites to include the correct synthetic client ID, and you're dealing with new stuff throughout every layer of the stack, and (so they say) doing it at line rate with 10Gb/s interfaces.

It's way more complicated and interesting than merely "playing back pcaps" at high rates, and it's awesome.
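To appreciate just one of those steps, here's a toy sketch of assigning each replayed transaction a unique 5-tuple. This is my own illustration and has nothing to do with Spirent's actual implementation:

```python
import itertools

def rewrite_five_tuples(flows, base_port=40000):
    """flows: (src, dst, sport, dport, proto) tuples lifted from a capture.
    Return copies with unique ephemeral source ports, so that many
    simultaneous replays of the same captured transaction don't collide
    with one another in the device under test's connection table."""
    ports = itertools.count(base_port)
    return [(src, dst, next(ports), dport, proto)
            for (src, dst, sport, dport, proto) in flows]
```

Real test gear has to do this while also independently impersonating both ends of each flow and coping with whatever mangling the device under test applies.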

The Spirent guys focused their presentation on testing the transit ability (packet forwarding and deliberate dropping) of security appliances, but didn't mention that they can also do VPN stuff. Testing site-to-site VPN is obvious: just get two of the appliances, configure them, and blast application traffic through the tunnel.

It turns out that Spirent can also run these test suites from thousands of (simulated) VPN users. This is the capability that was most interesting to me because of a network I'm working on these days. Apparently they can simulate several types of VPN user, including Cisco AnyConnect clients. With this sort of test, the VPN believes that many users are online, and they all run the Spirent-controlled suite of application tests.

Tests I was previously familiar with include:
  • Transit load testing (throughput/latency type stuff)
  • Application server loading and fuzzing
  • Synthetic network devices (1000 OSPF neighbors-in-a-box)
Synthetic VPN users, each running a synthetic application load, was a new one on me, and it's relevant to my network.

Friday, November 1, 2013

SDN Themes from ONUG - Community Matters!

I was privileged to attend the Open Networking User Group (ONUG) Conference, ONUG Academy and mini Tech Field Day event hosted by JP Morgan Chase on October 29 and 30.
I attended at someone else's expense. Disclaimer. Additional disclaimer: I have a personal relationship with one of the people behind the ONUG conference. That fact will not color the opinions I express about ONUG. If I say I like something, it's because I like it, okay? :)

(Photo: an SDN joke from Brent Salisbury's ONUG presentation. I don't know these cats nor the owner of the photo.)
Community Matters
ONUG is founded on the idea that the Software Defined Network (SDN) user community needs to stand up for itself. Prior to ONUG the direction of SDN was set by a handful of players including:
  • Vendors who are interested in shaping the SDN marketplace and standards bodies around the capabilities of their products, rather than around the problems being faced by their customers.
  • Powerful end users who needed SDN to solve their own peculiar problems. The problems they're solving, and the techniques they're using, do not align well with the challenges or capabilities of mere enterprise users.
  • Researchers, who took SDN in directions that were academically interesting but didn't necessarily overlap with real world problems.
ONUG provides a forum where end users of network technology can contemplate the promise of SDN,  commiserate about challenges faced, unmask the realities of available offerings, and gain new perspective on the marketplace in a user-centered environment.

Vendors and media were not welcome in many of the discussions, giving typically taciturn customers (especially the financial folks) some freedom to talk openly about their experiences. Real customers talking about real environments proved to be a strong antidote to vendor propaganda. I'm sure that everyone who attended left with something new: knowledge, friends, contacts, perspective, business, etc...

Without ONUG, we'd be forced to buy what the vendors are selling. ONUG shouldn't be missed because it unifies a potentially vocal community which will shape the SDN marketplace.

If you're interested in SDN, you should probably be attending ONUG conferences.

Wednesday, July 24, 2013

Calculating distances in meatspace

I'm working on an automated provisioning system for a very large VPN network. For each new VPN client, I need to select a headend site where VPN tunnels should land. The only data available is that which I can get from the sales and billing systems. This system offers me the zip code of the install site.

Using the zip code of the install site, and the known zip codes of my various head-end sites, I'm able to select the destination for the primary and secondary VPN tunnels.

It's not perfect (physical location often has little to do with network path), but it's better than nothing. I haven't decided how to handle non-US sites yet.

I'm using a database of US zip codes found here, and a very dirty perl script. The script grabs the latitude and longitude of two zip codes from the database, and prints the mileage between them as calculated using the Haversine formula for great circle distance.

It runs like this:

Christophers-MacBook-Pro:scripts chris$ 95134 60614
1837 miles
Christophers-MacBook-Pro:scripts chris$

The script:

#!/usr/bin/perl
use strict;
use warnings;
use GIS::Distance;

my $dbfile = "/Users/chris/Downloads/zipcode.csv";
my ($lat1, $lon1, $lat2, $lon2);

sub usage {
  printf "Usage: $0 <zipcode> <zipcode>\n";
  exit 1;
}

usage() if (@ARGV != 2);
usage() unless ($ARGV[0] =~ /^[0-9]{5}$/);
usage() unless ($ARGV[1] =~ /^[0-9]{5}$/);

# Sort the zips so both can be found in a single pass through the file.
my @sorted = sort @ARGV;

open(my $db, '<', $dbfile) or die "Can't open $dbfile: $!";

FIRST: while (<$db>) {
  if (/^.$sorted[0]/) {
    # latitude and longitude are the 4th and 5th quoted CSV fields
    (undef, undef, undef, $lat1, $lon1) = split(",", $_);
    ($lat1) = $lat1 =~ /"([^"]*)"/;
    ($lon1) = $lon1 =~ /"([^"]*)"/;
    last FIRST;
  }
}

SECOND: while (<$db>) {
  if (/^.$sorted[1]/) {
    (undef, undef, undef, $lat2, $lon2) = split(",", $_);
    ($lat2) = $lat2 =~ /"([^"]*)"/;
    ($lon2) = $lon2 =~ /"([^"]*)"/;
    last SECOND;
  }
}

unless (defined($lat1) && defined($lat2)) {
  printf "Unknown distance\n";
  exit 1;
}

my $gis = GIS::Distance->new();
my $distance = $gis->distance( $lat1, $lon1 => $lat2, $lon2 );
printf("%d miles\n", $distance->miles());
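The heavy lifting in the script belongs to GIS::Distance, but the Haversine formula itself is small enough to sketch by hand. Here's a rough Python equivalent (my own sketch; the Earth radius value is approximate, so results will differ slightly from any given library):

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in statute miles."""
    r = 3958.8  # approximate mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    # haversine of the central angle between the two points
    a = math.sin(dphi / 2) ** 2 + \
        math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))
```

Feed it the latitude/longitude pairs from the zip code database and you get the same sort of mileage figure the Perl script prints.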

Monday, July 22, 2013

Network Toolkit

My case full of network doodads always generates lots of questions when people see it for the first time. Apple cube chargers forgotten behind hotel nightstands are where this all started; I don't carry dedicated iPhone chargers anymore.

With this kit it is immediately apparent when something is missing, so things tend to not get left behind.

The limited space has driven me to find the best and most compact solutions to all of my problems. I'm really pleased with everything that's in here. I'm also aware that it's super nerdy.

The case itself is a Duluu Essential case for iPad. It's a nice semi-rigid clamshell type case. I've made two modifications:

  1. Removed the padded "page" between the two halves. This thing was intended to keep the stuff in the pockets on the left from scratching the iPad on the right. It also served as an iPad stand.
  2. Removed the original zipper pulls and replaced them with a repair part, because the square corners of the original pulls tended to cause problems.
On the right side of the case I've installed a bit of floor padding foam (this kind of thing, but mine came from Harbor Freight Tools), with cutouts to hold all of my stuff. Cutting the foam is tedious and requires a very sharp blade.

So, what do I carry with me?

  1. Industrial Sharpies - The "industrial" version because the hyperbolic red label always cracks me up.
  2. Apple Video adapter, cheap PL2303-based serial adapter
  3. Silver Sharpie
  4. 6" micro and mini USB cables, earplugs
  5. Dell PU705 Bluetooth mouse - There is nothing special about this mouse, except that it uses two batteries, rather than one. Those batteries are AAA NiMH batteries inside AAA to AA converter sleeves so that they interchange with the batteries in item 16 and can always remain with their charge/discharge buddy. A note about NiMH batteries: It's important to get low self-discharge batteries (sometimes marketed as "pre-charged") in applications involving long periods of disuse. These types of cells tend to have a lower capacity than traditional NiMH cells.
  6. Aluminum Pill bottle - Full of pain killers. Network pain. It's a nice bottle with a threaded cap and an O-ring. Amazon seems to have screwed up the product image. The linked product is what I bought, even though the picture looks different right now.
  7. Apple MagSafe 2 Adapter
  8. NEMA 5-15R to IEC 60320-C14 Adapter - I shaved off the plastic lump opposite the NEMA receptacle's ground conductor so that it fits in the case better. Lets me charge my laptop in a server rack.
  9. Really cheap USB to Ethernet adapters from eBay. USB Device ID 0x9700 0x0fe6. I'm using the driver from here in OS X, hope it's not full of malware.
  10. Fenix LD01R2 flashlight - Surprisingly bright, contains a single alkaline AAA, so that in a pinch, I can harvest a battery from either the mouse or the serial adapter.
  11. Very short Cisco console cable. Thumbscrews replaced with nuts so that it attaches to my RS232 adapter.
  12. Weibetech Mouse Jiggler (slow version) - Looks like a USB flash drive, really it's a screen-lock defeater. It has saved the day countless times. Computers think it's a USB mouse. It moves the cursor imperceptibly, perhaps 1 pixel of movement per minute. Good for watching webinars, doing presentations, etc... Also, makes you show up as "active" in IM applications. The "fast" version is useless, except as a prank. Crazymouse!
  13. Picquick Multique - Surprisingly high quality compact screwdriver kit, made in Canada.
  14. Olfa SVR-2 - My favorite pocket knife. It's cheap, it's always sharp, it's made of stainless steel, and it has the coolest auto-locking mechanism. This knife (with one of their "ultrasharp" series replacement blades) cut the foam.
  15. New Trent IMP52D USB Battery Pack - For some reason this has a (useless) LED flashlight and a red laser pointer. 5200MAh. I've recently purchased a slightly smaller RAVPower 5600MAh unit to replace it.
  16. Schweitzer SEL 2924 Bluetooth Serial Adapter - I like this one better than my other Bluetooth serial adapter because this one runs on AAA batteries (like the other stuff here), charges AAA batteries (when the mouse batteries die, I swap them into this guy for charging), has dip switches for configuring baud rate, and, while the threaded hardware is backwards (thumbscrews), the DE-9 connector has the correct gender for my application. The charge port is micro-usb. The folks at SEL are nice, but a little weird about documentation. They seem to be afraid that terrorists will be downloading product manuals. I tried, failed to take it apart. SEL engineering reports that the case halves have ultrasonic welded seams :(
  17. Apple Thunderbolt to Gigabit Ethernet adapters - Two of them. Extra NICs support packet capture for diagnostic and performance purposes.
  18. Tiny USB flash drive
  19. MicroSD reader
  20. Apple Lightning to MicroUSB adapter
What do you carry around? Is there anything I should consider adding?

The mouse gets used so frequently that I may relegate it to a side pocket in my backpack. This would save me the hassle of pulling out the case so often, and free up some real estate. On the other hand, I admit that I get some satisfaction from plunking this case down on a conference table, because people always want to check it out.

Thursday, May 2, 2013

When CDP doesn't discover

Marko's myths of VLAN 1 post continues to drive a lot of traffic to my response about CDP, tagging and the magic properties of VLAN 1.

Today I was introduced to a related phenomenon that I found interesting.

CDP messages sent by switches are always in VLAN 1. If something other than VLAN 1 is the native VLAN on a particular trunk, then the CDP frame will be tagged with "1".

Funny Business
According to the post linked above there are related conditions that break CDP altogether.

The required elements are:
  1. A switch sending tagged CDP frames, either because it's using something other than VLAN 1 as the native VLAN or it's been configured with vlan dot1q tag native
  2. A router-on-a-stick that does NOT have a subinterface configured with encapsulation dot1q 1
Apparently the router, having no subinterface configured to receive frames tagged with "1", will toss incoming CDP frames without bothering to look inside to find the CDP message, killing CDP operation altogether. Bummer. And kind of unexpected.
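If this bites you, the implied workaround is giving frames tagged "1" somewhere to land. A hypothetical router-on-a-stick snippet (untested by me; the interface name is made up):

```
! Give frames tagged with VLAN 1 a subinterface to land on,
! so the router looks inside them and CDP keeps working.
interface GigabitEthernet0/0/0.1
 description landing spot for frames tagged VLAN 1
 encapsulation dot1q 1
```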

I haven't tested this behavior, nor do I even have an IOS-XR (where the problem was found) box available. I suspect that IOS and IOS-XE systems might have a similar problem, because I recently discovered that IOS-XE subinterfaces using VLAN 1 automatically appended the native keyword even if I didn't type it. Yes, I configured a shiny new ASR on VLAN 1. It wasn't my fault; I was integrating with an established L2 topology that required VLAN tagging.

Why did this even come up?
Frankly, I can't fathom why folks obsessively set their native VLANs to something other than the default. I've changed the native VLAN on trunks where I actually needed to pass a particular VLAN without a tag, but never as a matter of default configuration, as is so common in our industry.

Why is everyone typing switchport trunk native vlan X everywhere? Is there a good reason? If you're not using VLAN 1, and it's not allowed on the trunk, then why worry about which VLAN would be untagged on the link if it were allowed?

If there's a good reason for obsessively setting the native VLAN, please let me know in the comments. Comments including the phrase "orfg cenpgvpr" (ROT13) will be deleted. 

Tuesday, March 26, 2013

Dealing with Corrupt Opengear Firmware

It was inevitable. Now that I'm proudly compiling my own cellular router firmware, I'm also becoming familiar with the process of recovering from corrupt firmware.

I'm using an Ubuntu VM (described in the previous post) running in my MacBook for recovery purposes.

The Opengear instructions for recovering from bad firmware suggest that holding down the reset button is required, but I find that my router attempts to load firmware from the network no matter what. Maybe that's because I've wiped out my configuration? <- Update: yes, this seems to be the case. I haven't nailed it down exactly, but my router doesn't try to netboot every time.

Here's how I'm using that Ubuntu VM:

Required Packages
sudo apt-get install -y tftpd-hpa dhcp3-server

Recovery Software Image
cd /var/lib/tftpboot
sudo wget

Configure DHCP Service
sudo cp /etc/dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf.orig
cat > /tmp/foo << EOF
option domain-name "";
option subnet-mask;
subnet netmask {
  host myopengear {
    hardware ethernet 00:13:C6:xx:xx:xx;
    filename "ACM500x_Recovery.flash";
  }
}
EOF
sudo mv /tmp/foo /etc/dhcp/dhcpd.conf

Configure Static IP
It's rare that my MacBook Ethernet cable is plugged in, so my VMs are typically run in NAT mode. For this task, I'll need to run the VM in bridged mode with a fixed IP.

cat > /tmp/foo << EOF
auto lo
iface lo inet loopback
iface eth0 inet static
EOF
sudo mv /tmp/foo /etc/network/interfaces.static
sudo cp /etc/network/interfaces /etc/network/interfaces.dhcp

Switch to Bridged Mode
At this point, I switch the VM's network adapter from NAT to bridged mode using the Virtual Machine->Network Adapter pulldown menu in VMware Fusion.

In Parallels it's Virtual Machine ->Configure->Hardware->Network->Type

Now run the following to complete the change in the VM:
sudo ln -sf /etc/network/interfaces.static /etc/network/interfaces
sudo pkill dhclient3
sudo /etc/init.d/networking restart
sudo service isc-dhcp-server stop
sudo service isc-dhcp-server start

At this point, I power on the Opengear router while holding down its reset button with a pin. A few seconds later the router collects an IP address via BOOTP, and then the firmware via TFTP.

The router will be running a web service on port 80. Use that to replace the firmware.

Switch back to NAT mode
Before changing back to NAT mode in the hypervisor, do:

sudo ln -sf /etc/network/interfaces.dhcp /etc/network/interfaces
sudo service isc-dhcp-server stop
sudo /etc/init.d/networking restart

Thursday, March 21, 2013

Compiling Firmware for Opengear ACM5000

Opengear gave me two ACM5000 units as a part of my attendance at Network Field Day 4 in October of last year. The gift has not influenced my opinion of the company nor their product: I continue to think they're a bunch of amazingly clever people, and that they make the best out-of-band access equipment on the market. I recommend them without hesitation or reservation.

I've been waiting anxiously for the release of the Custom Development Kit (CDK) based on release 3.6 code, and it's finally out. The README that comes with the CDK is a bit dated, and not super easy to follow, so I'm sharing my notes on rolling custom firmware here.

I started with Ubuntu 12.04.2 Server i386 installed inside VMware Fusion on my MacBook. I pretty much took the defaults, allowing VMware to manage the install for me (how cool is this feature?)

Remote Access
Pretty soon I was looking at an Ubuntu login prompt in the VMware console, I logged in and then did:
sudo apt-get -y update
sudo apt-get -y upgrade
sudo apt-get -y install openssh-server
ifconfig eth0
Now I could log in via SSH, so I was done with the VMware console. Grab the software we need.
sudo apt-get install -y make g++ liblzma-dev
sudo mkdir -p /usr/local/download /usr/local/src
sudo chmod 1777 /usr/local/download /usr/local/src
mkdir /usr/local/download/Opengear-CDK
cd /usr/local/download/Opengear-CDK
MD5 checksums of the files I grabbed:
7b07c8a30413f4013eb9c8deb2787dcb  OpenGear-ACM500x-devkit-20130314.tar.gz
Toolchain Installation
Gzip produces an error as a part of this script, but the tarball hidden inside unrolls cleanly anyway. Weird.
yes "" | sudo sh
CDK Unroll
Unpack the CDK for the ACM5000 series. This process works for the ACM5500 too. I know because I accidentally compiled firmware for a box I don't own :)
cd /usr/local/src
tar zxvf ../download/Opengear-CDK/OpenGear-ACM500x-devkit-20130314.tar.gz
Not required for the firmware build, but I found the following helpful when cross-compiling some other packages for the ACM5000:
sudo ln -s /usr/local/opengear/arm-linux-tools-20080623/arm-linux/ /usr/local/
That's it!
Now we can build a firmware image:
cd /usr/local/src/OpenGear-ACM500x-devkit-20130314
The new firmware image should have appeared here:
$ ls -l ./images/image.bin
-rw-r--r-- 1 ogd ogd 11496469 Mar 20 20:33 ./images/image.bin
If you poke around in the romfs directory you'll find the ACM5000 filesystem, and can drop new files in there, change startup scripts, etc...

Prep local storage for the upgrade
You can use the HTTP interface for firmware upgrade, but I prefer to keep track of what's going on. First, we're going to need some local storage.

Insert a USB stick into the ACM5000. If it's already got a FAT32 filesystem on it, you can skip the partition/format steps. Around my house, you can never predict what filesystem (if any) will be on a storage device.

Partition the USB drive. It's at /dev/sda in my case, but you might want to pick through dmesg output to be sure before running these...
echo ";" | sfdisk /dev/sda
sfdisk -c /dev/sda 1 b
Format the USB drive:
mkdosfs -F 32 -I /dev/sda1
Enable the TFTP service. We don't strictly need TFTP to be enabled, but it's handy because switching it on will cause the ACM5000 to mount the USB stick automatically at boot time, and hey, who doesn't need a TFTP server hanging around?
config --set
The USB stick should now be mounted at /tmp/usbdisk. I like to have an images directory:
mkdir /tmp/usbdisk/images
Now we can scp our new software from the build VM into the ACM5000:
scp images/image.bin root@x.x.x.x:/tmp/usbdisk/images
It's probably a good idea to run a quick MD5 on both the image file on the USB stick and the one on the build workstation, even though checksum validation is part of the flash process. Once you're satisfied that they match, flash the new firmware.

The -i flag means "ignore version warnings" - without it, the ACM5000 might refuse the new firmware. The -k flag means "don't kill processes" - without this one, you won't get to watch the progress because your SSH session will be killed off right away. If the upgrade doesn't go forward, you won't know why.
netflash -i -k /tmp/usbdisk/images/image.bin

Friday, March 8, 2013

Rambling on about IP fragmentation

Fragmentation! Squarely on-topic for this blog, I guess.

An issue on a customer's network had me thinking about IP fragmentation recently, and now I find myself pounding some things that I find interesting about fragmentation into my keyboard.

Where should an oversized datagram be sliced?
RFC791 suggests a scheme by which an IP datagram is sliced up so that the resulting fragments just fit out the constraining interface. This seems sensible, but there are some gotchas:
  • If we fragment a 1500 byte packet to fit into a PPPoE link, we might wind up with 1492 bytes in the first datagram (20 bytes header, 1472 bytes payload) and 28 bytes in the second packet (20 bytes header, 8 bytes payload). This works great until that first fragment tries to transit a GRE tunnel (MTU 1476) further along its path. If the PPPoE router had chopped the datagram in half, both fragments would fit through the GRE tunnel without any problem.
  • Depending on the MTU, we might not be able to make precisely MTU-sized fragments. This is because the fragment offset value in the IP header is expressed in terms of 8-byte chunks. Every IP fragment must have an offset that's a multiple of 8, so creating a 1499 byte fragment of a 1500 byte datagram just isn't possible.
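Both gotchas fall out of a quick sketch of the RFC791-style "fill each fragment" approach. This is my own toy, not any stack's actual code:

```python
def fragment_sizes(total_len, mtu, header_len=20):
    """Split an IP datagram into fragment sizes that fit the MTU.
    Every non-final fragment's payload must be a multiple of 8 bytes,
    because the fragment-offset field counts 8-byte units."""
    payload = total_len - header_len
    # biggest payload that fits the MTU, rounded down to an 8-byte boundary
    max_payload = (mtu - header_len) // 8 * 8
    frags = []
    while payload > 0:
        chunk = min(payload, max_payload)
        frags.append(header_len + chunk)
        payload -= chunk
    return frags
```

Running fragment_sizes(1500, 1492) produces [1492, 28], the lopsided PPPoE split described above; the 8-byte granularity is also why a 1499-byte fragment of a 1500-byte datagram can't exist.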

What size is the header?
The "best" size for IP fragments might be to slice them into equal-ish sized chunks (aligned to fit on that 8 byte boundary), but even that is too simple: The IP header on the initial fragment might be a different size from the header applied to subsequent fragments, requiring the initial fragment to carry a smaller data payload than the following fragments. The issue here is IP options: some options must be present in all fragments, while other options can be omitted from all but the first fragment. Generally speaking, IP options that are used by transit devices (source routing, for example) are the ones that must be copied into every fragment. You can tell whether an option will be copied into every fragment by looking at the first bit (the copy bit) of the IP option number.
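Checking that copy bit is a one-liner. A tiny helper of my own for illustration:

```python
def option_copied_to_fragments(option_type):
    """Per RFC 791, the high-order 'copy' bit of the option-type octet
    says whether the option must be copied into every fragment."""
    return bool(option_type & 0x80)
```

For example, Loose Source Route (option type 131, 0x83) has the copy bit set, while Record Route (option type 7) does not.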

Fragments are sent in what order?
You might assume that the fragments of an IP datagram would be sent in order, starting with the fragment at offset zero. And you might be surprised.

Some versions of the Linux kernel send IP fragments in reverse order. The idea here is to effect a performance optimization (on other Linux systems, presumably). There are two facets to this strategy:
  1. Only the last fragment (the one with the MF bit set to zero) can tell you how big the whole packet is. Earlier fragments only tell you that "more" is coming, but you can't guess how much. By receiving the last fragment first, the receiver is able to optimize memory allocation for all fragments that comprise the datagram.
  2. Somehow it's easier to line the packets up and then copy them into memory when they arrive in this order. It's got something to do with not having to identify the end of the byte string, but rather jamming incoming payload right at the head of data we've already got. #NotAProgrammer
Fragments may be a nightmare at your security perimeter.
Non-first fragments can't be recognized by L4 (and higher) criteria, because the L4 header is only present in the first fragment. No surprise there.

Modern firewalls mostly have this figured out, but router ACLs certainly don't, and even devices which can do fragment reassembly for inspection purposes might require the fragments to arrive in order (fragguard feature on PIX).

There's also an interesting intrusion detection avoidance technique that involves sending fragments with dissimilar overlapping data:
  • A fragment containing bytes 0-5 might say: attach
  • A subsequent fragment containing bytes 5-6 might say: k!
So with byte #5 appearing twice, did the IDS see attach! or attack! ?
What about the target system? Linux, incidentally, will interpret this overlap as attach!
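The two reassembly policies can be modeled in a few lines. This is a toy, not any real stack's reassembly code: real fragment offsets are expressed in 8-byte units, but byte offsets are used here to match the example above.

```python
def reassemble(fragments, policy="first"):
    """Naively reassemble overlapping fragments.

    fragments: list of (byte_offset, data) tuples, in arrival order.
    policy 'first' keeps the byte from the earliest-arriving fragment;
    policy 'last' lets later fragments overwrite earlier ones.
    Assumes the fragments cover a contiguous byte range.
    """
    buf = {}
    for offset, data in fragments:
        for i, byte in enumerate(data):
            pos = offset + i
            if policy == "last" or pos not in buf:
                buf[pos] = byte
    return bytes(buf[i] for i in range(len(buf)))

frags = [(0, b"attach"), (5, b"k!")]
reassemble(frags, "first")  # b'attach!'
reassemble(frags, "last")   # b'attack!'
```

An IDS that favors one policy while the target system favors the other is blind to the string the target actually sees.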

It probably doesn't matter. Overlapping-but-dissimilar data is probably enough of a reason to kill these packets before they reach their destination, and that's what most security devices will do if they notice this sort of nonsense. Some might even make the unfortunate decision to kill the associated TCP flow or UDP pseudoflow altogether.

It's unfortunate because mismatched-and-overlapped IP fragments are not necessarily the result of malicious intent. There's only a 16-bit IP ID space, so a big server talking to lots of clients can wrap these numbers pretty quickly, and packets transiting different network paths might be fragmented into different sized chunks. It's not likely, but an attempt to reassemble fragments of different IP packets bearing the same IP ID number is a real possibility.
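The arithmetic on that wrap is sobering. At 10,000 packets per second to a single destination (an illustrative rate of my choosing, not a measurement), the 16-bit ID space recycles in seconds, comfortably inside a typical 30-second reassembly timer:

```python
IP_ID_SPACE = 2 ** 16  # the IP ID field is 16 bits: 65536 values

def seconds_to_wrap(packets_per_second: float) -> float:
    """How long before a sender reuses an IP ID at a given packet rate."""
    return IP_ID_SPACE / packets_per_second

seconds_to_wrap(10_000)  # 6.5536 -- the ID space wraps several times per timeout
```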

Fragmentation Analysis
I found myself wondering about the fragments I saw at my customer's edge recently, so I banged together a little script to visualize the packets arriving there. Basically, I'm plotting fragment size vs. time, color coding things so that I can distinguish fragments from whole packets, and picking out fragments that have arrived out of order.

The script produces some visual output. Rather than describing it, I've made a video.
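The script itself isn't reproduced here, but the classification at its heart is simple enough to sketch. This is my own reconstruction of the sort of logic involved (field names and all), not the actual script, and it ignores the plotting entirely:

```python
def classify(packets):
    """Tag each (ip_id, frag_offset, more_fragments) record.

    'whole'        : offset zero and MF clear -- an unfragmented packet.
    'fragment'     : an in-order fragment.
    'out-of-order' : a non-zero offset seen before the offset-zero
                     fragment of the same datagram arrived.
    """
    seen_first = set()  # IP IDs whose offset-zero fragment has arrived
    tags = []
    for ip_id, offset, mf in packets:
        if offset == 0 and not mf:
            tags.append("whole")
        elif offset == 0:
            seen_first.add(ip_id)
            tags.append("fragment")
        else:
            tags.append("fragment" if ip_id in seen_first else "out-of-order")
    return tags

classify([(1, 0, False), (2, 185, True), (2, 0, True)])
# ['whole', 'out-of-order', 'fragment']
```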

Monday, March 4, 2013

What did West Virginia officials buy exactly?

The state of West Virginia has landed in the headlines a couple of times recently due to its purchase of wildly inappropriate Cisco routers for over 1,000 locations around the state. There has been no shortage of commenters laying the blame at Cisco's feet, but the auditor's report disagrees. It's back in the headlines now because of the recent release of the report, and because of Cisco's pledge to buy the gear back and help the state purchase more appropriate equipment.

The auditor's report is an interesting read. Here are some of the highlights from my reading of it:

3945 routers were selected because of two key requirements communicated by the state:

  • Dual internal power supplies, which narrows the selection to 3925 and 3945 models
  • Use of two service module slots on day one, and a desire to have the option to grow beyond these two at some point in the future.

Unfortunately, these two requirements seem to have been communicated only in a meeting; no emails were available to corroborate them. Members of the team who purchased the gear acknowledge that the requirements did originate with the state. So, that's how they wound up in a 3945 chassis.

No analysis was done to understand the user base or network requirements of any of the sites. In one interesting example, 77 of these routers were delivered to the State Police, who have only made use of 2 of them. For the most part, the State Police, which is one of the bigger network users in the program, is getting by on smaller routers from a previous generation of Cisco Integrated Service Routers. The two routers that have been deployed by the WVSP were upgraded to meet the WVSP network's requirements. The rest languish in storage.

There's at least one case where one of these routers seems to have cost more than the building into which it's been deployed:
There's a $20,000 router in here
The State's Broadband Technology Opportunity Program (BTOP) Grant Implementation Team (GIT) decided that rather than doing any analysis, they'd deploy the same device everywhere. So, instead of sending a $300-$400 router into that one-room library, they sent the following to every site:

  • An $8000 router bundle (3945). About 20% of this amount is for unused voice features.
  • $750 in power supply upgrades. Also voice related.
  • $7200 for five years of 8x5xNBD support
  • A $500 "Data" routing license. Not completely out of line, but probably not required and would have cost less on a smaller router.
  • A $2600 internal 16-port layer 3 switch. Why L3 feature set on the switch? Crazy.
  • $1000 dual T1 card
  • $1600 Network Analysis Module (sniffer type thing - unused, I'm sure)
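Tallying those line items (my own arithmetic, using the rounded figures above) lands right around the per-router figure implied by the auditor's $24 million total:

```python
# Rounded per-site line items from the purchase, as listed above.
line_items = {
    "3945 router bundle (incl. voice features)": 8000,
    "power supply upgrades": 750,
    "five years of 8x5xNBD support": 7200,
    "'Data' routing license": 500,
    "16-port layer 3 switch module": 2600,
    "dual T1 card": 1000,
    "Network Analysis Module": 1600,
}
sum(line_items.values())  # 21650 dollars per site
24_000_000 / 1164         # about 20618 dollars per router, per the audit total
```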
According to the auditor's report, the state's Chief Technology Officer said that her team "decided to have all routers identically equipped." Frankly, I think this is the most damning phrase in the whole report.

How some have come to the conclusion that this debacle is Cisco's fault, I can't imagine. Anyone who has ever participated in an RFQ/RFP bid process knows that bidders don't have much opportunity to shape the requirements. I've many times wanted clarification on seemingly contradictory requirements, or to point out that a small design change could reduce the cost dramatically, but the opportunity to have these discussions just doesn't exist in these formal processes, especially when dealing with government agencies.

The State's Auditor clearly lays the blame at the feet of the folks who made the purchase, not the vendor:

the ultimate cause of the state purchasing inappropriately sized routers is that neither a capacity study nor a user need study was conducted.
Having said that, the auditor also said:
The Legislative Auditor believes that the Cisco sales representatives and engineers had a moral responsibility to propose a plan which reasonably complied with Cisco’s own engineering standards. It is the opinion of the Legislative Auditor that the Cisco representatives showed a wanton indifference to the interests of the public in recommending using $24 million of public funds to purchase 1,164 Cisco model 3945 branch routers. 
Moral responsibility? Wanton indifference? Maybe, but this sort of qualitative judgment isn't too useful when it comes to matters of business, in my opinion. Furthermore, Cisco wasn't in a position to recommend an alternative platform because no analysis of the actual requirements had been completed. The fact remains that the requirements which led to the selection of both the platform and the options installed were chosen by members of the GIT, and Cisco wasn't in a position to challenge them.

If you've read more than a few of my blog posts, you'll know that I'm passionate about understanding the Cisco product catalog and the ways to get the most from it. Unfortunately, I've over-spent my customers' money (at their insistence!) enough times that I would probably have protested a little and then wound up making that same sale, though the fact that we're talking about public funds might have elicited more protest from me than usual. Unfortunately, I've learned that you can't talk fiscal responsibility into many people if (a) they're spending their employer's money and (b) responsibility would require them to do work.

So, what did they buy exactly? Working backward from the purchase order, I've reconstructed what the configset probably looked like.

Tuesday, January 1, 2013

Offline Cable Management

Full disclosure: I got some stuff for free. Details.

In The Old Days
Cisco Catalysts used to be offered with RJ21 (Amphenol) connectors, rather than individual 8P8C jacks. Installations using this type of switch always stayed nice and clean regardless of the port density because the inevitable tangled mess of cable developed in a different rack, far away from failed fan trays, line cards, power supplies, etc...

I'm not sure why, but Cisco stopped offering line cards with RJ21 interfaces. It doesn't seem like this needed to happen: 1000BASE-T requires the same type of cable (Category 5) as 100BASE-TX, and Cisco demonstrated that the port density required for 48 gigabit ports is possible.

I worked in one environment where the tradition of remote patching continued on gigabit gear through the use of 25-pair cables terminated with six individual 8P8C connectors. Whenever a new switch or line card got installed, it was immediately populated with eight of these multi-headed copper cables. They terminated in a very large 110 block patch area. It worked well, but the Plug Pack is better.

Six Pack Rings For Network Cables
Panduit's Plug Pack modules keep your cables nicely collated, especially when a component is removed for maintenance. A squeeze of those large levers depresses each cable's retention clip, all at the same time. All of the cables maintain their relative position and can be plugged back into the switch all at once when maintenance is complete, eliminating cabling mistakes.

Plug Packs come in 6, 8 and 12 cable models, depending on the layout of the front panel of your switch:
48 port switch layout that
requires 8 position Plug Packs
48 port switch layout that
requires 6 or 12 position Plug Packs

Plug Packs can be ordered alone, or as an assembled system with the size, length, wiring, termination (of the other end) and similar details specified by the customer. I suspect that a common configuration is to order large assemblies with Plug Pack on one end and individual female jacks on the other, ready to be snapped into a patch panel.

From the ordering guide

How It Works
The plugs are retained within the Plug Pack by a single small barb on the side of each cable position and a step on the bottom (pin side) of each plug. The barb catches on the back edge of the 8P8C connector, preventing the cable from being withdrawn from the Plug Pack. It can be released by poking a small tool along the side of the cable. I've been using a chopstick for this purpose :)

The step on the bottom of the plug mates with a step at the front edge of the Plug Pack. It prevents the cables from protruding too far. You insert a cable from the back and it clicks into place. If the Plug Pack is already installed on a switch, the cable will click into the Plug Pack and the switch port at pretty much the same instant as you push it into place.

Note the barb on the right side of this cavity,
and the step at the far end of its floor.
Note the anti-snag retention clip, and the shoulder
along the bottom edge of this Panduit 8P8C connector

3rd Party Cables
Naturally, Panduit only guarantees that their own cables will interoperate with Plug Packs. I dug around in my bin of cables, and found many that fit the Plug Pack, but not all of them. Some cables don't have that step along the bottom edge, and the strain relief on many would prevent the barb from engaging.

The relevant dimensions of the plugs are:

  • 0.853" long, measured along the side of the plug, not including the small protrusion in the center
  • 0.325" from the edge of the step to the back of the plug

Cables that looked like they would work, because they had the step and no boot, were all dimensioned correctly and clicked into the Plug Pack with no problem. They were fine until I tried to remove them from the assembly, when their snag-prone retaining clips became a problem.

Other Goodies
Panduit offers a couple of nifty accessories to go with the Plug Pack: a removal tool and a "locking" wedge.

The removal tool allows for removal of an individual cable from the assembly, even while the assembly is in use, by simultaneously releasing the retention barb and depressing the lever to release the plug from the switch port. It relies on the unique anti-snag retention clip found on Panduit cable ends, and it certainly beats my chopstick!

The locking wedge fits under the release lever, and can be secured with a nylon tie. It prevents the lever from being depressed, and also prevents removal of individual cables.

Removal Tool
Lock-in thing

I don't have a removal tool, nor the wedge, and have never seen either of these parts.

The cables I'm using with my Plug Packs are Panduit's Small Diameter (SD) Category 6 cables. They're remarkably tiny, quite a bit smaller than any of the other Category 5 or better cable I've ever seen.

Comparison of Cat6 and Cat6a Small Diameter (SD) cables vs. typical cables.

While these are certainly nice cables, I'm a little fuzzy on who should be buying a Category 6 cable. What's the use case for these things? Surely nobody would buy Category 6 with TSB-155 in mind, so what's the point? For that matter, what's the point of Category 5e? Maybe there's a non-network use (video?) for these cable types.

Update: 2014-02-06: I recently learned that 802.3at (PoE Plus) requires Cat5e cable. So, that explains the point of Cat5e these days.

Use At Home
Thanks to Panduit's generosity, I'm using Plug Packs at home. My use case is probably not what was intended when the product was created, because I'm not dealing with super high density cable management.

All of the major Cisco training vendors expect students to be working on a rack of equipment including four switches with 12-18 links between those switches, but they don't agree on which ports connect between the various devices. Switching from one vendor's workbook to another requires re-cabling of the switch interconnects. Before I got the Plug Packs, this was a tedious and error-prone job. Now I just pull the cables off the switches and set them aside. The Plug Packs keep my physical topology intact, even when it's not connected to the switches!

Four switches cabled up according to the IP Expert topology
I use different colored Plug Packs for each training vendor, so re-cabling the topology is no problem now. It only takes a few seconds to set one bundle of cable aside and snap in a new one.

Full Disclosure
I liked this product based on seeing it in the catalog and on display at a trade show. Its potential for streamlining hardware maintenance and moving messy cabling away from the switch was obvious. Less obvious (until recently) was how I'd be able to use it in my lab. I wanted to blog about it, but I'd never used a Plug Pack, or even held one in my hand.

So, I contacted Panduit, and asked for samples. I explained that I wanted to use Plug Packs in a personal project, and that I wanted to write about it here. I gave them my wishlist, and a few days later a box of goodies arrived.

This is not the first time I've written about a Panduit product, nor is it the first time they've given me a product sample: They sent me a CDE-2 sample for one of my customers after I wrote about overheating Nexus 2xxx.