Tuesday, March 26, 2013

Dealing with Corrupt Opengear Firmware

It was inevitable. Now that I'm proudly compiling my own cellular router firmware, I'm also becoming familiar with the process of recovering from corrupt firmware.

I'm using an Ubuntu VM (described in the previous post) running in my MacBook for recovery purposes.

The Opengear instructions for recovering from bad firmware suggest that holding down the reset button is required, but I found that my router attempted to load firmware from the network no matter what. Maybe that's because I'd wiped out my configuration? Update: yes, that seems to be the case. I haven't nailed it down exactly, but with a configuration in place my router doesn't try to netboot every time.

Here's how I'm using that Ubuntu VM:

Required Packages
sudo apt-get install -y tftpd-hpa isc-dhcp-server

Recovery Software Image
cd /var/lib/tftpboot
sudo wget ftp://ftp.opengear.com/release/recovery/ACM500x_Recovery.flash

Configure DHCP Service
sudo cp /etc/dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf.orig
cat > /tmp/foo << EOF
option domain-name "opengear-recovery.com";
option subnet-mask 255.255.255.0;
# Example recovery subnet - adjust these addresses to suit your LAN.
subnet 192.168.1.0 netmask 255.255.255.0 {
  host myopengear {
    hardware ethernet 00:13:C6:xx:xx:xx;
    filename "ACM500x_Recovery.flash";
  }
}
EOF
sudo mv /tmp/foo /etc/dhcp/dhcpd.conf

Configure Static IP
It's rare that my MacBook Ethernet cable is plugged in, so my VMs are typically run in NAT mode. For this task, I'll need to run the VM in bridged mode with a fixed IP.

cat > /tmp/foo << EOF
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
  # Example address on the recovery subnet - adjust to match dhcpd.conf.
  address 192.168.1.1
  netmask 255.255.255.0
EOF
sudo mv /tmp/foo /etc/network/interfaces.static
sudo cp /etc/network/interfaces /etc/network/interfaces.dhcp

Switch to Bridged Mode
At this point, I switch the VM's network adapter from NAT to bridged mode using the Virtual Machine->Network Adapter pulldown menu in VMware Fusion.

In Parallels, it's Virtual Machine->Configure->Hardware->Network->Type.

Now run the following to complete the change in the VM:
sudo ln -sf /etc/network/interfaces.static /etc/network/interfaces
sudo pkill dhclient
sudo /etc/init.d/networking restart
sudo service isc-dhcp-server stop
sudo service isc-dhcp-server start

At this point, I power on the Opengear router while holding down its reset button with a pin. A few seconds later the router collects an IP address via BOOTP, and then the firmware via TFTP.

The router will now be running a web service on port 80. Use that to replace the firmware.

Switch back to NAT mode
Before changing back to NAT mode in the hypervisor, do:

sudo ln -sf /etc/network/interfaces.dhcp /etc/network/interfaces
sudo service isc-dhcp-server stop
sudo /etc/init.d/networking restart

Thursday, March 21, 2013

Compiling Firmware for Opengear ACM5000

Opengear gave me two ACM5000 units as a part of my attendance at Network Field Day 4 in October of last year. The gift has not influenced my opinion of the company or its product: I continue to think they're a bunch of amazingly clever people, and that they make the best out-of-band access equipment on the market. I recommend them without hesitation or reservation.

I've been waiting anxiously for the release of the Custom Development Kit (CDK) based on release 3.6 code, and it's finally out. The README that comes with the CDK is a bit dated, and not super easy to follow, so I'm sharing my notes on rolling custom firmware here.

I started with Ubuntu 12.04.2 Server i386 installed inside VMware Fusion on my MacBook. I pretty much took the defaults, allowing VMware to manage the install for me (how cool is this feature?)

Remote Access
Pretty soon I was looking at an Ubuntu login prompt in the VMware console. I logged in and then did:
sudo apt-get -y update
sudo apt-get -y upgrade
sudo apt-get -y install openssh-server
ifconfig eth0
Now I could log in via SSH, so I was done with the VMware console. Next, grab the software we need:
sudo apt-get install -y make g++ liblzma-dev
sudo mkdir -p /usr/local/download /usr/local/src
sudo chmod 1777 /usr/local/download /usr/local/src
mkdir /usr/local/download/Opengear-CDK
cd /usr/local/download/Opengear-CDK
wget ftp://ftp.opengear.com/cdk/OpenGear-ACM500x-devkit-20130314.tar.gz
wget ftp://ftp.opengear.com/cdk/tools/arm-linux-tools-20080623.sh
MD5 checksums of the files I grabbed:
040b2318025adcd956b6bb836791a107  arm-linux-tools-20080623.sh
7b07c8a30413f4013eb9c8deb2787dcb  OpenGear-ACM500x-devkit-20130314.tar.gz
Toolchain Installation
Gzip produces an error as a part of this script, but the tarball hidden inside unrolls cleanly anyway. Weird.
yes "" | sudo sh arm-linux-tools-20080623.sh
CDK Unroll
Unpack the CDK for the ACM5000 series. This process works for the ACM5500 too. I know because I accidentally compiled firmware for a box I don't own :)
cd /usr/local/src
tar zxvf ../download/Opengear-CDK/OpenGear-ACM500x-devkit-20130314.tar.gz
Not required for the firmware build, but I found the following helpful when cross-compiling some other packages for the ACM5000:
sudo ln -s /usr/local/opengear/arm-linux-tools-20080623/arm-linux/ /usr/local/
That's it!
Now we can build a firmware image:
cd /usr/local/src/OpenGear-ACM500x-devkit-20130314
make
The new firmware image should have appeared here:
$ ls -l ./images/image.bin
-rw-r--r-- 1 ogd ogd 11496469 Mar 20 20:33 ./images/image.bin
If you poke around in the romfs directory you'll find the ACM5000 filesystem, and can drop new files in there, change startup scripts, etc...

Prep local storage for the upgrade
You can use the HTTP interface for firmware upgrade, but I prefer to keep track of what's going on. First, we're going to need some local storage.

Insert a USB stick into the ACM5000. If it's already got a FAT32 filesystem on it, you can skip the partition/format steps. Around my house, you can never predict what filesystem (if any) will be on a storage device.

Partition the USB drive. It's at /dev/sda in my case, but you might want to pick through dmesg output to be sure before running these...
echo ";" | sfdisk /dev/sda
sfdisk -c /dev/sda 1 b
Format the USB drive:
mkdosfs -F 32 -I /dev/sda1
Enable the TFTP service. We don't strictly need TFTP to be enabled, but it's handy because switching it on will cause the ACM5000 to mount the USB stick automatically at boot time, and hey, who doesn't need a TFTP server hanging around?
config --set config.services.tftp.enabled=on
The USB stick should now be mounted at /tmp/usbdisk. I like to have an images directory:
mkdir /tmp/usbdisk/images
Now we can scp our new software from the build VM into the ACM5000:
scp images/image.bin root@x.x.x.x:/tmp/usbdisk/images
It's probably a good idea to run a quick MD5 on both the image file on the USB stick and the one on the build workstation, even though checksum validation is part of the flash process. Once you're satisfied that they match, flash the new firmware.

The -i flag means "ignore version warnings" - without it, the ACM5000 might refuse the new firmware. The -k flag means "don't kill processes" - without this one, your SSH session will be killed off right away, so you won't get to watch the progress, and if the upgrade doesn't go forward, you won't know why.
netflash -i -k /tmp/usbdisk/images/image.bin

Friday, March 8, 2013

Rambling on about IP fragmentation

Fragmentation! Squarely on-topic for this blog, I guess.

An issue on a customer's network had me thinking about IP fragmentation recently, and now I find myself pounding some things that I find interesting about fragmentation into my keyboard.

Where should an oversized datagram be sliced?
RFC791 suggests a scheme by which an IP datagram is sliced up so that the resulting fragments just fit out the constraining interface. This seems sensible, but there are some gotchas:
  • If we fragment a 1500 byte packet to fit into a PPPoE link, we might wind up with 1492 bytes in the first datagram (20 bytes header, 1472 bytes payload) and 28 bytes in the second packet (20 bytes header, 8 bytes payload). This works great until that first fragment tries to transit a GRE tunnel (MTU 1476) further along its path. If the PPPoE router had chopped the datagram in half, both fragments would fit through the GRE tunnel without any problem.
  • Depending on the MTU, we might not be able to make precisely MTU-sized fragments. This is because the fragment offset value in the IP header is expressed in terms of 8-byte chunks. Every IP fragment must have an offset that's a multiple of 8, so creating a 1499 byte fragment of a 1500 byte datagram just isn't possible.
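The arithmetic above can be sketched in a few lines of shell (my own illustration, using the PPPoE and GRE MTUs from the first bullet): slice a 1500-byte datagram's 1480-byte payload into the largest 8-byte-aligned chunks that fit the MTU.

```shell
# Fragment a 1500-byte datagram (20-byte header + 1480-byte payload)
# for a given MTU. Per-fragment payload must be a multiple of 8 bytes.
frag() {
  mtu=$1
  remaining=1480                     # payload bytes left to send
  per=$(( (mtu - 20) / 8 * 8 ))      # largest 8-byte-aligned payload per fragment
  while [ "$remaining" -gt 0 ]; do
    [ "$remaining" -lt "$per" ] && per=$remaining
    echo "MTU $mtu: fragment of $(( per + 20 )) bytes on the wire"
    remaining=$(( remaining - per ))
  done
}

frag 1492   # PPPoE link: a 1492-byte fragment and a 28-byte fragment
frag 1476   # GRE tunnel: a 1476-byte fragment and a 44-byte fragment
```

Note that the 1492-byte fragment produced for the PPPoE link is exactly the one that won't fit through the 1476-byte GRE tunnel.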

What size is the header?
The "best" size for IP fragments might be to slice them into equal-ish sized chunks (aligned to fit on that 8 byte boundary), but even that is too simple: The IP header on the initial fragment might be a different size from the header applied to subsequent fragments, requiring the initial fragment to carry a smaller data payload than the following fragments. The issue here is IP options: some options must be present in all fragments, while other options can be omitted from all but the first fragment. Generally speaking, IP options that are used by transit devices (source routing, for example) are the ones that must be copied into every fragment. You can tell whether an option will be copied into every fragment by looking at the first bit (the copy bit) of the IP option number.
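That copy bit is easy to check. Here's a small sketch of my own (the helper function is just an illustration; the option numbers are the standard RFC 791 values) that extracts the high-order bit of the option-type octet:

```shell
# The "copied" flag is the high-order bit of the IP option-type octet.
# When it's set, the option must be replicated into every fragment.
copy_bit() { echo $(( ($1 & 128) / 128 )); }

echo "LSRR (option 131): copy bit $(copy_bit 131)"       # 1 - every fragment
echo "Record Route (option 7): copy bit $(copy_bit 7)"   # 0 - first fragment only
```

Loose Source Routing, used by transit devices, has the bit set; Record Route doesn't need to be in every fragment, so it doesn't.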

Fragments are sent in what order?
You might assume that the fragments of an IP datagram would be sent in order, starting with the fragment at offset zero. And you might be surprised.

Some versions of the Linux kernel send IP fragments in reverse order. The idea here is to effect a performance optimization (on other Linux systems, presumably). There are two facets to this strategy:
  1. Only the last fragment (the one with the MF bit set to zero) can tell you how big the whole packet is. Earlier fragments only tell you that "more" is coming, but you can't guess how much. By receiving the last fragment first, the receiver is able to optimize memory allocation for all fragments that comprise the datagram.
  2. Somehow it's easier to line the packets up and then copy them into memory when they arrive in this order. It's got something to do with not having to identify the end of the byte string, but rather jamming incoming payload right at the head of data we've already got. #NotAProgrammer
Fragments may be a nightmare at your security perimeter
Non-first fragments can't be recognized by L4 and higher means because the L4 header is only present in the first fragment. No surprise there.

Modern firewalls mostly have this figured out, but router ACLs certainly don't, and even devices which can do fragment reassembly for inspection purposes might require the fragments to arrive in order (the FragGuard feature on Cisco's PIX, for example).

There's also an interesting intrusion detection avoidance technique that involves sending fragments with dissimilar overlapping data:
  • A fragment containing bytes 0-5 might say: attach
  • A subsequent fragment containing bytes 5-6 might say: k!
So with byte #5 appearing twice, did the IDS see attach! or attack!?
What about the target system? Linux, incidentally, will interpret this overlap as attach!
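The two reassembly policies are easy to simulate with dd (an illustration only, obviously - real IP stacks don't reassemble with dd, and real fragment offsets are multiples of 8):

```shell
# Fragment A carries bytes 0-5 ("attach"); fragment B carries bytes 5-6 ("k!").
# A reassembly policy decides whose data wins in the overlapping byte.

# "New data wins": lay down A, then let B overwrite byte 5.
printf 'attach' > /tmp/new_wins
printf 'k!' | dd of=/tmp/new_wins bs=1 seek=5 conv=notrunc 2>/dev/null
echo "new data wins: $(cat /tmp/new_wins)"    # attack!

# "Old data wins" (the Linux behavior described above): B first, A on top.
printf '.......' > /tmp/old_wins
printf 'k!' | dd of=/tmp/old_wins bs=1 seek=5 conv=notrunc 2>/dev/null
printf 'attach' | dd of=/tmp/old_wins bs=1 seek=0 conv=notrunc 2>/dev/null
echo "old data wins: $(cat /tmp/old_wins)"    # attach!
```

Same two fragments, two different reassembled payloads - which is exactly why an IDS that doesn't model the target's policy can be fooled.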

It probably doesn't matter. Overlapping-but-dissimilar data is probably enough of a reason to kill these packets before they reach their destination, and that's what most security devices will do if they notice this sort of nonsense. Some might even make the unfortunate decision to kill the associated TCP flow or UDP pseudoflow altogether.

It's unfortunate because mismatched-and-overlapped IP fragments are not necessarily the result of malicious intent. There's only a 16-bit IP ID space, so a big server talking to lots of clients can wrap these numbers pretty quickly, and packets transiting different network paths might be fragmented into different sized chunks. It's not likely, but an attempt to reassemble fragments of different IP packets bearing the same IP ID number is a real possibility.
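To put a rough number on "pretty quickly" - here's a back-of-the-envelope sketch, where the packet rate is an arbitrary figure of my own for illustration:

```shell
# How quickly does a busy sender burn through the 16-bit IP ID space?
# The 10,000 packets/sec rate is an assumption for illustration.
ids=65536
pps=10000
echo "at $pps pps, the IP ID space wraps every $(( ids * 1000 / pps )) ms"
```

At that rate the ID space wraps in about six and a half seconds, so two fragments carrying the same ID but belonging to different datagrams can easily coexist within a receiver's reassembly timeout.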

Fragmentation Analysis
I found myself wondering about the fragments I saw at my customer's edge recently, so I banged together a little script to visualize the packets arriving at the edge. Basically, I'm plotting fragment size vs. time, and color coding things so that I can recognize fragments from whole packets, and to pick out fragments which have arrived out of order.

The script produces some visual output. Rather than describing it, I've made a video.

Monday, March 4, 2013

What did West Virginia officials buy exactly?

The state of West Virginia has landed in the headlines a couple of times recently due to their purchase of wildly inappropriate Cisco routers for over 1000 locations around the state. There has been no shortage of commenters laying the blame at Cisco's feet, but the auditor's report disagrees. It's back in the headlines now because of the recent release of the report, and because of Cisco's pledge to buy the gear back and help the state buy more appropriate equipment.

The auditor's report is an interesting read. Here are some of the highlights from my reading of it:

3945 routers were selected because of two key requirements communicated by the state:

  • Dual internal power supplies, which narrows the selection to 3925 and 3945 models
  • Use of two service module slots on day one, and a desire to have the option to grow beyond these two at some point in the future.

Unfortunately, these two requirements seem to have been communicated in a meeting, because no emails were available to corroborate them, but members of the team who purchased the gear acknowledge that the requirements did originate with the state. So, that's how they wound up in a 3945 chassis.

No analysis was done to understand the user base or network requirements of any of the sites. In one interesting example, 77 of these routers were delivered to the State Police, who have only made use of 2 of them. For the most part, the State Police, which is one of the bigger network users in the program, is getting by on smaller routers from a previous generation of Cisco Integrated Service Routers. The two routers that have been deployed by the WVSP were upgraded to meet the WVSP network's requirements. The rest languish in storage.

There's at least one case where one of the routers seems to have cost more than the building into which it's been deployed:
There's a $20,000 router in here
The State's Broadband Technology Opportunity Program (BTOP) Grant Implementation Team (GIT) decided that rather than doing any analysis, they'd deploy the same device everywhere. So, instead of sending a $300-$400 router into that one-room library, they sent the following to every site:

  • An $8000 router bundle (3945). About 20% of this amount is for unused voice features.
  • $750 in power supply upgrades. Also voice related.
  • $7200 for five years of 8x5xNBD support
  • A $500 "Data" routing license. Not completely out of line, but probably not required and would have cost less on a smaller router.
  • A $2600 internal 16-port layer 3 switch. Why L3 feature set on the switch? Crazy.
  • $1000 dual T1 card
  • $1600 Network Analysis Module (sniffer type thing - unused, I'm sure)
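Adding up those line items (the figures as listed above; real-world discounts and support pricing will obviously vary) shows where the roughly-$20,000-per-site figure comes from:

```shell
# Sum of the per-site line items listed above.
total=$(( 8000 + 750 + 7200 + 500 + 2600 + 1000 + 1600 ))
echo "per-site total: \$${total}"               # prints $21650
echo "times 1,164 sites: \$$(( total * 1164 ))" # prints $25200600
```

That's list-price arithmetic, but it lands in the same neighborhood as the $24 million figure cited in the auditor's report.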
According to the auditor's report, the state's Chief Technology Officer said that her team "decided to have all routers identically equipped." Frankly, I think this is the most damning phrase in the whole report.

How some have come to the conclusion that this debacle is Cisco's fault, I can't imagine. Anyone who has ever participated in an RFQ/RFP bid process knows that bidders don't have the opportunity to shape the requirements much. I've many times wanted clarification on some seemingly contradictory requirements, or to point out that a little design change could reduce the cost dramatically, but the opportunity to have these discussions just doesn't exist in these formal processes, especially when dealing with government agencies.

The State's Auditor clearly lays the blame at the feet of the folks who made the purchase, not the vendor:

the ultimate cause of the state purchasing inappropriately sized routers is that neither a capacity study nor a user need study was conducted.
Having said that, the auditor also said:
The Legislative Auditor believes that the Cisco sales representatives and engineers had a moral responsibility to propose a plan which reasonably complied with Cisco’s own engineering standards. It is the opinion of the Legislative Auditor that the Cisco representatives showed a wanton indifference to the interests of the public in recommending using $24 million of public funds to purchase 1,164 Cisco model 3945 branch routers. 
Moral responsibility? Wanton indifference? Maybe, but these sorts of qualitative judgements aren't too useful when it comes to matters of business, in my opinion. Furthermore, Cisco wasn't in a position to recommend an alternative platform because no analysis of the actual requirements had been completed. The fact remains that the requirements which led to the selection of both the platform and the options installed were chosen by members of the GIT, and Cisco wasn't in a position to challenge them.

If you've read more than a few of my blog posts, you'll know that I'm passionate about understanding the Cisco product catalog and the ways to get the most from it. Unfortunately, I've over-spent my customers' money (at their insistence!) enough times that I would probably have protested a little and then wound up making that same sale, though the fact that we're talking about public funds might have elicited more protest from me than usual. Unfortunately, I've learned that you can't talk fiscal responsibility into many people if (a) they're spending their employer's money and (b) responsibility would require them to do work.

So, what did they buy exactly? Working backward from the purchase order, I've reconstructed what the configset probably looked like.