Wednesday, April 22, 2015

Controlling HP Moonshot with ipmitool

I've been driving the HP Moonshot environment over the network with ipmitool, and found it not altogether straightforward. One of the HP engineers told me:
Yeah, we had to jump through some hoops to extend IPMI’s single-system view of the world into our multi-node architecture.
That is exactly why it's confusing. Everything here works reasonably well, but users have to jump through all of the hoops that the product engineers lined up for us.

The build of ipmitool that ships with OS X (2.5b1) doesn't support the Moonshot's double-bridged topology, so I'm using the one that ships with macports (1.8.12). To check whether your version of ipmitool is compatible, run ipmitool -h and look to see whether it supports both the single-bridge (-b, -t) and double-bridge (-B, -T) command line options. If it does, then it's probably okay.
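That check can be scripted. What follows is only a rough sketch: the helper name has_bridge_flags is made up for illustration, and the pattern assumes the help text lists each option at the start of its own line, which may not hold for every build.

```shell
# Hypothetical helper: succeeds only if the help text on stdin
# lists all four bridging options (-b, -t, -B, -T).
has_bridge_flags() {
    help_text=$(cat)
    for flag in -b -t -B -T; do
        # Assumes each option starts its own line, e.g. "  -B channel ..."
        printf '%s\n' "$help_text" |
            grep -q -e "^[[:space:]]*${flag}[[:space:]]" || return 1
    done
    return 0
}

# Usage (2>&1 covers builds that print their help on stderr):
#   ipmitool -h 2>&1 | has_bridge_flags && echo "bridging options present"
```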

Using IPMI over the network with a regular rack server is pretty straightforward. You specify the device by name or IP, the user credentials and the command/query you want to run. That's about it. Such a command might look like this:

 ipmitool -H <IPMI_IP> -U <user> -P <password> -I lanplus chassis identify force  

The command above turns on the beacon LED on a server. Most of the options here are obvious. The -I lanplus specifies that we intend to speak over the LAN to a remote host, rather than use IPMI features that may be accessible from within the running OS on the machine. Subsequent examples don't use the -P <password> option. Instead, I use -E, which tells ipmitool to read the password from the IPMI_PASSWORD environment variable.

Moonshot is quite a bit more complicated than a typical rack mount server. Here's a diagram of the topology from the HP iLO Chassis Management IPMI User Guide:
Moonshot IPMI Topology
While the identify command example does work against Moonshot, it turns on the chassis beacon LED. There are also beacon LEDs on each cartridge and on each switch. To manipulate those LEDs, we need to bridge the commands through the Zone MC to the various devices on the IPMB0 bus.

First, let's get an inventory from the perspective of the Zone MC:

 $ ipmitool -H <IPMI_IP> -EU Administrator -I lanplus sdr list all  
 ZoMC            | Static MC @ 20h    | ok  
 254             | Log FRU @FEh f0.60 | ok  
 IPMB0 Phys Link | 0x00               | ok  
 ChasMgmtCtlr1   | Static MC @ 44h    | ok  
 PsMgmtCtlr1     | Dynamic MC @ 52h   | ok  
 PsMgmtCtlr2     | Dynamic MC @ 54h   | ok  
 PsMgmtCtlr3     | Dynamic MC @ 56h   | ok  
 PsMgmtCtlr4     | Dynamic MC @ 58h   | ok  
 CaMC            | Static MC @ 82h    | ok  
 CaMC            | Static MC @ 84h    | ok  
 CaMC            | Static MC @ 86h    | ok  
 Switch MC       | Static MC @ 68h    | ok  
 Switch MC       | Static MC @ 6Ah    | ok  

From this, we can see that the Zone MC, Chassis MC, and first power supply MC are all at the addresses we'd expect based on having reviewed HP's drawing. Additionally, we can see the addresses of the remaining power supplies, the switches, and the cartridges (I snipped the output after the first three cartridges).
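If you want those addresses in a form a script can consume, something like the following might do. It's a sketch only: list_mc_addresses is a hypothetical name, and it assumes the "Static MC @ 82h" layout shown in the listing above.

```shell
# Hypothetical helper: reads `sdr list all` output on stdin and prints
# "name<TAB>0xADDR" for each management controller record.
list_mc_addresses() {
    awk -F'|' '/MC @ [0-9A-Fa-f]+h/ {
        name = $1
        gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the name
        addr = $2
        sub(/.*@ /, "", addr)               # drop everything up to "@ "
        sub(/h.*/, "", addr)                # drop the "h" suffix and padding
        printf "%s\t0x%s\n", name, tolower(addr)
    }'
}

# Usage:
#   ipmitool -H <IPMI_IP> -EU Administrator -I lanplus sdr list all | list_mc_addresses
```

Records that aren't management controllers (the Log FRU and IPMB0 Phys Link lines) are skipped.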

You can learn more about each of those discovered devices with:

 ipmitool -H <IPMI_IP> -EU Administrator -I lanplus fru print  

I've not yet figured out how to relate the cartridge and switch MCs to physical slot numbers other than by flipping on and off the beacon LEDs, or inspecting serial numbers. I think it's supposed to be possible with the picmg addrinfo command, but I've yet to figure out how to relate that output to physical cartridge and switch slots.

Okay, there's one more thing to note in the table above: the IPMB0 bus to which all of our downstream controllers are attached is channel 0x00. We need to know the channel number because these controllers potentially have many interfaces. When sending bridged commands, we need to send both the channel number and the target address.

So, now we've got everything we need in order to flip on the beacon LED at cartridge #1:

 ipmitool -H <IPMI_IP> -EU Administrator -I lanplus -b 0 -t 0x82 chassis identify force  

Yes, the command is chassis identify, but it doesn't illuminate the chassis LED. That's because the command is executing within the context of a cartridge controller. The command above should light the LED on cartridge #1.
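Since flipping beacons is the only way I've found to match controller addresses to slots, a small generator for those commands can save typing. This is a dry-run sketch: beacon_sweep_cmds is a hypothetical name, 192.0.2.10 is a placeholder address, and the 15 second duration is an arbitrary choice. It only prints commands; nothing is sent to the chassis.

```shell
# Hypothetical helper: prints (does not run) one single-bridged identify
# command per cartridge controller address given as an argument.
beacon_sweep_cmds() {
    host=$1
    shift
    for addr in "$@"; do
        # -b 0 is the IPMB0 channel; 15 is the identify duration in seconds
        echo "ipmitool -H $host -EU Administrator -I lanplus -b 0 -t $addr chassis identify 15"
    done
}

beacon_sweep_cmds 192.0.2.10 0x82 0x84 0x86
```

Review the printed commands, then run them one at a time while watching the chassis.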

Cool, so we're now talking through the Zone MC to the individual cartridges, switches and power supplies! But what about the servers? Moonshot supports multiple servers per cartridge, so we're still one hop away. That's why we need double bridging.

Double Bridging
Double bridged commands work the same as single bridged, except that we have to specify the channel number and target address at each of two layers. The first hop is specified with -B and -T, second hop with -b and -t.

First, we need to get the layout of a cartridge controller. We'll run the sdr list all command again, but bridge it through to the cartridge in slot 1:

 $ ipmitool -H <IPMI_IP> -EU Administrator -I lanplus -b 0 -t 0x82 sdr list all  
 01-Front Ambient | 27 degrees C   | ok  
 02-CPU           | 0 degrees C    | ok  
 03-DIMM 1        | 26 degrees C   | ok  
 04-DIMM 2        | 26 degrees C   | ok  
 05-DIMM 3        | 28 degrees C   | ok  
 06-DIMM 4        | 27 degrees C   | ok  
 07-HDD Zone      | 27 degrees C   | ok  
 08-Top Exhaust   | 26 degrees C   | ok  
 09-CPU Exhaust   | 27 degrees C   | ok  
 CaMC             | Static MC @ 82h  | ok  
 SnMC             | Static MC @ 72h  | ok  
 SnMC 1           | Log FRU @01h c1.62 | ok  

This is a single-node cartridge (m300 cartridges are all I've got to play with), but, consistent with quad-node cartridges, they require a bridging hop. The SnMC at 0x72 refers to the lone server on this cartridge. I assume that multi-node cartridges would list several SnMC resources here.

Unfortunately, when the sdr list all command is run against the cartridge controller, it doesn't reveal anything about the downstream transit channel like it did when we ran it against the Zone MC. The channel number we need for the second bridge hop is 7. It's documented in chapter 3 of the HP iLO Chassis Management IPMI User Guide.

So, putting this all together, we'll set node 1 on cartridge 1 to boot from its internal HDD, and then set it to boot just once via PXE:

 $ ipmitool -H <IPMI_IP> -EU Administrator -B 0 -T 0x82 -b 7 -t 0x72 -I lanplus chassis bootdev disk options=persistent  
 $ ipmitool -H <IPMI_IP> -EU Administrator -B 0 -T 0x82 -b 7 -t 0x72 -I lanplus chassis bootdev pxe  
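Typing the four bridging options for every node command gets old. Here's a sketch of a helper that builds them; node_bridge_opts is a hypothetical name, and the channel numbers (0 and 7) are simply the ones used above.

```shell
# Hypothetical helper: prints the ipmitool bridging options for a node.
# Arguments: cartridge controller address, node controller address.
node_bridge_opts() {
    cartridge_addr=$1
    node_addr=$2
    # First hop: channel 0 (IPMB0) to the cartridge controller.
    # Second hop: channel 7 from the cartridge to the node controller.
    echo "-B 0 -T $cartridge_addr -b 7 -t $node_addr"
}

# Usage (left unquoted on purpose so the shell splits the options):
#   ipmitool -H <IPMI_IP> -EU Administrator -I lanplus $(node_bridge_opts 0x82 0x72) chassis power status
```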

Some Other Useful Commands
Non-bridged commands:

  • lan print 
  • sel list
  • chassis status
  • chassis identify (lights the LED for 15 seconds)
  • chassis identify <duration> (0 turns off the LED)
  • chassis identify force (lights the LED permanently) 

Single-bridged commands:

  • chassis status
  • chassis identify (all variants above)

Double-bridged node commands:

  • chassis power status
  • chassis power on
  • chassis power off
  • chassis power cycle (only works when node power is on)
  • sol activate (connects you to the node console via Serial-Over-LAN)
  • sol deactivate (kills an active sol session)
I've found that the web interface doesn't indicate beacon LED status when IPMI sets it to expire (the default behavior).

Attempts to use the Virtual Serial Port from the iLO command-line fail when an IPMI SOL session is active. The iLO CLI prompts you to "acquire" the session, but this fails too.

Setting node boot and power options too quickly (one after the other) seems to cause them to fail.
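One workaround is to pause between commands. The sketch below does that; run_spaced is a hypothetical name and the 5 second default is a guess, not a value I've measured or seen documented.

```shell
# Hypothetical helper: runs each argument as a command line, pausing
# between them. PAUSE_SECONDS is an assumed, untested default.
PAUSE_SECONDS=${PAUSE_SECONDS:-5}

run_spaced() {
    first=1
    for cmd in "$@"; do
        # Sleep before every command except the first
        [ "$first" -eq 1 ] || sleep "$PAUSE_SECONDS"
        first=0
        # Stop at the first failing command
        sh -c "$cmd" || return 1
    done
}

# Usage:
#   run_spaced "ipmitool ... chassis bootdev pxe" "ipmitool ... chassis power on"
```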

Node boot order settings configured via IPMI while node power is off work, but the iLO command line doesn't recognize that they've happened until the node is powered on.

Update 2016/01/25
Version 1.40 of the Moonshot chassis manager firmware introduced the possibility of creating Operator class users who are restricted to viewing/manipulating only a subset of cartridges. This has been handy in a development environment, but there are a couple of gotchas to using IPMI capabilities as an Operator user.

The first gotcha is that the user needs to explicitly declare the intended privilege level by specifying -L OPERATOR at the ipmitool command line. I'm not clear on why the privilege level can't be inferred at the chassis manager by looking at the passed credentials, but apparently it cannot.

The second gotcha: By default, the SOL capability requires ADMINISTRATOR class privilege to operate. You can see this by sending the sol info command via ipmitool as an ADMINISTRATOR class user. This requirement seems odd to me: OPERATORs are allowed to interact with the virtual serial port through the SSH interface without any additional configuration.

It is possible to allow OPERATOR users to use the IPMI SOL capability by changing the required privilege level. Do that by sending sol set privilege-level operator via ipmitool with ADMINISTRATOR credentials.

Tuesday, April 14, 2015

The Verizon SuperCookie Won't Go Away

Update 4/21/2015:
It's been pointed out to me that Relevant Mobile Advertising (RMA - the thing responsible for the SuperCookie) and Customer Proprietary Network Information (CPNI) are not the same thing. That may be, but the link in the opt out instructions on Verizon's RMA info page goes to the CPNI settings below. If there's an RMA opt-out lever available to me somewhere, I sure can't find it. I spoke with a new Verizon phone rep today. She claims to have sorted things out. My HTTP traffic still has the extra header attached. We'll see if that changes in the next few days...
Verizon Wireless made the news a few months ago when somebody noticed that they were adding extra HTTP headers which uniquely identified subscribers to every web request which traversed their network.

There was something of an uproar about it. I checked at least one of my phones, and was disappointed to find the tracking header attached to my traffic.

Then, less than two weeks ago, Verizon announced that customers would be allowed to opt out of having their web requests marked in this way. Many news outlets covered the announcement, Twitter rejoiced, and I headed over to my account's privacy settings page to opt out. I found that my already-paranoid privacy settings looked like this:

Don't Share!
My account was already configured for no information sharing (I had set these levers years ago), and when I checked using the browser in my phone, I found that the tracking header had disappeared.

So, maybe it wasn't that there was a new privacy option, so much as it was a case of Verizon acknowledging that uniquely identifying customers to every website they visit just miiiiight constitute sharing of Customer Proprietary Network Information as defined by 47 U.S. Code § 222. With this policy change, customers who'd opted out of CPNI sharing would have their information protected.

Also, the $4.7M smackdown Verizon received for CPNI oversharing a few months prior may have had something to do with their sudden clarity on the matter.

Several US Senators (Sen. Bill Nelson, D-Fla., Sen. Edward Markey, D-Mass., and Sen. Richard Blumenthal, D-Conn.) have now urged the FCC to investigate Verizon's "SuperCookie", and FCC chairman Wheeler has reportedly responded by saying:
"We are looking specifically into carriers’ injection of header information and the collection and use of information about their subscribers’ Internet activity. As you suggest, we will be considering the extent to which our rules and policies relating to consumer privacy, data security and transparency may be implicated,"
So, aside from another possible smackdown, this matter should be settled, right? Folks who want no super cookie should be able to opt out?

Unfortunately, no.

Note my privacy settings above. Well, in spite of them, at least one of those lines is still attaching the damned tracking header (X-UIDH) to my HTTP traffic:
Note the X-UIDH: header line
Here are sanitized capture files from the client and server which demonstrate the problem.
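For anyone repeating the check, here's a rough sketch. It assumes you have the request headers as they arrived at a web server you control, saved to a file (request_headers.txt is a hypothetical name). The helper name has_uidh_header is also made up.

```shell
# Hypothetical helper: reads raw HTTP request headers on stdin and
# succeeds if an X-UIDH header is present (header names are
# case-insensitive, so the match is too).
has_uidh_header() {
    grep -qi '^X-UIDH:'
}

# Usage:
#   has_uidh_header < request_headers.txt && echo "tracking header present"
```

The request has to be plain HTTP and has to travel over the carrier's network, not Wi-Fi.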

I spent a long time on the phone with Verizon folks today. The CSRs insist that all of my CPNI is safe, and there's nothing to worry about. Yet the injected header persists. My next step in the Verizon escalation chain involves licking envelopes because apparently the real tech wizards at Big Red have neither email nor telephones...

It's crazy that the escalation path through my Senators and the FCC looks easier to manage than the one Verizon is offering me.

Wednesday, April 8, 2015

HP Moonshot - Stuff I Wish I'd Known

Yeah, it looks just like this.
I've been working with HP Moonshot for some months now. It's a neat box with a lot of interesting features. There are plenty of press releases and (possibly paid-for) "reviews" available out there, but not much frank commentary from actual end users, and that's disappointing.

What is it?
Briefly, Moonshot is a miniature blade enclosure. I'm sure there are marketing folks who would like me to use different terminology, but it boils down to servers, Ethernet switches and power all rolled into one box.

There are some key differentiators between this enclosure and some of its larger cousins:

Low Power - The whole package is tuned for high density and low power. There are no monstrously fast multi-socket servers available, but the density is amazing. With 8 cores per node and 180 nodes per chassis we're talking about 300+ cores per rack unit!

Less Redundancy - Unlike the C-class enclosures which sport redundant "Onboard Administrator" modules, Moonshot has a single "Chassis Manager". I do not view this as a problem for two reasons: First, Moonshot is mostly suited for massively horizontally scalable applications which should tolerate failure of a whole chassis. Second, failure of a Chassis Manager module doesn't impact running services. Rather, it becomes impossible to reconfigure things until the fault is repaired.

Less Flexibility - Unlike C-class, which have lots of options in terms of server blade accessories and communications (flex-mumble, many NICs, FC switching, etc...), Moonshot is pretty much fixed-configuration. On my cartridges, the only order-time configurable hardware option is a choice between a handful of hard drive offerings. Until recently, HP didn't even support mixing of server cartridge types within a chassis. The only extra-chassis communication mechanism is via a pair of Ethernet switches. Only three switch models are currently shipping, and there's not much choice involved: Switch selection is largely driven by the cartridge selection, which is driven by your intended workload.

Limited Storage - The only storage option currently available is the single mechanical or flash SSD built into each compute node. Storage blades exist (I've seen them on "Moonshot University" videos), but they don't seem to be shipping products quite yet.

2D Toroid Mesh - Moonshot has a cool cartridge-to-cartridge communications mesh built into the chassis. This mesh is in addition to the Ethernet path between cartridges and switches. My cartridges cannot leverage this feature at all, but I'm sure it's wonderful for the right workloads on the right cartridges.

Low Cost - The list above looks kinda negative. I don't mean for it to be. For all of the fancy stuff you don't get with Moonshot, the pricing is pretty compelling.

My Hardware Configuration
I've got access to the following gear:
  • Moonshot 1500 Chassis w/management module
  • Moonshot m300 server cartridges, each with a single 8-core Atom C2750 SoC
  • Moonshot 45G switches (2 per chassis)
Physical Package
This box is big. Really big. It will fit into a standard 42" deep server cabinet, but only just barely, and only if the cabinet is set up with the front rails/posts in exactly the right spot.

Measuring from the front surface of the cabinet's mounting posts, a single Moonshot chassis protrudes about 2-3/16" toward the cold aisle door, and 38" (but it could be 39" - see below) toward the hot aisle door. The 38" dimension doesn't reflect the length of the chassis, but rather the rearward protrusion of the mounting rails and cable management hardware. The forward mounting posts need to be about 2 1/2" from the front of the cabinet (putting the server faceplate right up against the door mesh) for things to work out in a 42" cabinet.

The rails telescope in the way you'd expect. Their working depth range is 25-3/16" to 34-3/16".

The box is 4.3 RU tall. The rack hardware that comes with Moonshot aligns the bottom of the chassis with a rack unit, and puts the "extra" .3RU at the top of the chassis, never at the bottom. HP is proud to tell you that it's possible to get 3 chassis in 13RU, or up to 10 chassis / 1800 compute nodes per cabinet. That tight packaging is only possible with accessory part 681677-B21. For some reason, the accessory isn't available for purchase a-la-carte, but rather only when purchasing a pile of Moonshots along with HP cabinets (!?). HP have told me that the sale restriction on the 13U adapter will be lifted, but I don't know whether it has happened yet. Watch out for that.

The adapter (HP part 681677-B21) is essentially four 22-3/4" long (13 rack units) "C" channels with two sets of mounting holes. You mount these four channels inboard (behind) each of the four mounting posts in the server cabinet. One set of holes is spaced in the usual fashion. These fix the channels to your cabinet. The other set of holes accept the Moonshot rails and are offset so that the Moonshot chassis are packed tight against each other.

13RU Adapter kit makes the whole package about 1" bigger

Because the 13U adapter rails mount inboard of the cabinet posts, they don't impact the location of the chassis in the cabinet at all. The cold side protrusion of the chassis is still about 2-3/16". They do, however, move the server's rails and cable management stuff about an inch toward the hot side of the cabinet because the rails are now attached to the "C" channel, rather than the front of the cabinet. This increases the rearward protrusion from 38" to 39" and makes the whole package a bit over 41" long. It'll fit in a 42" cabinet, but only just.
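The arithmetic behind that "bit over 41 inches" is just the two protrusions added together. A quick sketch using the measurements above (package_depth is a hypothetical name):

```shell
# Total depth = front protrusion + rearward protrusion, both measured
# from the front face of the mounting posts (figures from the text above).
package_depth() {
    # 2.1875 is the 2-3/16" cold-side protrusion
    awk -v front=2.1875 -v rear="$1" 'BEGIN { printf "%.4f\n", front + rear }'
}

package_depth 38   # without the 13U adapter: 40.1875
package_depth 39   # with the 13U adapter:    41.1875
```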

Management Network
The Chassis Manager module has two 1000BASE-T connectors ("iLO" and "Link") and a serial port (115200,8,n,1). By default, the "Link" connector is disabled. The intention is for you to daisy-chain the "iLO" and "Link" interfaces of several chassis, hanging them all from a single management switch port. These interfaces lead to an intra-chassis bridge which does not run spanning tree. You can enable the link connector and attach to two management switch ports if desired. The management switch ports will see each other's BPDUs, causing one of them to block with this configuration. You'll reach the Chassis Manager and the switch(es) or switch stack through these cables. The cartridges (mine anyway) do not have individual iLO addresses. It is possible to manage the switches in-band as well, in which case only the Chassis Manager uses these cables.

Virtual Serial Ports
The switches and the servers each have a "virtual serial port" which can be accessed from the Chassis Manager, effectively turning the Chassis Manager into a serial terminal server. There are some things to know here:
  • The Chassis Manager might be supporting 180 compute nodes and 2 switches. That's 182 connections at 115200bps each, almost 21Mb/s in aggregate. There's a possibility that it can't keep up and will lose characters here and there.
  • A maximum of 10 SSH VSP sessions are supported at a time. I suspect this is related to the point above.
  • Compute node serial ports are available via IPMI 2.0-style Serial Over LAN. As far as I can tell, that is not possible with the switch console VSPs.
  • The switches don't send everything to the CM's serial interface. Some early boot time stuff (including the ability to interrupt boot) goes only to the physical serial port on the switch uplink module.
  • I initially planned to use the VSP console, but have since switched to the physical console port.
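The 21Mb/s figure in the first point is a worst case: every virtual serial port running flat out at once. A sketch of the sum (vsp_aggregate_bps is a hypothetical name):

```shell
# Worst-case aggregate serial bandwidth if every VSP ran at line rate.
# Arguments: node count, switch count, bits per second per port.
vsp_aggregate_bps() {
    nodes=$1
    switches=$2
    bps=$3
    echo $(( (nodes + switches) * bps ))
}

vsp_aggregate_bps 180 2 115200   # 20966400 bit/s, just under 21Mb/s
```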
Switching OS
The 45G and 180G switches run Broadcom FASTPATH, while the 45XGc runs Comware. I have not had a good experience with the FASTPATH OS on these switches. Apparently lots of customers are running it without issue, so this may be a YMMV thing.

The Moonshot chassis supports up to four power supplies. Some combinations of cartridges and switches require three power supplies for normal operation, and all four power supplies for N+1 redundancy. This is a problem because most server cabinets (and the power distribution equipment upstream of the cabinet) are set up to offer only A/B redundant feeds. There's no way to supply power in a fashion that meets the chassis 3+1 requirement without getting into hokey stuff like automatic transfer switches. I asked HP about this, and they recommended an in-cabinet UPS or two. Apparently the recommendation was not a joke.

Cartridge Mix/Match
Until recently, HP would only sell cartridges in bundles of 15, and only supported homogeneous cartridge configurations. Mix/match of cartridge types within a chassis was not allowed. For the workload in my environment, it'd be handy to have a small handful of Xeon-based cartridges blended in with the Atom population, and HP tell me that the restriction has been lifted! Cartridges are now available a-la-carte, and we're free to mix/match with the following constraints in mind:
  • Some cartridges have four independent compute nodes onboard. These cartridges present 4 network interfaces to each switch, so they require a 180-port switch (model 180G is the only option at this time).
  • Some single-node cartridges have 10Gb/s-capable NICs. These interfaces will run at 1Gb/s, so they can be used with any of the three currently available switches.
  • The cartridge-facing interfaces on the 10Gb/s 45XGc switch can do 1Gb/s or 10Gb/s, so they'll support any single-node cartridge.
  • HP recommend that cartridges with mechanical drives be installed toward the front of the chassis so that they'll get the coolest airflow.
This box is super noisy when the fans get excited. There's an optical sensor on the switch modules which forces the fans to full speed when the lid is removed. I found that a shiny thing set on top of the switch fools the sensor into believing the lid is in place, and the fans calm down. Probably not great for cooling, but handy if the phone rings while the lid is off of the enclosure.

Uplink Modules
Uplink ports on the 6xSFP+  uplink modules in my chassis operate at 1Gb/s or 10Gb/s, and they do not require vendor-coded transceivers. Several transceivers are officially supported by HP, the most affordable of which come from the Procurve family.

The QSFP-based uplink modules support a cable which breaks out into four SFP+ interfaces. The SFP+ interfaces on these breakout cables do not support 1Gb/s operation.

The switch console ports integrated in the uplink module are pinned like Cisco consoles, and operate at 115200,8,n,1.

The m300 cartridges don't have many knobs or levers. There's no graphical console. It's serial only. The only boot-time prompts are to access the boot ROM menu on the onboard NICs, and the only options there relate to whether you'll see the "press CTRL+S" message, and for how long. That's it.

From the Chassis Manager's web interface (or via IPMI) you can power cartridges on and off, control whether they boot from local HDD or via PXE, enable/disable WoL support, and control serial console redirection. That's about the extent of what you can do cartridge configuration-wise. There's no BIOS with endless confusing options here. It's dead simple.

Update 5/15/2015
I've got an RMA m300 cartridge on my desk, and just noticed that it has a 42mm M-keyed M.2 slot. This slot isn't mentioned in any quick specs / data sheet / compatibility guide that I can find, but HP tell me that it's possible to order m300 cartridges with the following M.2 storage options:

HP Moonshot 64G SATA VE M.2 2242 FIO Kit
HP Moonshot 32G SATA VE M.2 2242 FIO Kit

Friday, April 3, 2015

Exporting RSA keys from Cisco ASA: Harder than it should be

Unlike Cisco IOS routers, which by default don't allow RSA private keys to be exported from NVRAM, Cisco ASAs don't protect private keys. But there's no command (of which I'm aware) to directly export the keys either.

Sometimes you need to squirrel away those keys. You can do it by getting a certificate that uses the keys, then exporting a certificate bundle (with private key included). Here's how.

First, create a key:
 crypto key generate rsa label mykey modulus 2048  

Next, create a trustpoint which references the key, and generate a self-signed certificate:
 crypto ca trustpoint throwaway  
  keypair mykey  
  enrollment self  
 crypto ca enroll throwaway noconfirm  

Now the throwaway trustpoint has a certificate. Export that certificate to the terminal.
 no terminal pager  
 crypto ca export throwaway pkcs12 <passphrase>  

Save the blob of text including the begin/end lines. The blob is a PKCS12 bundle encrypted using the passphrase above and then base64 encoded. Be sure to save the encryption passphrase.
 -----BEGIN PKCS12-----  
 -----END PKCS12-----  

We no longer need the certificate or the throwaway trustpoint in which it's stored. Kill it. The private key will survive.
 no crypto ca trustpoint throwaway noconfirm  

The easiest way to get the key onto an ASA is to import the PKCS12 blob using the passphrase. Importing the certificate will create 3 things on the ASA:
  • The RSA keypair
  • The certificate
  • A trustpoint to hold the certificate
The keypair will be named the same as the trustpoint. To make the keypair named 'my-imported-key', import it like this, pasting in the text blob when prompted, then typing 'quit'.
 crypto ca import my-imported-key pkcs12 <passphrase>  

Now the key is available for use, but there's a useless certificate and trustpoint as well. Kill those off just like before. The key will survive.
 no crypto ca trustpoint my-imported-key noconfirm  

Another option is to extract the key from the PKCS12 bundle using openssl on some other device. First, save the text blob without the BEGIN/END lines to a file. I'd probably call it throwaway.p12.base64. Then, it needs to be base64-decoded, and parsed from a pkcs12 certificate bundle into a PEM-formatted private key. The private key output contains both the private and public keys.
 base64 -D throwaway.p12.base64 | openssl pkcs12 -nocerts -nodes -password pass:<passphrase>   
 MAC verified OK  
 Bag Attributes  
   localKeyID: 00 00 00 01   
   friendlyName: cn=lab-asa-1,  
 Key Attributes: <No Attributes>  
 -----BEGIN PRIVATE KEY-----  
 -----END PRIVATE KEY-----  

The example above was run on MacOS, where the base64 binary has BSD heritage.  On Linux, use -d rather than -D with the GNU flavor of base64.
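To sidestep the -d/-D difference in scripts, a small wrapper can probe for the flag the local base64 accepts. This is a sketch; b64decode is a hypothetical name, and it reads from stdin only.

```shell
# Hypothetical helper: base64-decodes stdin with whichever flag the
# local base64 accepts (-d for GNU, -D for older BSD/macOS builds).
b64decode() {
    # Probe with a tiny known-good input; the probe has its own stdin,
    # so the function's stdin is left untouched for the real decode.
    if printf 'QQ==' | base64 -d >/dev/null 2>&1; then
        base64 -d
    else
        base64 -D
    fi
}

# Usage:
#   b64decode < throwaway.p12.base64 | openssl pkcs12 -nocerts -nodes -password pass:<passphrase>
```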