Wednesday, July 21, 2010

Nexus Data Center Switching Design - Part 1

This post details how I might have built a Cisco Nexus 5000/2000 switching environment last year. All of the pricing and product availability information is late 2009 vintage.

Recently, I was running through this same design exercise again with
new fabric extenders and some updated requirements in mind. Subtle differences in the newest Nexus gear changed the result much more than I expected. This series of posts will detail the designs, with the intent of highlighting the new 2248TP fabric extender. A follow-up post will detail the new gear and how it changes everything.

Last Year's Design

  • Several rows of top-of-rack 1Gb/s copper switching
  • Full redundancy, including redundant supervisors (in the FEX world, we'll use vPC FEX uplinks to satisfy this requirement)
  • Between 3 and 9 links on each server (one link on each server is an iLO module)
  • Multiple 10Gb/s upstream links from each row
  • Copper stays in-cabinet wherever possible
  • Limited intra-row cabling is okay
  • Keep an eye on the budget
The resulting design splits the environment into 6-rack rows, with the following gear in each row:
  • Two Nexus 5020s (rack 3 and rack 4)
  • Twelve Nexus 2148T (two per rack)
  • Six Catalyst 2960s (one per rack)
  • One Lantronix SLC
The result looked something like this:

Inter-row connectivity consists of:
  • 216 multimode fiber pairs (36 pairs in each rack). These are mostly used by servers for storage, but also by the Nexus 5020s for uplink to a Nexus 7000 vPC pair. Open cassette positions in each panel allow for growth.
  • One 24-pair copper bundle (an AMP MRJ21 cable) terminates on the copper panel in the third rack of every row. There was no good way to avoid inter-row copper altogether because of the Nexus mgmt0 interfaces and the 8 console connections in each row. It's a 6-port panel with 5 connections in use: each Nexus 5020 uses one, and the Lantronix uses two for Ethernet plus a third for its own console. Lantronix interface redundancy is buggy, so only one of the NICs works correctly. More about that in a future post.
Inter-rack connectivity within each row consists of:

  • 48 TwinAx cables of various lengths connect the FEXes to the Nexus 5020s.
  • 6 TwinAx cables connect between the Nexus 5020 pair.
  • vPC copper uplink from the Catalyst 2960s to the Nexus 5020 pair using GLC-T modules (because the first 16 ports on the Nexus can do 1Gb/s).
  • Copper management interfaces on the 5020s and Lantronix connect to a central management net via the patch panel in rack 3.
Every server NIC and HBA is patched straight to the top of rack Nexus, Catalyst or fiber panel for SAN. Server cabling within the racks is very clean and straightforward.

Here's how some of these design elements came together:
The Nexus 5020s are in-row because of the pricing of 10Gb/s twinax cables. Last year, building a 10Gb/s link on multimode fiber required $1800 SR transceivers... On both ends of the cable! The price of those transceivers has since come down to $1500, but a twinax cable doing the same job lists for as little as $150. The problem with twinax is their length limitation: 5m maximum. Moving the 5020s out of close proximity to the fabric extenders would cost around $160,000 using SR transceivers!
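The cost gap described above is easy to reproduce. This is a back-of-the-envelope sketch using the list prices quoted in this post, and it assumes the 54 in-row twinax links described later (48 FEX fabric links plus 6 between the 5020 pair); the exact premium depends on whether you use the old $1800 or newer $1500 transceiver price, but either way it lands in the same neighborhood as the $160,000 figure:

```python
# Rough comparison of in-row twinax vs. SR optics for the 10Gb/s links,
# using the (late-2009) list prices quoted in the post.
SR_OPTIC = 1500       # $ per SFP-10G-SR; a link needs one on each end
TWINAX = 150          # $ per twinax cable; one cable is one complete link

links_per_row = 48 + 6    # FEX fabric links + 5020 peer links (per the post)

sr_cost = links_per_row * 2 * SR_OPTIC
twinax_cost = links_per_row * TWINAX

print(f"SR optics per row: ${sr_cost:,}")       # $162,000
print(f"Twinax per row:    ${twinax_cost:,}")   # $8,100
print(f"Premium to move the 5020s out of row: ${sr_cost - twinax_cost:,}")
```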

The 6-rack row length came from a design requirement and a FEX limitation. The requirement was that "everything be dual-sup'ed". With redundant supervisors in a chassis switch, all line cards and ports continue to operate after a supervisor failure. The closest analogy in the FEX environment is to uplink the FEXes to a vPC Nexus 5020 pair. I don't love the vPC FEX layout, but let's assume that it's a layer-8 requirement that can't be worked around. Both 5020s in each row connect to all 12 FEXes. And that's the limitation: a Nexus 5000 can only address 12 FEX modules. There's a rumor that the limit may be raised.
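For readers who haven't configured this before, the dual-homed FEX arrangement looks roughly like the sketch below. This is illustrative only: the domain number, keepalive addresses, and interface numbers are made up, and the same FEX and port-channel configuration must be repeated on both 5020s in the vPC pair.

```
! Sketch only -- not the exact config from this build.
feature vpc
feature fex

vpc domain 10
  peer-keepalive destination 192.168.0.2 source 192.168.0.1

interface port-channel1
  switchport mode trunk
  vpc peer-link

fex 100
  description rack-1-top-fex

! Fabric ports up to the FEX (ports 17+ -- the first 16 are the
! 1Gb/s-capable ports mentioned below)
interface ethernet 1/17-18
  switchport mode fex-fabric
  fex associate 100
  channel-group 100

interface port-channel100
  switchport mode fex-fabric
  fex associate 100
  vpc 100
```

With this repeated on the second 5020, each FEX stays up through the loss of either 5020, which is as close as the FEX world gets to a redundant supervisor.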

The 2960s are here because iLO interfaces on the server do 10/100 Mb/s, and the 2148T can only do gigabit. The iLO ports need to plug in somewhere! Also, the port density was getting kind of high: As it is, each rack can support at most 12 of those 9-NIC servers. Fortunately the average NIC density is less than 6 NICs per server, so the racks will run out of space shortly before the network runs out of ports.
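The "at most 12 servers" figure falls out of simple port arithmetic. One assumption of mine that the post only implies: a "9-NIC" server burns one 10/100 iLO port on the Catalyst 2960 and eight gigabit ports on the FEXes.

```python
# Sanity check of the per-rack port math.
fex_ports_per_rack = 2 * 48      # two Nexus 2148Ts per rack
gig_links_per_server = 9 - 1     # every link except the 10/100 iLO

max_servers = fex_ports_per_rack // gig_links_per_server
print(max_servers)               # -> 12, matching the figure above
```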

I'd considered ways to reduce the iLO switch management footprint by using the 4506-E bundle (great pricing on that bundle) at the center of the row, or a Catalyst 3750 stack braided across the top of the row:
The added expense of those solutions just didn't make sense. The Catalyst 2960s (that's a plural -- not the new 2960-S) are so cheap, and their configuration so simple, that managing 48 little switches for 10/100 connections seemed worthwhile. Note that while the 2960s are linked in-row to the Nexus 5020s, the WS-C2960-24TC-S has dual-purpose uplink ports. Should the need to free up 5020 ports ever arise, the 2960s can be homed elsewhere via the in-rack multimode panels.

The single copper panel would have been nice to avoid, but it just wasn't possible without resorting to hokey media converters for the Nexus and Lantronix management ports. During an early stage of the build, I had these copper links running to the 2960s, but that creates a chicken-and-egg problem (the egg came first, BTW): I planned to use the Lantronix SLC to put the initial configuration on all of the other devices, but couldn't get to the SLC until the Nexus 5000 and Catalyst 2960 were configured! The cabling folks eventually showed up with the copper, but not until after I'd spent a long time dangling from a blue cable.

Each 8-port Lantronix SLC unit is fully populated: 2 Nexus 5020s and 6 Catalysts. Additionally, each SLC's own console port (for managing the SLC) is patched back to a central SLC. The 24-pair cable feeding the copper patch panel includes enough pairs to carry all console and management traffic, but it would have required funny wiring to handle both Ethernet and console connections. An in-row SLC is going to be easier to manage in the long run.

The Nexus 5000 and 2000 are configured with front-to-back airflow, which in this case means that the exhaust is on the same end as the interfaces, and the intake is on the other end. That's great for the Nexus 5000: it's 30 inches long, so it inhales air from the cold side of the rack. The 2148T fabric extender, on the other hand, is only 20 inches deep, putting its air intake in the middle of the rack. The Catalyst 2960 draws air in on its sides and exhausts backwards into the middle of the rack. The Lantronix SLC does something equally unfortunate, though the specifics elude me. So, it's a bit of a challenge to keep the FEX, Catalyst and SLC cool. Putting the first server at the top of the rack (rather than the bottom) will help isolate this gear, and will prevent hot server exhaust from pooling in the open cavity at the top of the rack. I'd hoped that the FEX fans would move enough air to keep the gear up there from getting too hot, but they don't. Consider your cabinet's airflow requirements carefully, and consider adding baffles to prevent hot air from reaching the FEX intake. Blanking panels on the cold side are no help, and blanking panels on the hot side might not be enough if air can rise up the sides of the cabinet between the front and rear mounting rails.

Wiring Detail
Here's how the Nexus gear in each row is wired together.

Get out your wallet
This is an expensive build, but cost was one of the design drivers. For the pricing exercise and comparison with the new products, it's important to know that these rows uplink to a Nexus 7000 via SR optics. The 7000 is populated with N7K-M132XP-12 cards, and they're oversubscribed 2:1.

A single row, including the downlink optics installed in the core, consists of the following parts, which together list for $188,350:

Part Number       Description                                                  Quantity
N5020P-N2K-BEC    Cisco Nexus 2148T and Nexus 5020 Bundle with Twinax cables   2
SFP-H10GB-CU1M=   10GBASE-CU SFP+ Cable 1 Meter                                14
SFP-H10GB-CU3M=   10GBASE-CU SFP+ Cable 3 Meter                                16
SFP-10G-SR=       10GBASE-SR SFP Module                                        4
WS-C2960-24TC-S   Catalyst 2960 24 10/100 + 2 T/SFP LAN Lite Image             6
GLC-T=            1000BASE-T SFP                                               12

Additionally, we'll need 4 N7K-M132XP-12 cards to install in the core switches. These list for $70,000 each.

Oh, and the Lantronix SLCs cost around $1,200 each.

Altogether, this build provides 1728 RU of server space and 4608 gigabit switchports. It's oversubscribed 2:1 at the core, 6:1 at the 5020 layer, and 1.2:1 at the FEX (assuming all ports get plugged in). Spanning tree isn't blocking any links.
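The capacity totals can be reconstructed from the per-rack numbers. The post never states the row count outright; 8 rows of 6 racks is my inference, as is the 36 RU of server space per rack, but those assumptions make both totals line up exactly:

```python
# Reconstructing the capacity figures (8 rows and 36 RU/rack are inferred,
# not stated in the post).
rows, racks_per_row = 8, 6
fex_ports_per_rack = 2 * 48        # two 2148Ts per rack
server_ru_per_rack = 36            # space left after the network gear

racks = rows * racks_per_row
print(racks * server_ru_per_rack)  # -> 1728 RU
print(racks * fex_ports_per_rack)  # -> 4608 gigabit ports

# FEX oversubscription: 48 x 1Gb/s server ports vs 4 x 10Gb/s fabric links
print(48 / (4 * 10))               # -> 1.2
```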

List price: $1,796,400 including the Nexus 7000 parts, but not the Nexus 7000 itself.
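The bottom line reconciles with the per-row figure, again assuming the 8-row count inferred above (the post never states it):

```python
# Reconciling the $1,796,400 list price (8 rows is my inference).
rows = 8
row_list = 188_350           # per-row total from the table above
n7k_cards = 4 * 70_000       # N7K-M132XP-12 line cards for the core
slcs = rows * 1_200          # one Lantronix SLC per row

print(rows * row_list + n7k_cards + slcs)   # -> 1796400
```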

1 comment:

  1. You should price out a Force10 solution for your next project. Much cheaper and linerate.