Tuesday, February 1, 2011

Load Balance Until It Hurts

A few months ago, Ethan wrote a great article detailing a basic network design consisting of layer-3 distribution switches and layer-2 access switches.

Included in that design is the common strategy of splitting STP root bridge duty between the two distribution switches in order to balance traffic across access layer uplinks:  Distribution switch A is configured as STP root for odd-numbered VLANs, and STP root for even-numbered VLANs goes to distribution switch B.
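
For reference, the odd/even split might be configured along these lines. This is a minimal sketch, not Ethan's actual configuration: the VLAN numbers (11 through 14) are an assumption chosen for illustration.

```
! Distribution A: root for odd VLANs, backup root for even VLANs
! (VLAN numbers are illustrative assumptions)
spanning-tree vlan 11,13 root primary
spanning-tree vlan 12,14 root secondary

! Distribution B: the mirror image
spanning-tree vlan 12,14 root primary
spanning-tree vlan 11,13 root secondary
```

With this in place, each access switch forwards odd VLANs on its uplink toward A and even VLANs on its uplink toward B, so both uplinks carry traffic until one of them fails.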

In the comments to Ethan's article I said that, while the strategy is perfectly valid and commonly implemented, it's definitely not a one-size-fits-all scenario.  About half of my customers choose to not balance traffic in this manner because they:
  • Can easily pay for twice the bandwidth they truly require
  • Expect required bandwidth to be available at all times, even during failures
  • Don't want to be surprised by a sudden reduction in available bandwidth

Balancing VLANs across uplinks will result in reduced transit capacity during a failure.  Some environments would rather take advantage of extra capacity when it's available (most of the time), while others demand consistent network behavior.

Ethan didn't mention it, but the FHRP mechanism usually gets balanced in lockstep with the STP root.  The minimally-articulated result looks something like this:
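
In configuration terms, the alignment on Distribution A might be sketched as follows. The VLAN numbers, HSRP group numbers, priorities, and IP addresses are all assumptions made for illustration; none of them come from Ethan's article.

```
! Distribution A: HSRP active for odd VLANs (where A is also STP root)
! All addresses and group numbers below are hypothetical
interface vlan 11
 ip address 10.0.11.2 255.255.255.0
 standby 11 ip 10.0.11.1
 standby 11 priority 110
 standby 11 preempt
!
! Even VLANs: A is the standby gateway, so it gets the lower priority
interface vlan 12
 ip address 10.0.12.2 255.255.255.0
 standby 12 ip 10.0.12.1
 standby 12 priority 90
```

Distribution B would carry the mirror image: the higher priority and preemption on even VLANs, the lower priority on odd VLANs.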



When discussing this design, lots of network folks will tell you something like:
You have to put your HSRP primary and STP root on the same box!
My ears perk right up when a design detail makes the transition from "It's a good idea to X because of Y" into "You have to Z."

"Have to?"  There's no shortage of engineers who believe this is an absolute requirement.  Sure, it's nice, but is it required?  Meh.  This particular bit of dogma has perplexed me for a while.  I think the goal here relates to an attempt to avoid the extra east/west bridging hop between distribution switches.  It's a worthy goal, but it:
  1. Only impacts outbound traffic.  Inbound traffic has a 50/50 chance of being routed in by the "wrong" distribution switch, and crossing the east/west link anyway.  Given the inbound-heavy traffic patterns on a typical desktop LAN, it seems like misplaced focus.
  2. Guarantees the appearance of the nasty asymmetric-routing / unknown-unicast-flooding problem:  Where will the "wrong" distribution switch bridge your traffic if he never hears from you?  The fix is easy, but rarely implemented.  This bugger can be a much bigger problem than abusing the cross-link, especially in a mixed-speed environment with shared ASICs and buffers.
I'm not saying "don't balance your traffic", nor am I saying "don't align STP and FHRP."  I'm saying: "Know your network, know your traffic, and question authority."
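
One commonly cited remedy for the unknown-unicast-flooding problem is to keep the ARP timeout at or below the MAC address table aging time, so the switch re-ARPs for a host (and relearns its MAC address from the reply) before the bridging entry ages out. Whether this is the fix referred to above is an assumption. The sketch below also assumes Catalyst-style defaults of 14400 seconds for ARP and 300 seconds for MAC aging; the VLAN number and the 270-second value are illustrative only.

```
! On each distribution switch, for each routed VLAN interface
! (270 s is an assumed value, chosen to sit below a 300 s MAC aging time)
interface vlan 11
 arp timeout 270
```

The alternative is to go the other way and raise the MAC aging time to match the ARP timeout. Either approach keeps the two tables from drifting out of sync.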

Okay, having covered these facets of a popular design, it's time to get to the point.  When does load balancing start to hurt?  It hurts when we decide to use GLBP instead of HSRP or VRRP.  GLBP is sexy because it allows multiple active gateway routers.  In a small office with just two routers and two WAN links this is great:  Outbound user traffic can get proportionally balanced across WAN links very simply.  But what about in our ECMP campus?
  • There's no load sharing advantage because that workload typically gets balanced by distributing HSRP priority among VLANs.
  • There's no Distribution->Core advantage because CEF on the (already balanced) distribution switches will balance upstream traffic.
There are real L2/L3 enterprise networks like this out there.  They have carefully-groomed STP root bridges using carefully-prioritized GLBP.  The GLBP priorities are tweaked to coordinate them with the preferred STP root.  GLBP preemption is enabled to make darn sure that the intended switch is the live Active Virtual Gateway.  Like this:




Distribution A:

interface vlan 11
 glbp 0 preempt
interface vlan 12
 glbp 0 priority 90
interface vlan 13
 glbp 0 preempt
interface vlan 14
 glbp 0 priority 90

Distribution B:

interface vlan 11
 glbp 0 priority 90
interface vlan 12
 glbp 0 preempt
interface vlan 13
 glbp 0 priority 90
interface vlan 14
 glbp 0 preempt

This hurts because it runs directly counter to the load-balancing, hop-avoiding philosophy underpinning this design.  The only thing that's being tuned here is the location of the GLBP AVG (the box that answers ARP queries).  The forwarding workload still gets split between both distribution switches regardless of which one is the AVG, so 50% of outbound traffic (the only thing we can control easily) is now forced to make the extra hop between distribution switches.  The priority and preemption tuning here just amounts to extra typing.  It really doesn't make any difference which switch answers ARP queries in this scenario, and the design guarantees higher latency and link utilization than the more common HSRP configuration.

Sure, it works fine.  But it's a lot of extra typing, and results in ever-so-slightly worse performance.
