Posted by Jason | Posted in Technology | Posted on 04-07-2008
It's been nearly two years since we ventured into the wonderland of replacing our Alteon gear with the Sun N1216. It was a big risk because load balancers are interlaced tightly with our multi-phased mail logistics architecture. To say the least, we have not been disappointed. The Sun N1216 series is by far the best load balancer we've ever worked with. Almost limitless power (~3Gbps) for a $25K list price. (Its big brother the N2120 was the Bugatti Veyron of the load balancer world.) But more than power, the N series provides an incredibly elegant and powerful virtualization that is irreplaceable. It enabled us to reduce what were multiple pairs of Alteons down to a single pair of N1216s running multiple virtual load balancer instances.
But what blew us away was a very simple feature we'll call “assignable virtual IP address (VIP)”. Assignable VIP functionality allows you to create two virtual load balancers (internal and external) with no routing in common, and attach your real servers to one (internal), while advertising the VIP on the other (external). Because there is no routing path between them (all traffic hitting the VIP is essentially memory copied to the internal load balancer for SLB processing), no servers sitting in your DMZ can compromise or talk directly to your real servers. They simply can't talk to something that there's no routing path to. As a result, you have a separate clean management path to your real servers that is entirely inside your trusted network, and your topology gets incredibly simple (no ACLs!). It is by far the best application of virtualization in a network device we've ever seen.
However, the halcyon days came to an end in April of 2007 when we were informed that Sun intended to EOL the entire N series and shut down the load balancing group they had acquired with Nauticus. Given that there were no other products on the market in April of 2007 that could even remotely drop seamlessly into our new topology, we decided to wait and see what Sun might do next.
A year later not much has changed, and Sun still doesn't have a coherent strategy on load balancing to replace the N series. While our units would continue to be supported for the next 5 years, there won't be software updates, and definitely no updates to the phenomenal FPGAs that make the box scream. There are flaws in the N series that need bug updates…things that would be livable if they were going to be fixed. But in a production environment no bug fixes is simply not an acceptable strategy. So we're back in wonderland…
To cut to the chase, we talked with all the major vendors and settled down to F5, Citrix/NetScaler, and Cisco. Only Cisco, with their ACE platform, has any virtualization story whatsoever. Everyone else has no virtualization plans that they're telling their sales dudes about. All 3 can cobble together an inelegant and obfuscated configuration to allow us to maintain our topology and security stance, but none can do the “assignable VIP” magic that made Sun/Nauticus such an amazing application of virtualization and so clean to administer.
In the middle of all this, a trusted friend at Sun recommended we take a look at a new load balancing company, A10 Networks. Now A10 doesn't have virtualization in their platform today, and they definitely don't have “assignable VIP”. But they have a story and roadmap that will make any Sun/Nauticus customer get a big silly grin on their face. You'll have to talk to A10 to find out the particulars.
What does A10 have? A phenomenal architecture on paper, and sane licensing. While FPGAs are what made the Nauticus design scream, being entirely FPGA and ASIC driven was also what drove the cost of bug fixes up. It was difficult for them to add L4/L7 features at the same rate that F5 and others were, because it usually required a modification of the FPGA layout. Enter what appears to be a brilliant design compromise and excellent capitalization on the Intel/AMD race for core count. The A10 AX2200 and above have L2/L3 ASICs, SSL ASICs, and an L4/L7 traffic director FPGA. The FPGA dynamically assigns new connections to each of the box's 4-8 Xeon cores for full L4/L7 processing. Also, each core operates independently from the others. That is to say, there are no contention or synchronization penalties for using more cores. Add more connections and the traffic FPGA evenly distributes them among the cores, and stitches the results back together for the client. Near perfect parallelization. All of the heavy L4/L7 lifting is done entirely in software on generic Xeon cores. This allows A10 to quickly add the complex features (like F5) that would have required an FPGA modification on the Nauticus gear. The excellent parallelization model ensures the performance hit encountered by using generic CPUs instead of FPGAs can be made up for linearly by adding buckets of cores. The FPGA is therefore much simpler in design than what Nauticus required. But as I said, this is all on paper.
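For the curious, here is a toy sketch of the "traffic director plus independent cores" idea in Python. To be clear, this is our own illustration and not A10's implementation: we don't know how the FPGA actually picks a core, so the CRC32 hash of the connection 5-tuple in `pick_core` is purely an assumption, and the `FiveTuple`, `Core` and `dispatch` names are made up for the example. The point it shows is that once every packet of a connection lands on the same core, each core can keep a private connection table with no locks or shared state.

```python
import zlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Identifies one connection (hypothetical type for this sketch)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


def pick_core(flow: FiveTuple, num_cores: int) -> int:
    """Map a connection to a core index with a stable hash.

    Assumption: a plain hash-mod-N policy stands in for whatever the
    real traffic director does.
    """
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % num_cores


class Core:
    """One worker with its own private connection table (no shared state)."""

    def __init__(self, index: int) -> None:
        self.index = index
        self.connections = {}  # flow -> number of packets seen

    def handle(self, flow: FiveTuple) -> None:
        self.connections[flow] = self.connections.get(flow, 0) + 1


def dispatch(packets, cores) -> None:
    """Play the traffic director: steer each packet to its connection's core."""
    for flow in packets:
        cores[pick_core(flow, len(cores))].handle(flow)


# 8000 made-up client connections to one VIP, three packets each.
flows = [
    FiveTuple(f"10.0.{i % 250}.{i % 200 + 1}", 1024 + i, "192.0.2.10", 25, "tcp")
    for i in range(8000)
]
cores = [Core(i) for i in range(8)]
dispatch(flows * 3, cores)

for core in cores:
    print(f"core {core.index}: {len(core.connections)} connections")
```

Run it and you should see the connections spread across the eight cores, with no connection ever appearing in two tables. Change the `range(8)` to add cores and the per-core share drops accordingly, which is the "make it up with buckets of cores" argument in miniature.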
However, it is an equally seductive design to what Nauticus created. F5, NetScaler and Cisco all have L2/L3 ASICs in their boxes but nothing really significant in terms of hardware acceleration in the L4/L7 areas (F5 does have their L4 ASIC that does provide good acceleration of basic L4 TCP termination load balancing). So we've decided to leap again and take a chance with A10. Also, A10 includes Global Server Load Balancing for free and does not engage in F5's hideous practice of licensing HTTP compression and SSL offload capacity by the MB/s…oh and A10 has TCL-based aRules.
So we eagerly awaited the FedEx guy on Thursday to deliver our new pair of AX2200s for validation testing. With a 100lb thump they landed solidly on our testing table, and a couple flicks of a box cutter later…
What came out of the box looked like the unholy progeny of a Sega Master System and the portholes from a Buick Roadmaster. Needless to say she ain't a looker. Frankly, at this price level the gear should be drop-dead sexy. Yes, it may be shallow, but it's a requirement when you're trying to justify an $80 grand list price. To add insult to injury, the portholes don't serve any utilitarian purpose like cooling…they're actually a solid piece of plastic. As a counterexample, the N2120 and Sun's standard 2U server design are phenomenal:
They exude the simplicity and power that's concealed inside…a little glimpse to the upper echelons of what you're spending the company's hard-earned bananas on. But what the AX2200 gets right is spot-on build quality. It's solid with no rattles. The power supplies slide smoothly and easily. Re-seating a supply gives a firm click and solidly locks it from removal. Overall, it's downright Teutonic in construction. Sort of like an older Audi S8, built to run forever like greased lightning, but not much to look at. A10 could take Audi's cue and start paying attention to creating looks that match the engineering.
One very nice feature of the AX2200 for a load balancer is the hot swap fan tray. Not having to spirit the whole unit back to Boston because a fan went south is a nice change from the N1216. Also, the interior build quality is just as clean and professional as the exterior components. Hard edge connectors and system board tracings are used almost entirely, with nearly no ribbon cables cluttering up the interior. Only nit is the front management NIC is run to the motherboard via an RJ-45 cable routed to the back. Don't let the server exterior of this box fool you: this is a purpose-built system with specialized ASICs and FPGAs inside.
As with any new appliance, this one has a couple of strange design foibles that go deeper than its looks. First, the box vents in from the sides and exhausts out the back. In that regard, it's neither quite at home in a rack with your servers nor with your switching and routing gear. The strange intake flow means if you rack the AX2200 above your side-to-side vented switching gear, you'll likely overheat the AX2200 as it sucks in the switches' side exhaust air. Luckily, we have some Juniper kit that vents front to back, so we will likely rack the AX2200s with them. Also, the locking drive carriers are a bit frustrating. It's a nice feature that they can be locked, but inserting the key with any more force than a gnat breaking wind pops out the removal handle. It's obviously an off-the-shelf carrier that no designer actually tried before spec'ing it out of the part book.
On the positive side, the serial connection is on the front and is a Cisco-style RJ-45. Yippeee! No RS-232-to-rollover adapters to hook it into our Dominion SX! It may seem like a small thing, but it really means fewer parts to lose, break and stock at the data center. I wish I could say they had the foresight to also put a sticker on the front with the box's serial number…but unfortunately not so much.
You'd better note the serial number before you rack the AX2200, otherwise it's going to be crane and strain time to see the sticker on the bottom of the unit. Scratch that…the serial number is conveniently placed on the rear left of the unit as well. It's not as easy to see as the front given the… Also, they did show the company's Foundry Networks pedigree by shipping a very Foundry Networks-esque self-test sheet with the unit:
Kudos for the self-test paperwork. If you keep that on file, you can probably forgive the serial number sticker's ill-fated position on the unit's underside.
Overall, first impressions…the construction and major design decisions are terrific. This box looks the part internally, and feels the part externally as a major piece of core infrastructure. Next step is to rack her and beat the heck out of her with our test rig: a screaming UltraSPARC T2. Will post more soon on how the AX2200 stands the scorching…