Published: Designing Scalable Web Applications
The first batch of the latest materials for my Designing Scalable Web Applications course has been published on my free content web site.
So You Need ISSU on Your ToR switch? Really?
During the Cumulus Linux presentation Dinesh Dutt gave in the Data Center Fabrics webinar, someone asked an unexpected question: “Do you have In-Service Software Upgrade (ISSU) on Cumulus Linux?” and we both went like “What? Why?”
Dinesh is an honest engineer and answered: “No, we don’t do it” with absolutely no hesitation, but we both kept wondering, “Why exactly would you want to do that?”
Video: Scale-Out NAT
Network Address Translation (NAT) is one of those stateful services that’s almost impossible to scale out, because you have to distribute the state of the service (NAT mappings) across all potential ingress and egress points.
Midokura implemented a distributed stateful services architecture in their Midonet product, but faced severe scalability challenges, which they claim to have solved with more intelligent state distribution.
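To make the state-distribution problem concrete, here is a minimal Python sketch. It is not Midokura's implementation; the `NatNode` class, the `replicate` helper, and the port-allocation scheme are all hypothetical and exist only for illustration. It shows why return traffic that lands on a different NAT node fails until the mapping has been shared.

```python
from dataclasses import dataclass, field


@dataclass
class NatNode:
    """Toy NAT node (hypothetical): one public IP, local mapping table."""
    public_ip: str
    # public port -> (inside IP, inside port)
    mappings: dict = field(default_factory=dict)
    next_port: int = 20000

    def outbound(self, inside_ip, inside_port):
        """Allocate a public port for an inside flow and remember the mapping."""
        port = self.next_port
        self.next_port += 1
        self.mappings[port] = (inside_ip, inside_port)
        return (self.public_ip, port)

    def inbound(self, public_port):
        """Translate return traffic; None means no state, so the packet is dropped."""
        return self.mappings.get(public_port)


def replicate(src, dst):
    """Naive state distribution: copy every mapping from src to dst."""
    dst.mappings.update(src.mappings)


node_a = NatNode("192.0.2.1")
node_b = NatNode("192.0.2.1")

public = node_a.outbound("10.0.0.5", 12345)
print(node_b.inbound(public[1]))   # no state on node B yet
replicate(node_a, node_b)
print(node_b.inbound(public[1]))   # mapping is now known on node B
```

Copying every mapping to every node is the brute-force approach; the cost of doing that for each new flow is exactly what makes scale-out NAT hard.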
Vertically Integrated Musings
Packet Pushers podcast is a constant source of inspiration for my blog posts. Recently I stumbled upon Rob Sherwood’s explanation of how they package Big Cloud Fabric:
It’s a vertically integrated solution, from Switch Light OS to our SDN controller and Big Cloud Fabric application.
Really? What happened to openness and disaggregation?
Video: Implementing VLAN-aware Bridge with OpenFlow
Reinventing the wheel makes little sense. Implementing old solutions with new tools might be in the same category, but at least it shows you the power and shortcomings of the new tools.
Building a VLAN-aware bridge in OpenFlow is thus a mandatory case study, and as you’ll see in the video from the OpenFlow Deep Dive webinar, it’s not as easy as it looks. For more details, watch the whole OpenFlow webinar (6 hours of in-depth videos), which you also get by buying Advanced SDN Training or an ipSpace.net subscription.
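As a rough idea of what such a bridge has to do, here is a toy Python model of the forwarding logic: per-VLAN MAC learning, flooding limited to the ports in the VLAN, and dropping frames that arrive on a port that does not carry their VLAN. It is a sketch under stated assumptions, not the design from the webinar and not OpenFlow code: the `VlanAwareBridge` class and its `port_vlans` configuration are hypothetical, and a real controller would have to express the same decisions as flow entries.

```python
class VlanAwareBridge:
    """Toy model of VLAN-aware bridging logic (hypothetical, not an OpenFlow API)."""

    def __init__(self, port_vlans):
        # port number -> set of VLAN IDs allowed on that port
        self.port_vlans = port_vlans
        # (VLAN ID, MAC address) -> port where the MAC was last seen
        self.mac_table = {}

    def handle_frame(self, in_port, vlan, src_mac, dst_mac):
        """Return the list of output ports for a frame (empty list = drop)."""
        if vlan not in self.port_vlans.get(in_port, set()):
            return []  # VLAN not allowed on the ingress port
        # Learn the source MAC within its VLAN.
        self.mac_table[(vlan, src_mac)] = in_port
        out_port = self.mac_table.get((vlan, dst_mac))
        if out_port is not None:
            # Known destination: forward, but never back out the ingress port.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: flood to all other ports in the same VLAN.
        return sorted(
            port for port, vlans in self.port_vlans.items()
            if vlan in vlans and port != in_port
        )


bridge = VlanAwareBridge({1: {10}, 2: {10, 20}, 3: {20}})
print(bridge.handle_frame(1, 10, "aa", "bb"))  # unknown destination, flooded in VLAN 10
print(bridge.handle_frame(2, 10, "bb", "aa"))  # destination learned on port 1
```

Note that the MAC table is keyed by the (VLAN, MAC) pair, so the same MAC address seen in two VLANs is learned independently.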
Turn Your Training or Presentation into a Story
If you’re a regular reader of this blog, you know I always prefer knowledge over recipes. Unfortunately, it’s pretty hard to build that knowledge using the widely available training materials, which often just blast you with a barrage of facts that you’re supposed to memorize and deliver at the certification exam.
How about turning your training into a South Park episode?
Case Study: Scale-Out Cloud Infrastructure
I helped several customers design scale-out private or public cloud infrastructure. In every case, I tried to start with reasonably small pods (based on what they’d consider an acceptable loss unit – another great term I inherited from Chris Young), connected them to a shared L3 backbone (either within a data center or across multiple data centers), and then tried to address the inevitable desire for stretched layer-2 connectivity.
You’ll find a summary of these designs in my next ExpertExpress case study: Scale-Out Private Cloud Infrastructure, and if you need more details, I’m usually available for online consulting.
Network Monitoring in SDN Era on Software Gone Wild
A while ago Chris Young sent me a few questions about network management in the brave new SDN world. I never focused on network management, but I know a few people who do, including Terry Slattery and Matt Oswalt. Interop brought us all together, and we sat down one evening after the presentations to chat about the challenges of monitoring and managing SDN networks.
We started with easy things like comparing monitoring results from virtual and physical switches (why they’ll never match, and whether we even care), and quickly diverted into all sorts of potential oscillations caused by overly dynamic load balancing that relies on flow label-based ECMP and flowlets.
Don’t Be Overly Enthusiastic about Vendor Claims (This Time It's Brocade)
While running the first part of the Data Center Fabrics Update webinar last week, I mentioned that Brocade VDX 6740 supports Flex ports (a port you can use as a Fibre Channel or 10GE port), and someone immediately wrote a comment saying “so does VDX 6940”. I was almost sure Flex ports aren’t available on VDX 6940 yet, and as always turned to vendor documentation to figure it out.
As expected, the data sheet is a bit vague, somewhat reflecting reality, but also veering into the realm of futures instead of features. Here’s what they say:
Link Aggregation in OpenFlow Environment
One of my readers couldn’t figure out how to combine Link Aggregation Groups (LAG, aka Port Channel) with OpenFlow:
I believe that in LAG, every traditional switch would know how to forward the packet from its FIB. Now with OpenFlow, does the controller communicate with every single switch and populate their tables with one group ID for each switch? Or how does the controller figure out the information for multiple switches in the LAG?
As always, the answer is “it depends”, and this time we’re dealing with a pretty complex issue.
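One building block that often comes up in this context is the OpenFlow group of type select (available since OpenFlow 1.1): the group holds one bucket per LAG member port, and the switch picks a bucket for each packet. The specification leaves the selection algorithm to the switch, so the hash below is purely an assumption for illustration, and `select_bucket` is a hypothetical helper, not part of any OpenFlow library.

```python
import zlib


def select_bucket(member_ports, src_mac, dst_mac):
    """Pick one LAG member port per flow (illustrative hash, not from the spec).

    Hashing on fields that are constant within a flow keeps all packets
    of that flow on the same member link, which avoids reordering.
    """
    key = f"{src_mac}-{dst_mac}".encode()
    return member_ports[zlib.crc32(key) % len(member_ports)]


lag_members = [1, 2]  # hypothetical member ports of one LAG
port = select_bucket(lag_members, "00:00:00:00:00:01", "00:00:00:00:00:02")
print(port in lag_members)
```

The controller would still have to install such a group on every switch that has a LAG, which is one reason the answer depends on the implementation.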
vSphere 6 Networking Deep Dive Webinar Is Complete
Last week we finished the last session of the vSphere 6 Networking Deep Dive webinar – 6 hours of downloadable videos covering every single vSphere 6 networking topic are waiting for you.
As always, you get access to the webinar with your ipSpace.net subscription, or you can buy just this webinar, or one of the bundles that include it: Data Center track or Data Center Trilogy.
Segment Routing 101 on Software Gone Wild
With all the hype around Segment Routing we said: “let’s chat about it, what could possibly go wrong”. The result: Episode 33 of Software Gone Wild. We didn’t get very far into the technical details, but you might still find the overview useful (or not – do tell me how good or useless it is).
Stupidities of Switch Programming (written in June 2013)
In June 2013 I wrote a rant that got stuck in my Evernote Blog Posts notebook for almost two years. Sadly, not much has changed since I wrote it, so I decided to publish it as-is.
In the meantime, the only vendor that’s working on making generic network deployments simpler seems to be Cumulus Networks (most other vendors went down the path of building proprietary fabrics, be it ACI, DFA, IRF, QFabric, Virtual Chassis or proprietary OpenFlow extensions).
Arista used to be in the same camp (I loved all the nifty little features they were rolling out to make ops simpler), but it seems they lost their mojo after the IPO.
Do We Need NAC and 802.1x?
Another question I got in my Inbox:
What is your opinion on NAC and 802.1x for wired networks? Is there a better way to solve user access control at layer 2? Or is this a poor man's way to avoid network segmentation and internal network firewalls?
Unless you can trust all users (fat chance) or run a network with no access control (unlikely, unless you’re a coffee shop), you need to authenticate the users anyway.
Scaling OpenStack Security Groups
Security groups (or Endpoint Groups if you’re a Cisco ACI fan) are a nice traffic policy abstraction: instead of dealing with subnets and ACLs, define groups of hosts and the rules of traffic control between them… and let the orchestration system deal with IP addresses and TCP/UDP port numbers.
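The following Python sketch illustrates what the orchestration system has to do behind that abstraction: expand one group-to-group rule into address-level rules. It is a simplified assumption, not how OpenStack implements security groups; the `expand_rule` function, the group names, and the addresses are hypothetical.

```python
from itertools import product


def expand_rule(groups, src_group, dst_group, protocol, port):
    """Expand one group-level rule into (src IP, dst IP, protocol, port) tuples."""
    sources = sorted(groups[src_group])
    destinations = sorted(groups[dst_group])
    return [(src, dst, protocol, port) for src, dst in product(sources, destinations)]


# Hypothetical group membership maintained by the orchestration system.
groups = {
    "web": {"10.0.1.10", "10.0.1.11"},
    "db": {"10.0.2.20"},
}

for rule in expand_rule(groups, "web", "db", "tcp", 3306):
    print(rule)
```

With this naive expansion, the number of address-level rules is the product of the two group sizes, which hints at why scaling security groups becomes a problem as the groups grow.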