Blog Posts in May 2015
A while ago Chris Young sent me a few questions about network management in the brave new SDN world. I never focused on network management, but I know a few people who do, including Terry Slattery and Matt Oswalt. Interop brought us all together, and we sat down one evening after the presentations to chat about the challenges of monitoring and managing SDN networks.
We started with easy things like comparing monitoring results from virtual and physical switches (why they’ll never match, and whether we even care), and quickly digressed into all sorts of potential oscillations triggered by overly dynamic load balancing that relies on flow label-based ECMP and flowlets.
I was running the first part of the Data Center Fabrics Update webinar last week, mentioned that Brocade VDX 6740 supports Flex ports (a port you can use as Fibre Channel or 10GE port), and someone immediately wrote a comment saying “so does VDX 6940”. I was almost sure Flex ports aren’t available on VDX 6940 yet, and as always turned to vendor documentation to figure it out.
As expected, the data sheet is a bit vague, somewhat reflecting reality, but also veering into the realm of futures instead of features. Here’s what they say:
Open vSwitch Database Management Protocol (OVSDB, RFC 7047) is often mentioned together with other semi-magic SDN tools that will bring everlasting peace to the chaotic world of networking. In reality, it’s just a database access/update protocol (think SQL with JSON encoding) with an interesting twist: a client can request notifications about table or row updates, replacing periodic database polling with a pub-sub solution.
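To make the pub-sub twist a bit more tangible, here is a minimal sketch of the JSON-RPC messages involved, assuming the `monitor` request and `update` notification layouts described in RFC 7047. The helper names, the `bridge-watch` monitor identifier, and the row UUID are made up for illustration, and nothing is sent over a socket.

```python
import json


def build_monitor_request(db, monitor_id, tables, request_id):
    """Build a 'monitor' request: params are [db-name, monitor-id, monitor-requests].

    tables maps a table name to the list of columns the client wants to watch.
    """
    monitor_requests = {
        table: {"columns": columns} for table, columns in tables.items()
    }
    return {
        "method": "monitor",
        "params": [db, monitor_id, monitor_requests],
        "id": request_id,
    }


def changed_rows(message, monitor_id):
    """Return {table: [row UUIDs]} for a matching 'update' notification, else {}."""
    # Notifications carry a null id; anything else is a request or a reply.
    if message.get("method") != "update" or message.get("id") is not None:
        return {}
    notified_id, table_updates = message["params"]
    if notified_id != monitor_id:
        return {}
    return {table: sorted(rows) for table, rows in table_updates.items()}


# Subscribe to changes of the "name" column in the Bridge table.
request = build_monitor_request(
    "Open_vSwitch", "bridge-watch", {"Bridge": ["name"]}, request_id=1
)
wire_request = json.dumps(request)

# Hand-written sample notification (the UUID is invented for this example).
sample_update = json.loads(
    '{"method": "update", "id": null, "params": ["bridge-watch", '
    '{"Bridge": {"1f3c2a9e-0000-4000-8000-000000000001": '
    '{"new": {"name": "br0"}}}}]}'
)
print(changed_rows(sample_update, "bridge-watch"))
```

The client sends the request once and then simply waits for `update` messages instead of polling the tables.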
One of my readers couldn’t figure out how to combine Link Aggregation Groups (LAG, aka Port Channel) with OpenFlow:
I believe that in LAG, every traditional switch would know how to forward the packet from its FIB. Now with OpenFlow, does the controller communicate with every single switch and populate their tables with one group ID for each switch? Or how does the controller figure out the information for multiple switches in the LAG?
As always, the answer is “it depends”, and this time we’re dealing with a pretty complex issue.
With all the hype around Segment Routing we said: “let’s chat about it, what could possibly go wrong”. The result: Episode 33 of Software Gone Wild. We didn’t get very far into the technical details, but you might still find the overview useful (or not – do tell me how good or useless it is).
In June 2013 I wrote a rant that got stuck in my Evernote Blog Posts notebook for almost two years. Sadly, not much has changed since I wrote it, so I decided to publish it as-is.
In the meantime, the only vendor that’s working on making generic network deployments simpler seems to be Cumulus Networks (most other vendors went down the path of building proprietary fabrics, be it ACI, DFA, IRF, QFabric, Virtual Chassis or proprietary OpenFlow extensions).
Arista used to be in the same camp (I loved all the nifty little features they were rolling out to make ops simpler), but it seems they lost their mojo after the IPO.
Another question I got in my Inbox:
What is your opinion on NAC and 802.1x for wired networks? Is there a better way to solve user access control at layer 2? Or is this a poor man's way to avoid network segmentation and internal network firewalls?
Unless you can trust all users (fat chance) or run a network with no access control (unlikely, unless you’re a coffee shop), you need to authenticate the users anyway.
Security groups (or Endpoint Groups if you’re a Cisco ACI fan) are a nice traffic policy abstraction: instead of dealing with subnets and ACLs, define groups of hosts and the rules of traffic control between them… and let the orchestration system deal with IP addresses and TCP/UDP port numbers.
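A rough sketch of what "let the orchestration system deal with IP addresses" could mean in practice: the operator writes the policy in terms of groups, and a small function expands it into address-level rules. The group names, addresses, and rule tuple layout are invented for illustration and are not tied to any particular product.

```python
from itertools import product

# Hypothetical inventory: group membership as the orchestration system sees it.
groups = {
    "web": ["10.0.1.10", "10.0.1.11"],
    "db": ["10.0.2.20"],
}

# Policy written against groups: (source group, destination group, protocol, port).
policy = [("web", "db", "tcp", 3306)]


def expand_policy(groups, policy):
    """Expand group-level rules into (src_ip, dst_ip, protocol, port) entries."""
    rules = []
    for src_group, dst_group, proto, port in policy:
        for src_ip, dst_ip in product(groups[src_group], groups[dst_group]):
            rules.append((src_ip, dst_ip, proto, port))
    return rules


rules = expand_policy(groups, policy)
for rule in rules:
    print(rule)
```

Adding a third web server changes only the `groups` dictionary; the policy itself stays untouched, which is the whole point of the abstraction.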
Several of the conversations I had at the recent RIPE70 meeting were focused on career advice (usually along the lines of “which technology should I focus on next”) and inevitably we ended up discussing the benefits of T-shaped skills versus I-shaped skills… and I couldn’t resist drawing a few graphs illustrating them.
When preparing for my Simplifying Application Workload Migration workshop (coming in webinar format in autumn) I tried to find a solution that would allow me to recreate existing enterprise virtual network infrastructure in a cloud environment. Soon I stumbled upon Ravello Systems, remembered hearing about them on a CloudCast.net podcast, and got in touch with them to figure out whether they could help me solve that challenge.
It turned out you might use Ravello Systems’ solution to implement disaster recovery, but I got way more excited about the possibility of using their solution for labs or testing. To learn more about that, listen to Episode 32 of Software Gone Wild.
Hank left a lovely comment on my Rearchitecting L3-Only Networks blog post:
What you describe is literally intra-area routing in CLNS.
He’s absolutely right (and I admitted as much during my IPv6 Microsegmentation presentations @ Troopers 15).
From the automation perspective, the RIPE conference is a dream come true – 30 seconds after you upload your presentation, it appears on the RIPE web site, it’s automatically updated on the podium computer, and the video recording of your talk is published before you even manage to get off the podium – so you can already watch my “SDN - 4 years later (aka Quo Vadis, SDN?)” presentation if you missed it yesterday.
Jsicuran left this comment on my You Must Understand the Fundamentals to Be Successful blog post:
I just went through some Cisco webinar where they were showcasing the use of NX-OS API and Python to add a VLAN. I do some Python myself and have used that API for some simple DevOps-like uses, but for the most part if you are an enterprise and use Prime DCIM to add VLANs, why should you go through the coding process?
It obviously depends on where you are in your IT automation journey.
Great news for everyone trying to deploy IPv6 in OpenStack: the Kilo release has full support for IPv6 in the tenant networks, including SLAAC, stateless and stateful DHCPv6. For more details, read an extensive blog post by Shannon McFarland.
When I finished my SDN workshop @ Interop Las Vegas (including a chapter on OpenFlow limitations), some attendees started wondering whether they should even consider OpenFlow in their SDN deployments. My answer: don’t blame the tool if people use it incorrectly.
Two days later, I discovered HP is one of those companies that know how to use that tool.
John Jackson wrote an interesting comment on my Rearchitecting L3-Only Networks blog post:
What the host has configured for its default gateway doesn't really matter, correct? Because the default gateway in traditional L2 access networks really isn't about the gateway's IP address, but the gateway's MAC address. The destination IP address in the packet header is always the end destination IP address, never the default gateway.
He totally got the idea; however, there are a few minor details to consider.
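Leaving the minor details aside, the basic behavior John describes can be sketched in a few lines: the host uses the default gateway only to pick a destination MAC address, while the destination IP address in the packet stays the final destination. The function name, the addresses, and the static `arp_table` are assumptions made for this example; a real host would resolve the MAC address with ARP or ND.

```python
import ipaddress

# Hypothetical ARP cache of a host in 10.0.0.0/24 with default gateway 10.0.0.1.
arp_table = {
    "10.0.0.1": "00:00:5e:00:01:01",  # default gateway
    "10.0.0.7": "02:00:00:00:00:07",  # neighbor on the same subnet
}


def build_frame(dst_ip, subnet, gateway_ip, arp_table):
    """Pick the destination MAC the way a host in a traditional L2 access network would."""
    on_link = ipaddress.ip_address(dst_ip) in ipaddress.ip_network(subnet)
    # Off-link traffic is sent to the gateway's MAC address...
    next_hop = dst_ip if on_link else gateway_ip
    # ...but the IP header always carries the final destination.
    return {"dst_mac": arp_table[next_hop], "dst_ip": dst_ip}


print(build_frame("192.0.2.50", "10.0.0.0/24", "10.0.0.1", arp_table))
print(build_frame("10.0.0.7", "10.0.0.0/24", "10.0.0.1", arp_table))
```

The gateway's IP address appears nowhere in the resulting frame; it is only the lookup key for the MAC address.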
One of the topics I discussed in the IPv6 High Availability webinar is the problem of dual-stack deployments – what do you do when the end-to-end path for one of the protocol stacks breaks down? Happy Eyeballs is one of the solutions, as is an IPv6-only data center (Facebook is moving in that direction really fast). For more details, watch the short End-to-End High Availability in Dual Stack Networks video available with Free Subscription.
Occasionally I’d invite a vendor speaker (usually working for an interesting startup) to present in my Data Center Fabrics webinar series. Dan Backman from Plexxi was talking about affinity networking in 2013, and in the May 2015 update session we’ll have Dinesh Dutt from Cumulus Networks talking about their software platform, architectures you can build with whitebox (or britebox) switches running Cumulus Linux, exciting network automation options, and cool new features they’re constantly adding to their software.
One of my readers sent me this question:
After reading this blog post and a lot of blog posts about zero trust mode versus security zones, what do you think about replacing L3 Data Center core switches by High Speed Next Generation Firewalls?
Long story short: just because someone writes about an idea doesn’t mean it makes sense. Some things are better left in PowerPoint.
I recently read a must-read blog post by Russ White in which he argued that you need to understand both theory and practice (see also Knowledge or Recipes and my other certification rants) and got a painful flashback of a discussion I had with a corner-cutting SE (fortunately he was an exception) almost two decades ago when I was teaching my Advanced OSPF course at Cisco.
I was talking about “application-layer gateways” on firewalls and NAT boxes with a fellow engineer, and we came to an interesting conclusion: in most cases they are not gateways; they don’t add any significant functionality apart from payload fixups for those broken applications that think carrying network endpoint information in application packets is a good idea (I’m looking at you, SIP and FTP). These things should thus be called Application Layer Fixups or ALFs ;)
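To illustrate what such a payload fixup looks like, here is a minimal sketch that rewrites the endpoint embedded in an FTP `PORT` command (six comma-separated numbers: four address octets followed by the high and low byte of the port). The function name and all addresses are invented for the example; a real NAT box would also have to allocate the translated port and adjust TCP sequence numbers when the payload length changes.

```python
import re

# PORT h1,h2,h3,h4,p1,p2 where the port number is p1 * 256 + p2.
PORT_COMMAND = re.compile(rb"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\r\n$")


def fixup_port_command(line, public_ip, public_port):
    """Rewrite the endpoint inside a PORT command; pass any other line through."""
    if not PORT_COMMAND.match(line):
        return line
    high, low = divmod(public_port, 256)
    fields = public_ip.split(".") + [str(high), str(low)]
    return b"PORT " + ",".join(fields).encode("ascii") + b"\r\n"


# Inside client 10.0.0.5 announces port 50000 (195 * 256 + 80).
original = b"PORT 10,0,0,5,195,80\r\n"
translated = fixup_port_command(original, "203.0.113.9", 40000)
print(translated)
```

The only "application-layer" work here is a string substitution, which is why calling the function a gateway feels like a stretch.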