Before we start: if you’re new to my blog (or stumbled upon this blog post by incident) you might want to read the Considerations for Host-Based Firewalls for a brief overview of the challenge, and my explanation why flow-tracking tools cannot be used to auto-generate firewall policies.
As expected, the “you cannot do it” post on LinkedIn generated numerous comments, ranging from good ideas to borderline ridiculous attempts to fix a problem that has been proven to be unfixable (see also: perpetual motion).
While I keep telling you that Google-sized solutions aren’t necessarily the best fit for your environment, some of the hyperscaler presentations contain nuggets that apply to any environment no matter how small it is.
One of those must-watch presentations is Fault Tolerance through Optimal Workload Placement together with a wonderful TL&DR summary by the one-and-only Todd Hoff of the High Scalability fame.
Every now and then I call someone’s baby ugly (or maybe it was their third cousin’s baby and they nonetheless feel offended). In such cases a common resort is to cite business or market needs to prove how ignorant and clueless I am. Here’s a sample LinkedIn comment talking about my ignorance about the need for smart NICs:
The rise of custom silicon by Presando [sic], Mellanox, Amazon, Intel and others confirms there is a real market need.
Now let’s get something straight: while there are good reasons to use tons of different things that might look inappropriate, irrelevant or plain stupid to an outsider, I don’t believe in real market need argument being used to justify anything without supporting technical facts (tell me why you need that stuff and prove to me that using it is the best way of solving a problem).
A friend of mine involved in multiple Cisco ACI installations sent me this comment on their tenant connectivity model:
I’m a bit allergic to ACI. The abstraction is mis-aligned with familiar configurations, in particular contracts being independent of and over-riding routing, tenants, etc. You can really make a mess with that, and I’ve seen some! One needs to impose some structure, naming conventions…, and most people don’t seem to get that done.
As I noticed in the NSX-or-ACI webinar, it’s interesting how NSX decided to stay with the familiar VLAN/routing/filtering paradigm (more details), whereas the designers of Cisco ACI decided to go down a totally different path.
You might remember my occasional rants about lack of engineering in networking. A long while ago David Barroso nicely summarized the situation in a tweet responding to my BGP and Car Safety blog post:
If we were in a proper engineering we’d be discussing how to regulate and add safeties to an important tech that is unsafe and hard to operate. Instead, we blog about how to do crazy shit to it or how it’s a hot mess. Let’s be honest, if BGP was a car it’d be one pulled by horses.
In June 2020 I published the first part of Redundant Server Connectivity in Layer-3-Only Fabrics article describing the target design and application-layer requirements.
Justin Pietsch published a fantastic recap of his experience running OSPF in AWS infrastructure. You MUST read what he wrote, here’s the TL&DR summary:
- Contrary to popular myths, OSPF works well on very large leaf-and-spine networks.
- OSPF nuances are really hard to grasp intuitively, and the only way to know what will happen is to run tests with the same codebase you plan to use in production environment.
Dinesh Dutt made similar claims on one of our podcasts, and I wrote numerous blog posts on the same topic. Not that anyone would care or listen, it’s so much better to watch vendor slide decks full of latest unicorn dust… but in the end, it’s usually not the protocol that’s broken, but the network design.
When someone sent me a presentation on seamless MPLS a long while ago my head (almost) exploded just by looking at the diagrams… or in the immortal words of @amyengineer:
“If it requires a very solid CCIE on an obscure protocol mix at 4am, it is a bad design” - Peter Welcher, genius crafter of networks, granter of sage advice.
This blog post was initially sent to the subscribers of my SDN and Network Automation mailing list. Subscribe here.
Adam left a thoughtful comment addressing numerous interesting aspects of network design in the era of booming automation hype on my How Should Network Architects Deal with Network Automation blog post. He started with:
A question I keep tasking myself with addressing but never finding the best answer, is how appropriate is it to reform a network environment into a flattened design such as spine-and-leaf, if that reform is with the sole intent and purpose to enable automation?
A few basic facts first:
One of the readers commenting the ideas in my Disaster Recovery and Failure Domains blog post effectively said “In an active/passive DR scenario, having L3 DCI separation doesn’t protect you from STP loop/flood in your active DC, so why do you care?”
He’s absolutely right - if you have a cold disaster recovery site, it doesn’t matter if it’s bombarded by a gazillion flooded packets per second… but how often do you have a cold recovery site?
A long while ago I decided to write an article explaining how you could run VMware NSX on ESXi servers with redundant connections to two top-of-rack switches on top of a layer-3-only fabric (a fabric with IP subnets and VLANs limited to a single top-of-rack switch). Turns out that’s Mission Impossible, so I put the article on the back burner and slowly forgot about it.
Well, not exactly. Every now and then my subconsciousness would kick it up and I’d figure out yet-another reason why it’s REALLY hard to do it right. After a while, I decided to try again, and completely rewrote the article. The first part is already online, more details coming (hopefully) soon.
I got this question about the use of AS numbers on data center leaf switches participating in an MLAG cluster:
In the Leaf-and-Spine Fabric Architectures you made the recommendation to have the same AS number on all members of an MLAG cluster and run iBGP between them. In the Autonomous Systems and AS Numbers article you discuss the option of having different AS number per leaf. Which one should I use… and do I still need the EBGP peering between the leaf pair?
As always, there’s a bit of a gap between theory and practice ;), but let’s start with a leaf-and-spine fabric diagram illustrating both concepts:
Whenever I was comparing VMware NSX and Cisco ACI a few years ago (in late 2010s in case you’re reading this in a far-away future), someone would inevitably ask “and how would you connect a bare metal server to a VMware NSX environment?”
While NSX-T has that capability since release 2.5 (more about that in a later blog post), let’s start with the big question: why would you need to?
Got mentioned in this tweet a while ago:
Watching @ApstraInc youtube stream regarding BGP in the DC with @doyleassoc and @jtantsura.Maybe BGP is getting bigger and bigger traction from big enterprise data centers but I still see an IGP being used frequently. I am eager to have @ioshints opinion on that hot subject.
Maybe I’ve missed some breaking news, but assuming I haven’t my opinion on that subject hasn’t changed.
One of the attendees of our Building Next-Generation Data Center online course submitted a picture-perfect solution to scalable layer-2 fabric design challenge:
- VXLAN/EVPN based data center fabric;
- IGP within the fabric;
- EBGP with the WAN edge routers because they’re run by a totally different team and they want to have a policy enforcement point between the two;
- EVPN over IBGP within the fabric;
- EVPN over EBGP between the fabric and WAN edge routers.
The only seemingly weird decision he made: he decided to run the EVPN EBGP session between loopback interfaces of core switches (used as BGP route reflectors) and WAN edge routers.