Why Is Every SDN Vendor Bashing the Networking Engineers?
This blog post was written in 2014 (and sat half-forgotten in a Word file somewhere in my Dropbox), but since it seems not much has changed in the meantime, it’s time to publish it anyway.
I was listening to the fantastic (now gone) SDN Trinity podcast while biking around Slovenian hills and almost fell off the bike while furiously nodding to a statement along the lines of “I hate how every SDN vendor loves to bash networking engineers.”
There’s a pretty good reason for that behavior: the vendors know they wouldn’t be able to sell their latest concoctions to people who actually understand how networking works and why some architectures have no chance of ever working in real life (see also RFC 1925 section 2.11). The only way to sell the warez is to convince everyone else to get rid of the pesky ossified CLI jockeys.
To be honest, we tend to be somewhat cautious (the storage engineers are no better), but the real reason for the morass we’re dealing with today lies somewhere else: application development and deployment processes are often totally broken, so the problems get pushed down the stack and increase the complexity of the lower layers.
No technology will ever solve that problem. We have to change the IT processes, but of course that’s not the message that would sell new shiny gadgets. It’s so much easier to blame the network.
I have no doubt vendors would pitch a case to the business that it would see a return in the wages no longer paid to network engineers, and that that saving would ultimately turn into a revenue stream for those same vendors.
Until something goes horribly, seriously wrong and nobody has the skills left to troubleshoot with any depth, because it's all so abstracted. All the vendors have done is push complexity at you behind a veneer of simplicity, and they profit again by selling you "professional services" to fix the issue, or additional licenses, and so on.
These platforms aren't being sold to "us" to help us...
Wasn't your recent blog post "Complexity Sells"? Perhaps "Vendor Fluff and Ignorance" sell too... and there'd be plenty more ignorance if there were fewer network engineers.
As network guys we try to solve world hunger with our very limited tool set. For example, we throw all sorts of problems at BGP because that is what we "know", while others who looked at routing differently reinvented peer-to-peer networking, where gazillions (that is a lot of...) of hosts and their content are discovered and reached every day... but what do they know, they are not CCIEs. (I'm not saying we should replace BGP with one of the BitTorrent protocols, but we could learn a few things from that experience.) Things evolve slowly, as human nature is generally reluctant to change.
The SDN disruption is somewhat similar to what MPLS brought in the late 90s and early 00s. The "real" network engineers at the time were dealing with ATM, Frame Relay, and the like; IP was still seen as a best-effort enterprise protocol, and the Internet was only starting to get more widely deployed. I remember long discussions with financial institutions arguing that they would NEVER abandon their leased lines for a shared, unreliable transport such as MPLS. Find me one of those banks today that is not using MPLS.
Moving complexity to another layer is not new either, as RFC 1925 stated years ago (by the way, I'm a BIG fan of that RFC), and yes, certain problems are better solved elsewhere than in the IP stack. As someone who has spent 30 years in this industry, I am always humbled and impressed by the problem-solving creativity I encounter at clients.
I think you are right about one of the trends when you say IT is "broken": not so much broken as ill-suited to the accelerated delivery and the move to XaaS (pick your "X") we are living through today.
SDN is here to stay, but we will see what form it takes over time. We'd better get used to it: you can't fight the tide, but you can learn to surf ;)
The silo boundary between application and server / VM administration has weakened in the DevOps era.
The silo boundary between application and storage is starting to weaken, particularly for modern applications which use object or other non-SCSI-based storage.
The silo boundary between applications and networking still stands. The lack of understanding and transparency across this boundary leads not only to suboptimal run-time results (remember last year's database app which was managing a pool of connections in a way that broke ECMP, and the resulting desire for the database to provide the ECMP hash instead of letting the network do so?), but also to disjoint processes (despite progress, I daresay we won't see total Ansible/Puppet/Chef setup of networks in the general case) and to a we/they divide around the network silo (as evidenced by the thesis of this blog post and the comments above).
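To make the ECMP aside a bit more concrete, here is a minimal, hypothetical Python sketch of flow-based 5-tuple hashing. The addresses, ports, next-hop names, and the hash function are all made up for illustration (real switch ASICs use their own hash algorithms and seeds, and this is not the database app mentioned above); the point is only that the next-hop choice is deterministic per flow, so a connection pool that reuses a handful of long-lived sessions keeps all its traffic on the same few uplinks.

```python
import hashlib

# Hypothetical equal-cost next hops (names invented for illustration)
NEXT_HOPS = ["spine-1", "spine-2", "spine-3", "spine-4"]

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    """Pick a next hop from a 5-tuple, roughly the way switches hash flows.

    Real hardware uses its own hash functions and seeds; this sketch only
    shows that the choice is deterministic for a given flow.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return NEXT_HOPS[digest % len(NEXT_HOPS)]

# A connection pool reusing the same few source ports produces the same few
# hashes, so its traffic is pinned to the same few equal-cost paths...
pool_ports = [50001, 50002]
print({p: ecmp_next_hop("10.1.1.10", "10.2.2.20", p, 5432) for p in pool_ports})

# ...while many connections with varying source ports spread roughly evenly
# across all equal-cost paths.
many_ports = range(50000, 50016)
print({p: ecmp_next_hop("10.1.1.10", "10.2.2.20", p, 5432) for p in many_ports})
```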
I remain firm in my conviction, despite the reality that the gross margin which supports my paycheck would be impacted, that in this decade some graduate students who don't know any better will write software which is orders of magnitude smaller and simpler than what we run in data center switches today, aligned with Ansible / Puppet / Chef or some variant. They'll do this because they couldn't figure out how to accomplish what they needed to with today's networking. They won't call it SDN. But it will turn out to scale to a data center fault containment zone, be "good enough", and most importantly be completely transparent and intuitive to app developers, DevOps types, and server/VM admin types. The network silo will be pushed back to the data center boundary. Because we were too wrapped up in how we've always done things to truly understand that the "friction" at the boundary between the network silo and that which runs applications is unsustainable.
Or I could be all wrong. :)
@FStevenChalmers
(speaking for self, works in HPE networking)
Who? Network Engineers?
That friction cuts both ways. As Ivan has previously noted, we've spent decades, as an entire industry, shoving shit further and further down the stack for "those people" to solve: Network Engineers (customer and vendor). Did you decide to write an application that listens on all 65000 ports for dynamic client connections? No... no... let's not fix the application... let's have the network guys fix that.
What happens when an app developer needs an end-to-end "class A" scrub of a web application? Most of the time, they call a senior network engineer: somebody with the ability to capture the network bones with Wireshark (or whatever) and then read them back in a way the app developer might understand.
The network is in the state it's in, almost entirely, because of demands from applications. Many design considerations in networking have nothing whatsoever to do with the network per se, but everything to do with how horribly written and insecure applications are.
We can't just be all speeds and feeds. Invariably we end up slicing and dicing the network. We invariably end up contorting the network, not because of limitations of the network, but because the revenue from potential customers always takes priority over our best-laid plans. The network does not have veto power at the end of the Jenkins build cycle. That app is getting deployed, and we *will* accommodate it.
We are far more than just VLAN herders. Because of all these application idiosyncrasies, *we* end up being the go-to team when the app guys have no idea what is happening.
Invariably.
The insanity of many SDN companies is that they think they will find a way to adequately encapsulate all of the chaos and present the engineer with adequate tools to deal with all the issues they will... invariably... have to deal with. When they poop on network engineers, they are making a bold, bold assertion: that network engineers, who have been on the receiving end, propping up thousands of crap applications for decades, can be replaced or marginalized by SDN tools that solve 10% of the problem but have beautiful interfaces and an A-1 marketing team promoting them.
When I was in the server business, there were folks down the hall from me writing OS and subsystem APIs. All proprietary and opaque, back then. Sometimes the applications called them in unanticipated ways, with undesirable results. Linux brought transparency to this space -- you can at least see through the API now.
When I was in the storage business, we had this formal SCSI interface between the storage stack on the server and the software stack in the disk array. Each stack was proprietary, and neither group of people could really see into the other's code. Years of optimizations had led to a very complex tree of what was a fast path and what was a slow path through each piece of code, and as a result application developers occasionally wrote their code in a way that hit slow paths somewhere and brought the storage system to its knees (typically not just for themselves but for every other app as well). So we had storage admins madly trying to redistribute data to match the local workload's I/Os per second with what the poor little spindles could actually do.
Networking is more of the same, only the boundary is more opaque. As a long-ago app developer (yes, I've written COBOL and typed my code onto punch cards), a server developer, and then a storage developer, I saw networking through the narrow window of what the system's software stack saw -- or what Wireshark running on that system could tell me. I didn't even know what I didn't know until coming over to the networking business 7-8 years ago to help with FCoE. Wow, both at the amazing things modern network devices can do (but only under the control of a highly skilled network engineer typing incantations) and at how easy it is to take a network down with just one little mistake. And how hard it is to defend against denial of service (much less DDoS). The list goes on.
My long-winded point here is that the only way to fix this is to stop making networking an arcane black box that only high priests can control, and instead make it as open and transparent as server operating systems and storage systems are becoming. There is one huge obstacle: networking as it stands today is simply too complex and interrelated for us to make it usefully transparent and intuitive to a typical app developer. (Note I'm not saying we should make the Internet backbone simple and transparent; I'd start with microservices in the same data center fault containment zone.) Whether we call it networking for hyperconverged systems, or container networking, or a server-centric data center network like Calico (Canal), I think that's one of our big challenges for the next decade.
-steve
@FStevenChalmers
"Do you architect your network around an application(s) or do you architect your application(s) around your network"
Well, what's the difference then, you could say? Network Engineers will still be needed in those big vendor companies to create the products in the first place!
I think it's more subtle than that: the network engineer has acquired a "last of the Mohicans" status, defending vendor-independent protocols as well as basic knowledge of the technologies he used to deal with. That state of mind created something like a critical background for literally every decision he made.
If we go back in time, developers lost their soul years ago; it started maybe with Java, maybe only with the first very popular frameworks. As soon as you create a layer to simplify something that is apparently too complex, you also abstract away everything underneath. With that abstraction comes the direct consequence that fewer and fewer people are aware of the underlying layers, and no one gets paid to rethink them, because it doesn't sell. Sadly, that is the very mission of all engineers: to reconsider things from the ground up when it's needed.
Hope I'll remember that in 20 years.