Category: network management
After listening to the Open-Source Network Engineer Toolbox podcast, Nick Buraglio sent me an email saying “we should do another podcast on open-source network management tools…” and so we did. In Episode 56 of Software Gone Wild, Nick, Elisa Jasinska and I discussed a whole range of network management challenges and the open-source tools you can use to address them.
Imagine you could design your network by documenting the desired traffic flows across the network under all failure conditions, and only then do the low-level design, create the configurations, and deploy the network… all while using the desired traffic flows as a testing tool to verify that the network still behaves as expected, both in a test lab and in the live network.
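To make the idea more tangible, here’s a toy sketch of flow-based validation: the desired flows become test data checked against a network model. The topology, the names, and the trivial reachability check are all hypothetical; real verification tools model actual forwarding behavior, not just connectivity.

```python
# Toy flow-based validation sketch. Topology and flow definitions are
# hypothetical -- a real tool would model forwarding, not just connectivity.
from collections import deque

# Hypothetical topology as an adjacency list of devices/links
TOPOLOGY = {
    "web": ["leaf-1"], "db": ["leaf-2"],
    "leaf-1": ["web", "spine-1", "spine-2"],
    "leaf-2": ["db", "spine-1", "spine-2"],
    "spine-1": ["leaf-1", "leaf-2"],
    "spine-2": ["leaf-1", "leaf-2"],
}

# Desired flows: (source, destination, nodes whose failure must not break the flow)
DESIRED_FLOWS = [
    ("web", "db", ["spine-1"]),
    ("web", "db", ["spine-2"]),
]

def check_path(src, dst, failed=()):
    """Breadth-first search over the topology with failed nodes removed."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in TOPOLOGY.get(node, []):
            if neighbor not in seen and neighbor not in failed:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

for src, dst, failures in DESIRED_FLOWS:
    assert check_path(src, dst), f"{src} -> {dst} broken in healthy network"
    assert check_path(src, dst, failed=failures), \
        f"{src} -> {dst} broken with {failures} down"
print("all desired flows verified")
```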
Time for another fill-in-the-blanks survey: how many vendors support NETCONF and/or REST API in their data center switches, routers, firewalls and load balancers?
Please help me complete the tables by writing a comment – and do keep in mind that it only counts if it’s documented in a public configuration guide on the vendor’s web site.
Also, I’m not aware of any vendor using standard NETMOD YANG models. If someone does, please let me know.
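For context, this is roughly what checking a device’s NETCONF support (and the YANG models it advertises) looks like with the ncclient Python library; the address and credentials are placeholders:

```python
# Minimal NETCONF sanity check using ncclient (pip install ncclient).
# Device address and credentials are placeholders -- adjust to your environment.
from ncclient import manager

with manager.connect(
    host="10.0.0.1",          # placeholder management address
    port=830,                 # default NETCONF-over-SSH port
    username="admin",
    password="admin",
    hostkey_verify=False,
) as m:
    # The capabilities exchanged in the NETCONF hello tell you which
    # (if any) standard or vendor YANG models the box claims to support.
    for capability in m.server_capabilities:
        print(capability)
```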
During my recent SDN workshops I encountered several networking engineers who use Nexus 1000V in their data center environment, and some of them claimed their organization decided to do so to ensure the separation of responsibilities between networking and virtualization teams.
There are many good reasons one would use Nexus 1000V, but the one above is definitely not one of them.
How do you capture all the flows entering or exiting a data center if your core Nexus 7000 switch cannot do it in hardware? You take an x86 server, load nProbe on it, and feed the nProbe exports into an analysis system built with the ELK stack… at least that’s what Clay Curtis did (and documented in a blog post).
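Once the flow records are in Elasticsearch, querying them is the easy part. Here’s a rough sketch of a top-talkers aggregation using the elasticsearch Python client (8.x); the index pattern and field names depend entirely on how the nProbe/Logstash pipeline is configured, so treat them as assumptions:

```python
# Top-talkers query against flow records stored in Elasticsearch.
# Index pattern and field names are assumptions -- they depend on how
# the nProbe/Logstash pipeline was configured.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="netflow-*",        # assumed index pattern
    size=0,
    aggs={
        "top_talkers": {
            "terms": {"field": "src_addr", "size": 10},   # assumed field name
            "aggs": {"bytes": {"sum": {"field": "bytes"}}},  # assumed field name
        }
    },
)

for bucket in resp["aggregations"]["top_talkers"]["buckets"]:
    print(bucket["key"], bucket["bytes"]["value"])
```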
A recent well-publicized network outage prompted someone to start collecting fat-finger horror stories, and dozens of networking engineers were quick to chime in. Enjoy!
A while ago someone working for an IT-focused media site approached me with a short list of high-level questions. Not sure when they’ll publish the answers, so here they are in case you find them interesting:
What can enterprises do to ensure that their infrastructure is ready for next-gen networking technology implementations emerging in the next decade?
Next-generation networks will probably rely on existing architectures and forwarding mechanisms, while being significantly more uniform and heavily automated.
A while ago Chris Young sent me a few questions about network management in the brave new SDN world. I never focused on network management, but I know a few people who do, including Terry Slattery and Matt Oswalt. Interop brought us all together, and we sat down one evening after the presentations to chat about the challenges of monitoring and managing SDN networks.
We started with easy things like comparing monitoring results from virtual and physical switches (why they’ll never match, and whether we even care), and quickly veered into all sorts of potential oscillations caused by overly dynamic load balancing based on flow-label ECMP and flowlets.
PF_RING is a great open-source project that enables extremely fast packet processing on x86 servers, so I was more than delighted when Paolo Lucente of pmacct fame introduced me to Luca Deri, the author of PF_RING.
When we started chatting, we couldn’t resist mentioning ntopng, another open-source project Luca is working on.
Christoph Jaggi, the author of the Metro Ethernet and Carrier Ethernet Encryption Market Overview, published an awesome follow-up document: an evaluation guide that lists most of the gotchas one has to be aware of when considering encryption gear, from deployment scenarios, network overhead and key exchange details to operational considerations. If you have to deal with any aspect of network encryption, this document is a must-read.
In a world ruled by OpenFlow you’d expect the OpenFlow controller to know all the traffic; in more traditional networks we use technologies like NetFlow, sFlow or IPFIX to report traffic statistics. Regardless of the underlying mechanism, you need a tool that collects the statistics, aggregates them in a way that makes them usable to network operators, reports them, and potentially acts on deviations.
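To illustrate what such a tool has to do at the lowest level, here’s a minimal sketch of a NetFlow v5 listener that parses incoming datagrams and prints per-flow packet and byte counts; a real collector would add aggregation, storage, reporting and alerting on top of it:

```python
# Minimal NetFlow v5 listener: parses the fixed 24-byte header and the
# 48-byte flow records, printing source/destination and traffic counts.
import socket
import struct

HEADER_FMT = "!HHIIIIBBH"                  # NetFlow v5 header (24 bytes)
RECORD_FMT = "!4s4s4sHHIIIIHHBBBBHHBBH"    # NetFlow v5 flow record (48 bytes)
HEADER_LEN = struct.calcsize(HEADER_FMT)
RECORD_LEN = struct.calcsize(RECORD_FMT)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 2055))               # common (but not mandatory) NetFlow port

while True:
    data, addr = sock.recvfrom(8192)
    version, count = struct.unpack_from("!HH", data, 0)
    if version != 5:
        continue                           # this sketch handles NetFlow v5 only
    for i in range(count):
        rec = struct.unpack_from(RECORD_FMT, data, HEADER_LEN + i * RECORD_LEN)
        src = socket.inet_ntoa(rec[0])     # source IPv4 address
        dst = socket.inet_ntoa(rec[1])     # destination IPv4 address
        packets, octets = rec[5], rec[6]   # dPkts and dOctets counters
        print(f"{src} -> {dst}: {packets} packets, {octets} bytes")
```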
On the very same day that I published the CLI is Not the Problem post, I stumbled upon an interesting discussion on the v6ops mailing list. It all started with a crazy idea to modify BGP to use a 128-bit router ID to help operators who think they can manually configure large IPv6-only networks without any centralized configuration/management authority assigning 32-bit identifiers to their routers.
The “Is CLI In My Way … or a Symptom of a Bigger Problem” post generated some interesting discussions on Twitter, including this one:
One would hope that we wouldn’t have to bring up this point in 2014 … but unfortunately some $vendors still don’t get it.
Every time I plug a new device into my Windows laptop and it automatically discovers the device type, installs the driver, configures the device, and tells me it’s ready for use, I wonder why we can’t get the same level of automation in networking.
Consider, for example, a well-known vSphere link failover issue: if you forget to enable portfast on server-facing switch ports, some VMs lose connectivity for up to 30 seconds every time a switch reloads.
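Until the network can figure this out on its own, you can at least automate the workaround. A quick Netmiko sketch, with placeholder device details and an assumed interface range for the server-facing ports:

```python
# Enable portfast on server-facing ports with Netmiko (pip install netmiko).
# Device details and the interface range are placeholders -- adjust to taste.
from netmiko import ConnectHandler

switch = {
    "device_type": "cisco_ios",
    "host": "10.0.0.10",        # placeholder switch address
    "username": "admin",
    "password": "admin",
}

with ConnectHandler(**switch) as conn:
    output = conn.send_config_set([
        "interface range GigabitEthernet1/0/1 - 24",  # assumed server-facing ports
        "spanning-tree portfast",   # use "spanning-tree portfast trunk" on trunk ports
    ])
    print(output)
```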
You know how hard it is to get network traffic statistics: interface counters are too coarse, NetFlow records are too granular, sFlow is sampling… life is hard for the network monitoring Goldilocks.
In the Network Monitoring video (part of the Real-Life OpenFlow Use Cases webinar) I explained an interesting alternative: hardware permitting, you could get traffic counters with every OpenFlow flow entry, giving you any granularity you need.
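As a rough illustration, this is what periodically polling per-flow-entry counters could look like in a Ryu-based controller application (a minimal sketch; the flow entries themselves would be installed elsewhere):

```python
# Minimal Ryu app that polls per-flow-entry counters every 10 seconds.
# The flow entries themselves are assumed to be installed by another app.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.lib import hub
from ryu.ofproto import ofproto_v1_3


class FlowStatsPoller(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.datapaths = {}
        hub.spawn(self._poll)

    @set_ev_cls(ofp_event.EventOFPStateChange, MAIN_DISPATCHER)
    def _register(self, ev):
        # Remember every switch that reaches the MAIN state
        # (a production app would also remove dead datapaths).
        self.datapaths[ev.datapath.id] = ev.datapath

    def _poll(self):
        while True:
            for dp in self.datapaths.values():
                # Ask each switch for the counters of all its flow entries
                dp.send_msg(dp.ofproto_parser.OFPFlowStatsRequest(dp))
            hub.sleep(10)

    @set_ev_cls(ofp_event.EventOFPFlowStatsReply, MAIN_DISPATCHER)
    def _stats_reply(self, ev):
        for stat in ev.msg.body:
            self.logger.info("match=%s packets=%d bytes=%d",
                             stat.match, stat.packet_count, stat.byte_count)
```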