network management « ipSpace.net blog

Wednesday, March 12, 2025 07:51 +0100

Identify Changes in Router Configurations

If you’ve ever had to manage and configure more than a few routers in a production environment, there probably was a moment when you had to figure out what changes were made to a device configuration.

Answering that question seems to be an easy task; after all, device configurations are just text files:

Periodically collect device configurations and store them somewhere (shared disk, database, or source code repository like Git).
Whenever you have to figure out what changed, run a utility like diff to identify changes in text files.
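
Here's a minimal sketch of the second step, assuming the two snapshots are already available as text. It uses Python's standard difflib module instead of the diff utility; the function name and the sample configurations are made up for illustration.

```python
import difflib


def config_diff(old_text, new_text, old_name="yesterday", new_name="today"):
    """Return the unified diff between two configuration snapshots as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


# Hypothetical snapshots of the same device taken a day apart
old = """hostname r1
interface Loopback0
 ip address 10.0.0.1 255.255.255.255
"""
new = """hostname r1
interface Loopback0
 ip address 10.0.0.2 255.255.255.255
"""

diff = config_diff(old, new)
print("\n".join(diff))
```

Lines starting with "-" exist only in the older snapshot, lines starting with "+" only in the newer one.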

Log Changes to Router Configurations

Whenever you’re faced with an “unexpected” network outage that doesn’t seem to be caused by a hardware failure, the root cause often tends to be a change in a device configuration, raising these questions:

What changes were made to the device configuration?
When were the changes made?
Who made them?
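
One way to answer the "when" and "who" questions is to look for configuration-change events in the device logs. The sketch below assumes a Cisco IOS-style %SYS-5-CONFIG_I syslog message; the sample log line, the user name, and the function name are illustrative, and other platforms use different message formats.

```python
import re

# Assumed message layout: "<timestamp>: %SYS-5-CONFIG_I: Configured from <source> by <user> ..."
CONFIG_EVENT = re.compile(
    r"^(?P<when>.+?): %SYS-5-CONFIG_I: Configured from (?P<source>\S+) by (?P<who>\S+)"
)


def parse_config_event(line):
    """Return when/source/who for a configuration-change message, or None for other lines."""
    match = CONFIG_EVENT.match(line)
    return match.groupdict() if match else None


# Illustrative log line, not taken from a real device
sample_line = (
    "Mar 12 07:51:02.123: %SYS-5-CONFIG_I: "
    "Configured from console by admin on vty0 (192.0.2.10)"
)
event = parse_config_event(sample_line)
print(event)
```

The log message tells you that someone changed the configuration, not what they changed; you still need configuration snapshots to answer the first question.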

Worth Exploring: Akvorado Flow Collector and Visualizer

The results you can get when you know how to apply proper glue to a bunch of open-source tools never cease to amaze me. The latest entrant in that category: Akvorado, a NetFlow/IPFIX collector and analyzer by Vincent Bernat.

Some of the sample graphs (shown in the GitHub repo) are not far off from those that knocked our socks off during the first Kentik Networking Field Day presentation. Definitely a tool worth exploring ;)

Sunday, March 21, 2021 08:06 UTC

Worth Reading: Splitting the Ping

I hope you’re aware that the venerable ping (and most of its variants) measures round-trip time – how long it takes to get to the destination and back – but is there a way to measure one-way latency or find out asymmetric transit times?

Ben Cox found a way to use ICMP timestamps together with reasonably accurate NTP-derived time to do just that. More details in Splitting the ping (HT: Drew Conry-Murray).
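
The arithmetic behind the idea can be sketched in a few lines. An ICMP timestamp reply carries three timestamps (originate, receive, transmit) expressed in milliseconds since midnight UT; combined with the local arrival time they split the round trip into two one-way legs. The function name and the sample values are assumptions, and this is not necessarily how Ben Cox's tool is implemented.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def _delta(later_ms, earlier_ms):
    """Signed difference between two ms-since-midnight values, tolerating midnight wrap."""
    diff = (later_ms - earlier_ms) % MS_PER_DAY
    return diff - MS_PER_DAY if diff > MS_PER_DAY // 2 else diff


def split_ping(originate_ms, receive_ms, transmit_ms, arrival_ms):
    """Split a round trip into (forward, remote processing, return) times in milliseconds.

    originate_ms: local clock when the request was sent
    receive_ms:   remote clock when the request arrived
    transmit_ms:  remote clock when the reply was sent
    arrival_ms:   local clock when the reply arrived
    """
    forward = _delta(receive_ms, originate_ms)
    processing = _delta(transmit_ms, receive_ms)
    backward = _delta(arrival_ms, transmit_ms)
    return forward, processing, backward


# Made-up timestamps for illustration
print(split_ping(1000, 1012, 1013, 1043))
```

The two one-way numbers are only as good as the clock synchronization between the hosts: any offset between the clocks is added to one leg and subtracted from the other, and a negative result means the offset is larger than the transit time.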

Friday, December 18, 2020 06:36 UTC

Streaming Telemetry with Avi Freedman on Software Gone Wild

Remember my rant about how “fail fast, fail often sounds great in a VC pitch deck, and sucks when you have to deal with its results”? Streaming telemetry is no exception to this rule, and Avi Freedman (CEO of Kentik) has been on the receiving end of this gizmo long enough to have to deal with several generations of experiments… and formed a few strong opinions.

Unfortunately Avi is still a bit more diplomatic than Artur Bergman – another CEO I love for his blunt statements – but based on his NFD16 presentation I expected a lively debate, and I was definitely not disappointed.

Enjoy the podcast

Friday, June 12, 2020 06:34 UTC

SuzieQ with Dinesh Dutt and Justin Pietsch on Software Gone Wild

In early May 2020 I wrote a blog post introducing SuzieQ, a network observability platform Dinesh Dutt worked on for the last few years. If that blog post made you look for more details, you might like Episode 111 of Software Gone Wild in which we went deeper and covered these topics:

How does SuzieQ collect data?
What data is it collecting from network devices?
What can you do with that data?
How can you customize and extend SuzieQ?

Listen to the podcast

Thursday, June 4, 2020 08:53 UTC

Interesting: Measuring End-to-End Latency in Web Browser

CloudFlare launched yet another service: transfer speed and latency measurements done from a web browser. While it’s pretty obvious how you could measure transfer speed (start an asynchronous transfer, register for the JavaScript onreadystatechange event to notice when it has completed, and compute the transfer rate), measuring latency seems like a bit of black magic. After all, you can’t do a ping from a web browser, can you?

What If... There Would Be an Easy Way to Run Your Network

Imagine a life where you would be able to…

Find all interfaces that have VRRP configured but no useful VRRP neighbor;
Find all OSPF adjacencies that should be up but are not;
Get an alert every time the default IP route is lost;
Find all MTU mismatches in your network;
List all VXLAN-to-VLAN mappings across your data center, and find if two different VLANs map into the same VXLAN VNI;
Compare IP routes in your data center to those you had yesterday;
Verify that IP routing tables on all spine switches contain the same prefixes;
Do the same comparison before and after a software upgrade;
Identify changes in IP routing tables or ARP tables that happened between yesterday evening and this morning;

… and be able to do all that in a multi-vendor environment without writing tons of Ansible playbooks or Python code.
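
For a sense of what the hand-written alternative looks like, here's a minimal sketch of just one item from that list: comparing today's IP routes with yesterday's. It assumes the routing tables were already collected and parsed into prefix-to-next-hop dictionaries, which is usually the hard part; the function name and the sample data are hypothetical.

```python
def compare_routes(before, after):
    """Compare two routing-table snapshots given as {prefix: next_hop} dictionaries."""
    before_prefixes = set(before)
    after_prefixes = set(after)
    return {
        "added": sorted(after_prefixes - before_prefixes),
        "removed": sorted(before_prefixes - after_prefixes),
        # Prefix present in both snapshots, but pointing to a different next hop
        "changed": sorted(
            p for p in before_prefixes & after_prefixes if before[p] != after[p]
        ),
    }


# Hypothetical snapshots of the same device
yesterday = {
    "0.0.0.0/0": "192.0.2.1",
    "10.1.0.0/16": "192.0.2.2",
    "10.2.0.0/16": "192.0.2.3",
}
today = {
    "10.1.0.0/16": "192.0.2.2",
    "10.2.0.0/16": "192.0.2.9",
    "10.3.0.0/16": "192.0.2.3",
}

report = compare_routes(yesterday, today)
print(report)
if "0.0.0.0/0" in report["removed"]:
    print("ALERT: default route lost")
```

Multiply that by every check in the list and every vendor-specific output format, and the appeal of a ready-made tool becomes obvious.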

Using Elastic Stack in Networking and Security

Andrea Dainese is continuing his journey through open-source NetDevOps land. This time he decided to focus on log management systems, chose Elastic Stack, and wrote an article describing what it is, why a networking engineer should look at it, and what’s the easiest way to start.

Saturday, April 18, 2020 08:32 UTC

Interesting: Easy Analytics with Elastic Stack

Adrian Giacometti described how he used Elastic Stack (ELK) to build a dashboard for his integration tests and network logs.

Maybe it’s time to build our own network monitoring systems from open-source components instead of paying vendors big bucks for slick PowerPoint slides.

Monday, January 7, 2019 08:31 +0100

Large Layer-2 Domains Strike Again…

I started January 2018 blogging with a major service provider failure. Why should 2019 be any different? Here’s what CenturyLink claimed was causing a two-day outage (more comments here).

Supposedly it was a problem with the management network used by their optical gear, but it looks a lot like a layer-2 network spanning 15 data centers and no control-plane policing on the managed devices… proving yet again that large-scale layer-2 networks are a really bad idea.

Streaming Telemetry: View from the Trenches

I asked David Gee to review my streaming telemetry blog posts to make sure I didn’t make too many blunders, and he sent me a nice summary of his view on the topic in return.

The only thing I could do after reading it was to ask him for permission to do a copy-paste. Here it is:

Streaming Telemetry Standards: So Many to Choose From

Continuing the Streaming Telemetry saga, let’s focus on presentation formats and transport mechanisms.

I already mentioned three presentation formats: XML (used by NETCONF), JSON (used by RESTCONF) and Protocol Buffers (used by gRPC). Two of them are text-based, the third one (Protocol Buffers) is a binary encoding not unlike ASN.1 BER used by SNMP. That can’t be good in a JSON-hyped world, right?
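
To get a feeling for the difference between text-based and binary encodings, the sketch below encodes the same three counters as XML, JSON, and a fixed-layout binary record. The binary part uses Python's struct module as a stand-in; it is not Protocol Buffers (which uses tagged fields and variable-length integers), and the counter names and values are made up.

```python
import json
import struct
import xml.etree.ElementTree as ET

# Hypothetical interface counters
sample = {"ifIndex": 3, "inOctets": 123456789, "outOctets": 987654321}

# Text-based encodings: field names travel with every message
as_json = json.dumps(sample).encode()

root = ET.Element("counters")
for name, value in sample.items():
    ET.SubElement(root, name).text = str(value)
as_xml = ET.tostring(root)

# Binary stand-in: one 32-bit and two 64-bit unsigned integers in network byte order.
# The field layout lives in a schema shared by sender and receiver, not in the message.
RECORD = struct.Struct("!IQQ")
as_binary = RECORD.pack(sample["ifIndex"], sample["inOctets"], sample["outOctets"])

for label, payload in (("XML", as_xml), ("JSON", as_json), ("binary", as_binary)):
    print(f"{label}: {len(payload)} bytes")

# The receiver needs the same schema to make sense of the binary payload
decoded = dict(zip(sample, RECORD.unpack(as_binary)))
```

The binary record is the smallest of the three and the cheapest to parse, but it's unreadable without the schema, which is the same trade-off SNMP made with ASN.1 BER.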

Model-Driven Telemetry Isn’t as New as Some People Think

During the Campus Evolution with Cat9K presentation (I hope I got it right - the whole event was an absolute overload) the presenter mentioned the benefits of brand-new model-driven telemetry, which immediately caused me to put my academic hat on and state that we’ve had model-driven telemetry for at least 30 years.

Don’t believe me? Have you ever looked at an SNMP MIB description? Did it look like random prose to you or did it seem to have some internal structure?

Brief Recap: Tech Field Day at Cisco Live Europe 2018

I don’t think I’ve ever been at a Tech Field Day event that’s been as intense as what we went through in the last few days at Cisco Live Europe – at least 17 different presentations in two days. It’s still all a blur and will take a long while to sort out.

First impressions:

Category: network management