Blog Posts in August 2009
Someone really wants to hear my opinion on SCTP (RFC 4960); he’s added a “what about SCTP” comment to several Internet-related posts I wrote in the last few weeks. So, here are my totally unqualified (I have no hands-on experience) thoughts about SCTP.
Let me reiterate: I’m taking a 30,000-foot perspective here and whatever I’m writing could be completely wrong. If that’s the case, please point out my mistakes in your comments.
From a distance, the protocol looks promising. It provides datagram (unreliable message), reliable message (record) and stream transport. Even more important, each connection can run across multiple IP addresses on each endpoint, providing native support for scalable IP multihoming (where each multihomed host resides in multiple PA address blocks from various Service Providers).
It looks like we’re bound to experience a widespread BGP failure once every few months. They all follow the same pattern:
- A “somewhat” undertested BGP implementation starts advertising paths with an “unexpected” set of attributes.
- A specific downstream BGP implementation (and it could be a different implementation every time) a few hops down the road hiccups and sends a BGP notification message to its upstream neighbor.
- The BGP session must be reset following a notification message; the routes advertised over it are lost and withdrawn, causing widespread ripples across the Internet.
- The offending session is reestablished seconds later and the same set of routes is sent again, causing the same failure and a session reset. If the session stays up long enough, some of the newly received routes might get propagated and will flap again when the session is reset.
- The cyclical behavior continues until someone intervenes manually.
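The failure pattern above can be sketched as a toy simulation. This is only an illustration of the loop, not a model of any real BGP implementation: the `Session` class, the `is_malformed` flag and the `fix_after` parameter are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy model of one BGP session (hypothetical, for illustration only)."""
    routes: list                      # (prefix, is_malformed) pairs sent by the upstream
    accepted: list = field(default_factory=list)
    resets: int = 0

    def establish(self):
        """Replay all routes; return True if the session survives."""
        self.accepted = []
        for prefix, is_malformed in self.routes:
            if is_malformed:
                # The receiver sends a notification message: the session is
                # reset and every route learned over it is withdrawn.
                self.accepted = []
                self.resets += 1
                return False
            self.accepted.append(prefix)
        return True


def run_until_fixed(session, max_attempts, fix_after=None):
    """Re-establish the session repeatedly.

    fix_after (assumption): number of resets after which an operator
    intervenes and filters the offending route.
    """
    for _ in range(max_attempts):
        if fix_after is not None and session.resets >= fix_after:
            session.routes = [r for r in session.routes if not r[1]]
        if session.establish():
            return True
    return False


routes = [("192.0.2.0/24", False), ("198.51.100.0/24", True), ("203.0.113.0/24", False)]
flapping = Session(routes=list(routes))
print(run_until_fixed(flapping, max_attempts=5), flapping.resets)
```

Without the intervention the session never stays up, and the prefixes accepted before the bad update flap on every attempt.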
Clue started an interesting discussion on the NANOG mailing list. He’s inherited a network that extended its internal OSPF to its multihomed customers and wondered whether he should leave the network as it is, change OSPF to IS-IS or deploy BGP. Here are a few thoughts from my reply.
Please remember that we were discussing running global OSPF with the customer routers. Running OSPF in a VRF is a different story, as the customer cannot impact another customer’s routing (they can only burn your CPU cycles).
You might think that the lack of a decent session layer in the TCP/IP protocol suite is the main culprit for our reliance on IP multihoming and related explosion of the IP routing tables. Unfortunately, we have an even bigger problem: the Berkeley Socket API, which dates back to the early 1980s and is used in almost all TCP/IP software implementations and clients (including high-level scripting languages like Perl or Python).
If you’ve ever tried to get advanced Cisco certifications, you’ve probably encountered questions dealing with the mismatch between the end device ARP timeouts and the L2 switch CAM (MAC address cache) timeouts. If you’re still wondering what the underlying problem is (it took me a while to figure it out), read the Unicast Flooding in Switched Campus Networks document from Cisco.
In all scenarios, traffic sent to an unknown unicast MAC address causes layer-2 flooding, which can significantly reduce switch performance. Microsoft took this problem to a completely new level with its Network Load Balancing implementation: Windows servers send ARP replies containing MAC address X from MAC address Y, causing all the traffic toward the servers to be flooded, effectively turning an Ethernet switch into a hub.
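Both flooding cases can be reproduced with a toy switch model. This is a minimal sketch, not a description of any vendor's forwarding logic; the `Switch` class, the port numbers and the 300-second aging timer are assumptions chosen for the example.

```python
class Switch:
    """Toy layer-2 switch with an aging MAC address (CAM) table."""

    def __init__(self, ports, cam_timeout):
        self.ports = list(ports)
        self.cam_timeout = cam_timeout   # seconds before an entry ages out
        self.cam = {}                    # MAC -> (port, time last seen as source)

    def forward(self, src_mac, dst_mac, in_port, now):
        """Learn the source MAC, then return the list of output ports."""
        self.cam[src_mac] = (in_port, now)
        entry = self.cam.get(dst_mac)
        if entry is not None and now - entry[1] < self.cam_timeout:
            return [entry[0]]            # known unicast: a single output port
        # Unknown or aged-out destination: flood to every other port.
        return [p for p in self.ports if p != in_port]


sw = Switch(ports=[1, 2, 3, 4], cam_timeout=300)
sw.forward("B", "A", in_port=2, now=0)          # B talks once, switch learns it
print(sw.forward("A", "B", in_port=1, now=100)) # CAM entry still fresh
print(sw.forward("A", "B", in_port=1, now=400)) # B stayed silent, entry aged out
print(sw.forward("A", "X", in_port=1, now=401)) # X never appears as a source MAC
```

The third call models the timeout mismatch: host A still has B in its ARP cache and keeps sending, but the switch has already forgotten where B is. The last call models the load-balancing case, where the MAC address in the ARP reply (X) is never used as a source address, so the switch can never learn it.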
I’ve sent a link to my Filter excessively prepended AS-paths article as an answer to a BGP route-map question to the NANOG mailing list and got several interesting questions from Dylan a few hours later. As they are pretty common, you might be interested in them as well.
In my environment, we are not doing full routes. We have partial routes from AS X and then fail to AS Y. Is there any advantage for someone like me to do this, as we are not providing any IP transit so we are not passing the route table to anyone else?
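As background for the question, the check behind filtering excessively prepended AS paths can be sketched in a few lines. This is not the router configuration from the article, only an illustration of the idea; the function names and the threshold of 5 are hypothetical.

```python
from itertools import groupby


def max_prepend_run(as_path):
    """Return the longest run of one AS number repeated back to back."""
    return max((len(list(group)) for _, group in groupby(as_path)), default=0)


def accept_route(as_path, max_run=5):
    """Accept the route unless some AS is repeated more than max_run times.

    max_run=5 is an arbitrary value picked for this example.
    """
    return max_prepend_run(as_path) <= max_run


print(accept_route([65001, 65002, 65002, 65003]))   # modest prepending
print(accept_route([65001] + [65002] * 20))         # excessive prepending
```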
One of the biggest challenges facing the Internet core today is the explosion of the IP routing and forwarding tables, which is caused primarily by traffic engineering and multihoming requirements. Things were supposed to get better when IPv6 introduced strict hierarchical addressing (similar to the phone number addressing, where the first few digits always denote the country code).
Unfortunately, the hierarchical IPv6 addressing idea relied on the incredible belief that the world would shape itself according to the will of the IETF working group members. Not surprisingly, that didn’t happen and the hierarchical IPv6 addressing idea was quietly scrapped, giving us plenty more prefixes to play with when trying to pollute the global IPv6 routing tables.
The discussions following my “All-I-can-eat mentality” post have helped me get a much better understanding of the broadband access business issues. I’ve already shared some of them in a follow-up post. A few weeks later (just before leaving for my summer vacation) I tried to provide as balanced a perspective as I could manage in the “Broadband traffic management: Finding rational solutions to ease congestion” article I wrote for SearchTelecom.
My favorite yellow press outlet has decided to propagate hearsay instead of writing “original contributions” (but their mastery of creating sensationalistic titles remains unchallenged). This time, they claim that “New features embedded in Cisco IOS like VoIP and Web services can present an opportunity for hackers”.
The only supporting documentation they provide is a story in SearchSecurity with a sensationalistic title (New Cisco IOS bugs pose tempting targets, says Black Hat researcher) followed by two pages of confusion including gems like “… new deployments of Internet Protocol version 6 (IPv6) and VoIP installations may make router exploitation more vulnerable to remote attackers …” supported by “… IPv6 was considered a security threat due to the many net tunnels used to connect to IPv6 …”, which, as anyone who has some basic clue about IPv6 knows, has nothing to do with router vulnerabilities.
Fortunately, Black Hat is a serious undertaking (unlike conferences with grandiose titles like “Cyber Infrastructure Protection”) and provides its presentations online for anyone to see what the presenters were really discussing. I would strongly recommend that you check the excellent “Router Exploitation” presentation by Felix Lindner, which provides a very reasonable analysis of the current situation: the routers are exploitable (no surprise there), but it’s very hard to do. Not surprisingly, SNMP is mentioned only once, IPv6 in passing and VoIP only twice (with a good recommendation: don’t run VoIP on your core routers).
Friday’s OSPF quiz has generated numerous answers … unfortunately, many of them incorrect. Some readers (probably those who recently took a Cisco certification exam) thought I was asking a trick question, as I’d forgotten to include the IP addresses in the sample configuration, which only proves how hard it is to write good bulletproof questions.
Those that assumed the IP addresses would have to be configured on the interfaces made two common errors:
- Some assumed a type-2 LSA would be generated for the LAN interface. Wrong: a type-2 LSA is generated only if needed (there is more than one router attached to the LAN interface).
- Others thought the router would generate a type-1 LSA per interface. Wrong: an OSPF router generates only a single type-1 LSA per area.
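The two rules above can be sketched as a small counting function. It is a simplification under stated assumptions: the `Interface` fields are made up for this example, and the `is_dr` flag reflects the fact that the type-2 LSA for a LAN is originated by the designated router only.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    """Hypothetical description of one OSPF-enabled interface."""
    area: int
    kind: str                 # "lan" or "p2p"
    routers_on_segment: int   # routers attached, including this one
    is_dr: bool = False       # is this router the designated router?


def count_lsas(interfaces):
    """Return (type-1 count, type-2 count) originated by this router."""
    # One router (type-1) LSA per area, regardless of the number of interfaces.
    router_lsas = len({intf.area for intf in interfaces})
    # A network (type-2) LSA only for a LAN with more than one attached
    # router, and only from the designated router of that LAN.
    network_lsas = sum(
        1 for intf in interfaces
        if intf.kind == "lan" and intf.routers_on_segment > 1 and intf.is_dr
    )
    return router_lsas, network_lsas


# A router with a LAN interface where it is alone and one point-to-point
# link, both in area 0 (example values, not the quiz configuration).
print(count_lsas([
    Interface(area=0, kind="lan", routers_on_segment=1, is_dr=True),
    Interface(area=0, kind="p2p", routers_on_segment=2),
]))
```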
To clarify these issues, I wrote an article documenting how the type-1 (router) LSA describes various interface types and inter-router links.