Blog Posts in June 2009
Based on the ADSL reference model we discussed last week, let’s try to figure out how you can influence the quality of service over your ADSL link (for example, you’d like to prioritize VoIP packets over web downloads). To understand the QoS issues, we need to analyze the congestion points; these are the points where a queue might form when the network is overloaded and where you can reorder the packets to give some applications preferential treatment.
Remember: QoS is always a zero-sum game. If you prioritize some applications, you’re automatically penalizing all others.
The primary congestion point in the downstream path is the PPPoE virtual interface on the NAS router (marked with a red arrow in the diagram below), where the Service Provider usually performs traffic policing. It’s better from the SP perspective to police the traffic at the NAS than to send all the traffic to the DSLAM, where it would be dropped in the ATM hardware. Secondary congestion points might arise in the backhaul network (if the network is heavily oversubscribed) and in the DSLAM (if the NAS policing does not match the QoS parameters of the ATM virtual circuit).
Shahid wrote me an e-mail asking about local command authorization. He would like to perform it within the AAA model, but while AAA local authorization works, it only allows you to specify user privilege level (and autocommand), not individual commands (like you can do on a TACACS+ server).
I’m looking for a touch screen device that would work (well) with PowerPoint. I’d like to start drawing my diagrams with a pen, not with a mouse; I have a completely unfounded irrational belief that drawing with a pen might be faster and easier than using a mouse. Any (tested) ideas?
The Cisco Subnet RSS feed I’m receiving from Network World contained interesting information a few days ago: Cisco has reissued the HTTP security advisory from 2005. The 2005 bug was “trivial”: they forgot to quote the “<” character in the output HTML stream as “&lt;” and you could thus insert HTML code into the router’s output by sending pings to the router and inspecting the buffers with show buffers assigned dump (I found the original proof-of-concept exploit on the Wayback Machine). However, I’ve checked the behavior on 12.4(15)T1 and all dangerous characters (“<” and quotes) were properly quoted. So, I’m left with two explanations.
We all know that the global BGP table is exploding (see the Active BGP entries graph) and that it will eventually reach a point where the router manufacturers will not be able to cope with it via constant memory/ASIC upgrades (Note: a layer-3 switch is just a fancy marketing name for a router). The engineering community is struggling with new protocol ideas (for example, LISP) that would reduce the burden on the core Internet routers, but did you know that we could reduce the overall BGP/FIB memory consumption by over 35% (rolling back the clock by two and a half years) if only the Internet Service Providers would get their act together?
Take a look at the weekly CIDR report (archived by WebCite on June 22nd), more specifically into its Aggregation summary section. The BGP table size could be reduced by over 35% if the ISPs would stop announcing superfluous more specific prefixes (as the report heading says, the algorithm checks for an exact match in AS path, so people using deaggregation for traffic engineering purposes are not even included in this table). You can also take a look at the worst offenders and form your own opinions. These organizations increase the cost of doing business for everyone on the Internet.
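The "exact match in AS path" check can be illustrated with a few lines of Python. This is only a minimal sketch of the idea, not the algorithm the CIDR report actually runs; the function name, the sample prefixes (documentation address space) and the private-use AS numbers are all made up for the example, and it assumes an IPv4-only table.

```python
import ipaddress

def superfluous_prefixes(table):
    """Return prefixes covered by a less specific prefix with an identical AS path.

    `table` is a list of (prefix, as_path) pairs; this is a hypothetical
    input format chosen for the sketch, not a real BGP table dump format.
    """
    entries = [(ipaddress.ip_network(p), tuple(path)) for p, path in table]
    redundant = []
    for net, path in entries:
        for other, other_path in entries:
            # A more specific prefix is superfluous only if a covering
            # prefix is announced with exactly the same AS path.
            if other != net and other_path == path and net.subnet_of(other):
                redundant.append(str(net))
                break
    return redundant

table = [
    ("192.0.2.0/24", [64500, 64501]),
    ("192.0.2.0/25", [64500, 64501]),    # same path as the /24: superfluous
    ("192.0.2.128/25", [64500, 64502]),  # different path (traffic engineering): kept
    ("198.51.100.0/24", [64500, 64503]),
]
print(superfluous_prefixes(table))
```

The second /25 survives because its AS path differs from the covering /24, which mirrors the report's decision not to count deaggregation used for traffic engineering.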
Why is this behavior tolerated? It’s very simple: advertising a prefix with BGP (and affecting everyone else on the globe) costs you nothing. There is no direct business benefit gained by reducing the number of your BGP entries (and who cares about other people’s costs anyway) and you don’t need an Internet driver’s license (there’s also no BGP police, although it would be badly needed).
Fortunately, there are some people who got their act together. The leader in the week of June 15th was JamboNet (AS report archived by WebCite on June 22nd) that went from 42 prefixes to 7 prefixes.
What can you do to help? Advertise the prefixes assigned to you by your Internet Registry, not more specific ones. Check your BGP table and clean it. Don’t use more specific prefixes solely for primary/backup uplink selection.
A while ago I received an interesting question from someone studying for the CCNP certification: “I know it’s not necessary to configure clns routing if I’m running IS-IS for IP only, but isn’t IS-IS running over CLNS?”
I’ve always “known” that IS-IS uses a separate layer-3 protocol, not CLNP (unlike IP routing protocols that always ride on top of IP), but I wanted to confirm it. I took a few traces, inspected them with Wireshark and tried to figure out what’s going on.
You might be confused by the mixture of CLNS and CLNP acronyms. From the OSI perspective, a protocol (CLNP) is providing a service (CLNS) to upper layers. When a router is configured with clns routing it forwards CLNP datagrams and does not provide a CLNS service to a transport protocol. The IOS configuration syntax is clearly misleading.
It turns out the whole OSI protocol suite uses the same layer-2 protocol ID (unlike IP protocol suite where IP and ARP use different layer-2 ethertypes) and the first byte (NLPID) in the layer-3 header to indicate the actual layer-3 protocol. I was not able to find any table of layer-3 OSI protocol types, so I had to experiment with Wireshark to figure out the values for CLNP, ES-IS and IS-IS (yes, these three are distinct L3 protocols).
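The demultiplexing described above can be sketched in Python. The NLPID values and the LLC SAP below are not taken from the text; they are the values assigned in ISO/IEC TR 9577 as far as I know, so treat them as an assumption and verify them against your own Wireshark traces. The function name and the sample bytes are hypothetical.

```python
# Assumed values (ISO/IEC TR 9577), not quoted from the text above.
OSI_LLC_SAP = 0xFE  # single layer-2 protocol ID shared by the OSI suite
NLPID_NAMES = {
    0x81: "CLNP",
    0x82: "ES-IS",
    0x83: "IS-IS",
}

def classify_osi_pdu(llc_payload: bytes) -> str:
    """Identify the layer-3 OSI protocol from the first byte (NLPID)."""
    if not llc_payload:
        raise ValueError("empty payload")
    nlpid = llc_payload[0]
    return NLPID_NAMES.get(nlpid, f"unknown NLPID 0x{nlpid:02x}")

# Made-up payload fragments: only the first byte matters for the lookup.
print(classify_osi_pdu(bytes([0x83, 0x1B, 0x01])))
print(classify_osi_pdu(bytes([0x81, 0x00])))
```

The point of the sketch is that IS-IS packets are recognized by their own NLPID and never pass through CLNP forwarding, which is why clns routing is not needed for IP-only IS-IS.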
I’m getting lots of ADSL QoS questions lately, so it’s obviously time to cover this topic. Before going into the QoS details, I want to make sure my understanding of the implications of the baroque ADSL protocol stack is correct.
In the most complex case, a DSL service could have up to eight separate components (including the end-user’s workstation):
- End-user workstation sends IP datagrams to the local (CPE) router.
- CPE router runs a PPPoE session with the NAS (Network Access Server) and sends Ethernet frames to the DSL modem.
- DSL modem encapsulates Ethernet frames in RFC 1483 framing, slices them into ATM cells and sends them over the physical DSL link to the DSLAM.
- DSLAM performs physical level concentration and sends the ATM cells (one VC per subscriber) into the network.
- The backhaul network (DSLAM to NAS) could be partly ATM based. The ATM cells could thus pass through several ATM switches.
- Eventually the ATM cells have to be reassembled into PPPoE frames. In a worst-case scenario, an ATM-to-Ethernet switch would perform that function.
- The backhaul network could be extended with Ethernet switches.
- Finally, the bridged PPPoE frames arrive at the NAS, which terminates the PPPoE session and emits the IP datagrams into the IP core network.
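To get a feeling for what this protocol stack costs, here is a rough Python sketch that counts the ATM cells needed to carry a single IP packet across the DSL link. The header sizes are my assumptions, not figures from the list above: PPPoE plus PPP overhead of 8 bytes, an untagged Ethernet header of 14 bytes, RFC 1483 bridged LLC/SNAP encapsulation of 10 bytes without the Ethernet FCS, and the 8-byte AAL5 trailer. Minimum Ethernet frame padding is ignored.

```python
import math

# Assumed per-layer overheads (see the lead-in); adjust to your encapsulation.
PPPOE_PPP_HEADER = 8         # 6-byte PPPoE header + 2-byte PPP protocol ID
ETHERNET_HEADER = 14         # destination MAC, source MAC, ethertype
RFC1483_BRIDGED_HEADER = 10  # LLC (3) + OUI (3) + PID (2) + padding (2), no FCS
AAL5_TRAILER = 8
ATM_CELL_PAYLOAD = 48
ATM_CELL_SIZE = 53

def atm_bytes_on_wire(ip_packet_len):
    """Return (cells, bytes) needed to carry one IP packet over the DSL link."""
    aal5_pdu = (ip_packet_len + PPPOE_PPP_HEADER + ETHERNET_HEADER
                + RFC1483_BRIDGED_HEADER + AAL5_TRAILER)
    # AAL5 pads the PDU to a whole number of 48-byte cell payloads.
    cells = math.ceil(aal5_pdu / ATM_CELL_PAYLOAD)
    return cells, cells * ATM_CELL_SIZE

for size in (40, 1492):
    cells, wire_bytes = atm_bytes_on_wire(size)
    print(f"{size}-byte IP packet: {cells} cells, {wire_bytes} bytes on the wire")
```

Under these assumptions a 40-byte TCP ACK needs two cells (106 bytes) and a 1492-byte packet needs 32 cells (1696 bytes), so small packets pay by far the highest relative overhead.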
It was (around) 1995, everyone was talking about ATM, but very few people knew what they were talking about. I was at Networkers (way before they became overcrowded Cisco Live events) and decided to attend the ATM Executive Summary session, which started with (approximately) this slide …
… and the following explanation:
As you know, a duck can swim, but it's not as fast as a fish, walk, but not run as a cheetah, and fly, but it's far from being an eagle. And ATM can carry voice, data and video.
The session continued with a very concise overview of AAL types, permanent or switched virtual circuits and typical usages, but I had already got the summary I was looking for … and I’ll remember the duck analogy for the rest of my life. Whenever someone mentions ATM, the picture of the duck appears somewhere in the background.
If you’re trying to explain something very complex (like your new network design) to people who are not as embedded into the problem as you are, try to find the one core message, make it as simple as possible, and build around it.
Swapnendu was trying to implement inter-VRF route leaking in a multi-VRF environment without using route targets. He decided to use inter-VRF static routes, but got concerned after reading the following paragraph from Cisco’s documentation:
You can not configure two static routes to advertise each prefix between the VRFs, because this method is not supported. Packets will not be routed by the router. To achieve route leaking between VRFs, you must use the import functionality of route-target and enable Border Gateway Protocol (BGP) on the router. No BGP neighbor is required
Every few months, my good friend Jeremy finds a reason to write another post against bandwidth throttling and usage-based billing. Unfortunately, all the blog posts of this world will not change the basic fact (sometimes known as the first law of thermodynamics): there is no free lunch. Applied to this particular issue:
The “GRE keepalives or EIGRP hellos” discussion has triggered another interesting question:
Is there a good rule-of-thumb for setting hold-down timers in respect to the bandwidth/delay of a given link? Perhaps something based off of the SRTT?
Routing protocol hello packets or GRE keepalive packets are small compared to the bandwidths we have today and common RTT values are measured in milliseconds while the timers' granularity is usually in seconds.
A few months ago, a small ISP was able to disrupt numerous BGP sessions in the Internet core by prepending over 250 copies of its AS number to the outbound BGP updates. While you should use the bgp maxas-limit command to limit the absolute length of AS-path in the inbound updates, you might also want to drop all excessively prepended BGP paths.
The Filter excessively prepended BGP paths article in the CT3 wiki describes the AS-path access list you can use to drop any BGP prefix that has more than X consecutive copies of the same AS number.
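The check performed by such a filter is easy to express outside the router as well. The following Python sketch is not the AS-path access list from the wiki article; it only illustrates the same test on an AS path represented as a list of AS numbers. The function names and the threshold of five copies are hypothetical.

```python
def max_consecutive_repeats(as_path):
    """Return the length of the longest run of identical AS numbers."""
    longest = run = 0
    previous = None
    for asn in as_path:
        run = run + 1 if asn == previous else 1
        longest = max(longest, run)
        previous = asn
    return longest

def accept_path(as_path, max_repeats=5):
    """Accept the path unless some AS appears more than max_repeats times in a row."""
    return max_consecutive_repeats(as_path) <= max_repeats

print(accept_path([64500, 64501, 64501, 64501]))  # moderate prepending
print(accept_path([64500] + [64501] * 250))       # excessive prepending
```

Only consecutive copies are counted, so an AS that legitimately appears in two separate places of the path is not penalized.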
It looks like everyone who’s not using DMVPN is running IPSec over GRE these days, resulting in interesting questions like »should I use EIGRP hellos or GRE keepalives to detect path loss?«
Any dedicated link/path loss detection protocol should be preferred over tweaking routing protocol timers (at least in theory), so the PC answer is »use GRE keepalives and keep EIGRP hellos at their default values«.
BFD would be the perfect solution, but it's not working over GRE tunnels yet ... and based on its past deployment history in Cisco IOS, years will pass before we'll have it on the platforms we usually deploy at remote sites.
I had an interesting debate with an engineer who wanted to use TFTP between a router and a server reachable through an outside interface. He realized that he needed to configure (application-level) TFTP packet inspection for router-generated traffic, but unfortunately Cisco IOS does not support this particular combination.
His query prompted me to read the TFTP RFC, which clearly documents that the data packets sent by the server are coming from a different UDP port number (thus the need for application-level inspection). The results of my tests are available in the TFTP server protection with Context-Based Access Control (CBAC) article.
A few days ago, City College of New York hosted the “Cyber Infrastructure Protection Conference”, including a keynote speech by Krishnan Sabnani who described a “new class of denial-of-service (DOS) attacks that threaten wireless data networks” … or so Network World claims in its article.
The conference web site is only accessible through an IP-address-only URL http://220.127.116.11/ (which immediately triggered suspicions in my browser) and the presentations are not available on-line, so I cannot comment on what Mr. Sabnani actually told the participants, but the summary provided by Network World is 80% hot air. Here’s their list of “five wireless data network threats outlined by Sabnani”:
I got the following question from Matthew: »how would one go about matching the default route for filtering using standard ACLs?«
In all routing protocols but EIGRP (which can carry the »default candidate« flag on any IP prefix), the default route has IP address 0.0.0.0 and subnet mask 0.0.0.0.
To match the default route with a standard ACL, use access-list x permit 0.0.0.0. To match it with an extended ACL (which matches the IP address and the subnet mask portions), you have to use access-list y permit ip host 0.0.0.0 host 0.0.0.0. And finally, to match the default route in a prefix list, use ip prefix-list z permit 0.0.0.0/0.
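The difference between the three filters is easier to see when they are modeled side by side. The Python sketch below is only a model of the matching logic, not of IOS internals; the function names are hypothetical. It assumes that a standard ACL used as a route filter compares only the network address, while the extended ACL and the prefix list also compare the subnet mask.

```python
import ipaddress

DEFAULT = ipaddress.ip_network("0.0.0.0/0")

def standard_acl_match(route):
    # Models: access-list x permit 0.0.0.0 (address only, mask ignored)
    return route.network_address == DEFAULT.network_address

def extended_acl_match(route):
    # Models: access-list y permit ip host 0.0.0.0 host 0.0.0.0
    return (route.network_address == DEFAULT.network_address
            and route.netmask == DEFAULT.netmask)

def prefix_list_match(route):
    # Models: ip prefix-list z permit 0.0.0.0/0
    return route == DEFAULT

for prefix in ("0.0.0.0/0", "0.0.0.0/8", "10.0.0.0/8"):
    route = ipaddress.ip_network(prefix)
    print(prefix, standard_acl_match(route),
          extended_acl_match(route), prefix_list_match(route))
```

In this model the standard ACL also matches 0.0.0.0/8, which is one reason to prefer the prefix list when you need to match exactly the default route.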
Everyone studying the EIGRP details knows the “famous” composite metric formula, but the reasons behind the recommendation to keep the K values intact (or at least leave K2 and K5 at zero) and the inability of EIGRP to adapt to changing load conditions are rarely understood.
IGRP, EIGRP’s predecessor, had the same vector metric and a very similar composite metric formula, but it was a true distance vector protocol (like RIP), advertising its routing information at regular intervals. The interface load and reliability were thus regularly propagated throughout the network and so it made sense to include them in the composite metric calculation (although this practice could lead to unstable or oscillating networks).
EIGRP routing updates are triggered only by a change in network topology (interface up/down event, IP addressing change or configured bandwidth/delay change) and not by change in interface load or reliability. The load/reliability numbers are thus a snapshot taken at the moment of the topology change and should be ignored.
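For reference, here is the classic composite metric calculation as a minimal Python sketch. It follows the commonly documented formula with bandwidth scaled as 10^7 divided by the lowest path bandwidth in kbps and delay expressed in tens of microseconds; the function name, the parameter names and the exact integer rounding are my assumptions, so the results for non-default K values may differ slightly from what a router computes.

```python
def eigrp_composite_metric(min_bandwidth_kbps, total_delay_usec,
                           load=1, reliability=255,
                           k1=1, k2=0, k3=1, k4=0, k5=0):
    """Classic EIGRP composite metric (load is 1..255, reliability is 1..255)."""
    bw = 10**7 // min_bandwidth_kbps   # scaled inverse bandwidth
    delay = total_delay_usec // 10     # delay in tens of microseconds
    metric = k1 * bw + (k2 * bw) // (256 - load) + k3 * delay
    if k5 != 0:
        # The reliability term is applied only when K5 is non-zero.
        metric = metric * k5 // (reliability + k4)
    return 256 * metric

# T1 link (1544 kbps) with 20000 microseconds of total delay, default K values.
print(eigrp_composite_metric(1544, 20000))
# Same path while heavily loaded: with K2 = 0 the result does not change.
print(eigrp_composite_metric(1544, 20000, load=200))
```

With the default K values (K1 = K3 = 1, all others zero) the load and reliability never enter the result, which matches the advice to ignore those stale snapshot values.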
Sending EIGRP updates whenever there’s a significant change in load or reliability would be technically feasible, but would diminish the benefits of replacing distance vector behavior with DUAL.
You might be wondering why Cisco decided to include the load and reliability into the EIGRP vector metric. The total compatibility of IGRP and EIGRP vector metrics allowed them to implement a smooth IGRP-to-EIGRP migration strategy with automatic propagation of vector metrics in redistribution points, including the IGRP-EIGRP-IGRP redistribution scenario used in IGRP-to-EIGRP core-to-edge migrations.
A while ago I wrote a series of IP Corner articles describing the router configuration management features introduced in IOS releases 12.3(14)T and 12.4. To give you an overview of those features, I’ve prepared the Router configuration management tutorial in the CT3 wiki. It contains feature-by-feature introductions and links to relevant IP Corner articles.
A few weeks ago, a member of the NANOG mailing list asked an interesting question: is it OK for a host to have two physical interfaces in the same IP subnet (obviously with two different IP addresses)?
The follow-up discussion uncovered an interesting fact: although such a configuration is quite unusual in the modern IP world, it’s explicitly permitted by RFC 1122. The discussion also exposed numerous problems you might experience when trying to deploy this design on a Linux host as well as some misconceptions about the source IP addresses in TCP and UDP packets.
I’ve summarized the topologies defined in RFC 1122, IP addressing rules pertinent to TCP/UDP clients and servers as well as problems you might experience in the Multihomed IP hosts article in the CT3 wiki.
An anonymous reader left an interesting comment on my post Sample configuration: periodic upload of router configuration. Instead of configuring file prompt quiet to avoid prompts generated by the copy running-config URL command, he recommended using show running-config | redirect URL.
The solution is almost perfect, but includes two extra lines in the router configuration …
Current configuration : xxxxxx bytes
… that you’d better remove before using the configuration on another router. The more system:running-config | redirect URL command removes even this minor glitch and can be used in both kron commands and EEM applets.
Last week I incorrectly claimed that EIGRP cannot set the IP next-hop of an advertised route to a third-party router. That hasn’t been true since IOS release 12.3 which introduced the no ip next-hop-self eigrp as-number interface configuration command.
With the EIGRP next-hop processing enabled, the next-hop field is set in the outbound EIGRP updates if the next hop router belongs to the IP subnet of the outgoing interface. This functionality works well in NBMA networks (including DMVPN tunnels) as well as in route redistribution scenarios.
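The subnet test described above can be modeled in a few lines of Python. This is a simplified sketch of the decision, not the actual IOS implementation; the function name and the addresses are hypothetical, and returning None stands for "leave the next-hop field unset, so the receiver uses the advertising router".

```python
import ipaddress

def advertised_next_hop(route_next_hop, outgoing_interface):
    """Return the third-party next hop to advertise, or None to use the sender.

    `outgoing_interface` is the interface address with its prefix length,
    for example "10.0.0.1/24".
    """
    iface = ipaddress.ip_interface(outgoing_interface)
    next_hop = ipaddress.ip_address(route_next_hop)
    # Keep the original next hop only if it sits in the outgoing subnet
    # (and is not the advertising router itself).
    if next_hop in iface.network and next_hop != iface.ip:
        return str(next_hop)
    return None

print(advertised_next_hop("10.0.0.3", "10.0.0.1/24"))     # same subnet
print(advertised_next_hop("192.168.1.1", "10.0.0.1/24"))  # different subnet
```

In a DMVPN or other NBMA scenario the first case is what lets one spoke send traffic directly to another spoke instead of through the hub.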