
ADSL QoS basics

Based on the ADSL reference model we discussed last week, let’s try to figure out how you can influence the quality of service over your ADSL link (for example, you’d like to prioritize VoIP packets over web downloads). To understand the QoS issues, we need to analyze the congestion points: the points where a queue might form when the network is overloaded and where you can reorder packets to give some applications preferential treatment.

Remember: QoS is always a zero-sum game. If you prioritize some applications, you’re automatically penalizing all others.

The primary congestion point in the downstream path is the PPPoE virtual interface on the NAS router (marked with a red arrow in the diagram below), where the Service Provider usually performs traffic policing. From the SP perspective it’s better to police the traffic at the NAS than to send all the traffic to the DSLAM, where it would be dropped in the ATM hardware. Secondary congestion points might arise in the backhaul network (if the network is heavily oversubscribed) and in the DSLAM (if the NAS policing does not match the QoS parameters of the ATM virtual circuit).

In the upstream direction, the congestion occurs on the DSL modem – the path between the CPE and the modem (Ethernet or Fast Ethernet) is much faster than the upstream ATM virtual circuit. Secondary congestion might occur in the DSLAM or the backhaul network. The NAS usually does not police inbound traffic, as it’s assumed the DSL access network already limits the user traffic to its contractual upstream speed.

Based on the congestion analysis, it’s obvious you cannot use queuing on the CPE (marked “2” in the diagrams) to influence the ADSL QoS as you don’t control a single congestion point. You have to use traffic shaping on the CPE to introduce artificial congestion points in which the queues will form. You can then use the usual queuing mechanisms to prioritize the application traffic.

The shaping configured on the PPPoE interface on the CPE router neatly removes the congestion on the DSL modem. The backhaul network is rarely congested in the upstream direction (unless your friendly neighbors are devoted fans of P2P protocols).

When configuring the upstream shaping rate, you just have to take into account the extra overhead introduced by the PPPoE framing (not yet present in the packets shaped on the Dialer interface) and reduce the shaping rate to a value slightly below your DSL upstream speed.
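A minimal MQC sketch of such a setup follows. All numbers are examples (a hypothetical 512 kbps upstream shaped to 460 kbps, with 128 kbps reserved for VoIP); adjust them to your link speed and platform:

class-map match-all VoIP
 match ip dscp ef
!
policy-map QoS-Policy
 class VoIP
  priority 128
 class class-default
  fair-queue
!
policy-map Shape-Upstream
 class class-default
  shape average 460000
  service-policy QoS-Policy
!
interface Dialer0
 service-policy output Shape-Upstream

The child policy queues the traffic inside the artificial congestion point created by the shaper; shaping below the physical upstream speed leaves room for the encapsulation overhead.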

If your DSL configuration uses a PPPoE Dialer interface, you have to shape the traffic on the Dialer interface, not the outside Ethernet interface. The outside Ethernet interface transmits PPPoE-encapsulated IP traffic, on which you cannot use the IP-based queuing classifiers.

Assuming most of your traffic is TCP-based (or that all non-TCP traffic is prioritized), the shaping on the inside LAN interface will cause enough TCP delays to slow down the downstream TCP transmission. However, it’s harder to determine the correct shaping rate and optimize the shaping behavior when the high-priority traffic is not present; we’ll cover these issues in an upcoming post.

There is no local command authorization

Shahid wrote me an e-mail asking about local command authorization. He would like to perform it within the AAA model, but while AAA local authorization works, it only allows you to specify user privilege level (and autocommand), not individual commands (like you can do on a TACACS+ server).

One of the reasons for this behavior is the difference between exec authorization (the authorization to start the interactive session, configured with aaa authorization exec) and command authorization (the authorization to execute a particular command, configured with aaa authorization commands). While the local method can be specified in the aaa authorization commands command, it’s essentially a no-op (it always succeeds). Using the local method in the aaa authorization commands is only meaningful if you want to provide a fallback mechanism where all commands are authorized if the router cannot contact a TACACS+ server.
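The fallback scenario would look something like this (a sketch; the TACACS+ server definition is omitted):

aaa authorization commands 15 default group tacacs+ local

If the TACACS+ servers are unreachable, the local method kicks in and authorizes every command the user is otherwise allowed to execute at his privilege level.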

You can use EEM applets, command privilege levels or parser views to limit the set of commands a user can execute on a router without using TACACS+ command authorization.
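For example, command privilege levels let you expose individual commands to lower-privileged users (the username and secret below are made up):

privilege exec level 5 show running-config
username junior privilege 5 secret Junior-S3cret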

This article is part of You've asked for it series.

Help appreciated: touch-screen drawing

I’m looking for a touch screen device that would work (well) with PowerPoint. I’d like to start drawing my diagrams with a pen, not with a mouse; I have a completely unfounded irrational belief that drawing with a pen might be faster and easier than using a mouse. Any (tested) ideas?

IOS HTTP vulnerability

The Cisco Subnet RSS feed I’m receiving from Network World contained interesting information a few days ago: Cisco has reissued the HTTP security advisory from 2005. The 2005 bug was “trivial”: they forgot to quote the “<” character in the output HTML stream as “&lt;” and you could thus insert HTML code into the router’s output by sending pings to the router and inspecting the buffers with show buffers assigned dump (I found the original proof-of-concept exploit on the Wayback Machine). However, I’ve checked the behavior on 12.4(15)T1 and all dangerous characters (“<” and quotes) were properly quoted. So, I’m left with two explanations.

It’s real

Someone has discovered a really devious way of inserting HTML code that somehow bypasses the quoting process. It could be a weird Unicode encoding of the less-than character, similar to the IPS vulnerability I wrote about two years ago. I couldn’t find a feasible approach: the original attack vector (the show buffers command) drops the high-order bit from the dumped data and the IOS HTTP server properly quotes 7-bit characters, but then I’m not aware of every IOS command (including the hidden ones) that could dump buffer/memory data. I’ve even tested the 0xFF3C sequence produced by tclsh and it does not work (the 0xFF is emitted unchanged, but the 0x3C is quoted).

It’s an administrative blunder

The “Revision history” section of the advisory claims that they’ve revised the workaround section, which describes how to disable the HTTP WEB_EXEC service. If this is true, they might have updated the list of affected software and fixed IOS versions. Adding information on a feature available in 12.3T four years after the original advisory without fixing other, more relevant information is (in my opinion) pure paperwork shuffling, not to mention the scare caused by an advisory claiming there’s a security hole in all classic IOS releases.

What should you do?

To be on the safe side, you should:

  • Disable HTTP and HTTPS servers in Cisco IOS unless you absolutely need them (but you should do that anyway).

Protecting the HTTP server with an ACL does not help, as the exploit works through the administrator’s browser.

Disabling the WEB_EXEC service will break SDM.

  • Use dedicated browser sessions when accessing the router. Start a new copy of the browser (or even better, a different browser), go to the router, do what you have to do and close all browser windows before accessing anything else, including links in your e-mail.
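Disabling both HTTP servers takes two global configuration commands:

no ip http server
no ip http secure-server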

Last but not least, you could disable individual commands with EEM applets (if only Cisco would provide a complete list of vulnerable commands). For example, the following applet will disable all variants of the show buffers command with the dump option:

event manager applet WebDeny
 event cli pattern "show buffers.*dump" sync no skip yes

Internet anarchy: I’ll advertise whatever I like

We all know that the global BGP table is exploding (see the Active BGP entries graph) and that it will eventually reach a point where the router manufacturers will not be able to cope with it via constant memory/ASIC upgrades (Note: a layer-3 switch is just a fancy marketing name for a router). The engineering community is struggling with new protocol ideas (for example, LISP) that would reduce the burden on the core Internet routers, but did you know that we could reduce the overall BGP/FIB memory consumption by over 35% (rolling back the clock by two and a half years) if only the Internet Service Providers got their act together?

Take a look at the weekly CIDR report (archived by WebCite on June 22nd), more specifically its Aggregation summary section. The BGP table size could be reduced by over 35% if ISPs stopped announcing superfluous more-specific prefixes (as the report heading says, the algorithm checks for an exact match in AS path, so people using deaggregation for traffic engineering purposes are not even included in this table). You can also take a look at the worst offenders and form your own opinion. These organizations increase the cost of doing business for everyone on the Internet.

Why is this behavior tolerated? It’s very simple: advertising a prefix with BGP (and affecting everyone else on the globe) costs you nothing. There is no direct business benefit gained by reducing the number of your BGP entries (and who cares about other people’s costs anyway) and you don’t need an Internet driver’s license (there’s also no BGP police, although it would be badly needed).

Fortunately, there are some people who got their act together. The leader in the week of June 15th was JamboNet (AS report archived by WebCite on June 22nd), which went from 42 prefixes to 7 prefixes.

What can you do to help? Advertise the prefixes assigned to you by Internet Registry, not more specific ones. Check your BGP table and clean it. Don’t use more specific prefixes solely for primary/backup uplink selection.
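Advertising only the assigned prefix usually boils down to a single network statement backed by a static route to null (the AS number and prefix below are hypothetical):

router bgp 64500
 network mask
!
ip route Null0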

Autocommands in AAA environment

A reader who prefers to remain anonymous has reported an interesting observation: autocommands configured on local usernames do not work after configuring aaa new-model.

I’ve immediately suspected that the problem lies in the granularity of the AAA mechanisms and a quick lab test proved it: the username/password check is configured with the aaa authentication login configuration commands, whereas the autocommand feature belongs to the EXEC authorization and has to be configured separately with the aaa authorization exec command.

The following configuration can be used if you want to use local usernames and autocommands within the AAA framework (add TACACS+/RADIUS servers as needed):

aaa new-model
aaa authentication login default local 
aaa authorization exec default local
username local password 0 local
username test password 0 test
username test autocommand show ip route

This article is part of You've asked for it series.

IS-IS is not running over CLNP

A while ago I received an interesting question from someone studying for the CCNP certification: “I know it’s not necessary to configure clns routing if I’m running IS-IS for IP only, but isn’t IS-IS running over CLNS?”

I’ve always “known” that IS-IS uses a separate layer-3 protocol, not CLNP (unlike IP routing protocols that always ride on top of IP), but I wanted to confirm it. I took a few traces, inspected them with Wireshark and tried to figure out what’s going on.

You might be confused by the mixture of CLNS and CLNP acronyms. From the OSI perspective, a protocol (CLNP) is providing a service (CLNS) to upper layers. When a router is configured with clns routing it forwards CLNP datagrams and does not provide a CLNS service to a transport protocol. The IOS configuration syntax is clearly misleading.

It turns out the whole OSI protocol suite uses the same layer-2 protocol ID (unlike the IP protocol suite, where IP and ARP use different layer-2 ethertypes) and the first byte (NLPID) in the layer-3 header to indicate the actual layer-3 protocol. I was not able to find any table of layer-3 OSI protocol types, so I had to experiment with Wireshark to figure out the values for CLNP, ES-IS and IS-IS (yes, these three are distinct L3 protocols).

You can find all the details (including the comparison of OSI and IP protocol stacks) in the IS-IS in OSI protocol stack article in the CT3 wiki.

This article is part of You've asked for it series.

ADSL reference diagram

I’m getting lots of ADSL QoS questions lately, so it’s obviously time to cover this topic. Before going into the QoS details, I want to make sure my understanding of the implications of the baroque ADSL protocol stack is correct.

In the most complex case, a DSL service could have up to eight separate components (including the end-user’s workstation):

  1. End-user workstation sends IP datagrams to the local (CPE) router.
  2. CPE router runs PPPoE session with the NAS (Network Access Server) and sends Ethernet datagrams to the DSL modem.
  3. DSL modem encapsulates Ethernet frames in RFC 1483 framing, slices them in ATM cells and sends them over the physical DSL link to DSLAM.
  4. DSLAM performs physical level concentration and sends the ATM cells (one VC per subscriber) into the network.
  5. The backhaul network (DSLAM to NAS) could be partly ATM based. The ATM cells could thus pass through several ATM switches.
  6. Eventually the ATM cells have to be reassembled into PPPoE frames. In a worst-case scenario, an ATM-to-Ethernet switch would perform that function.
  7. The backhaul network could be extended with Ethernet switches.
  8. Finally, the bridged PPPoE frames arrive at the NAS, which terminates the PPPoE session and emits the IP datagrams into the IP core network.

I sincerely hope no network is as complex as the above diagram. In most cases, the backhaul would be either completely ATM-based …

… or Ethernet based (when the DSLAM has Ethernet uplink interface):

The NAS could also be adjacent to DSLAM or even integrated in the same chassis.

Am I missing anything important? I know you could deploy numerous additional devices (for example, Cisco is promoting the Service Exchange Framework and Service Control Engine), but these devices would be placed deeper into the IP core.

ATM is like a duck

It was (around) 1995, everyone was talking about ATM, but very few people knew what they were talking about. I was at Networkers (way before they became overcrowded Cisco Live events) and decided to attend the ATM Executive Summary session, which started with (approximately) this slide …

… and the following explanation:

As you know, a duck can swim, but it's not as fast as a fish, walk, but not run as a cheetah, and fly, but it's far from being an eagle. And ATM can carry voice, data and video.

The session continued with a very concise overview of AAL types, permanent and switched virtual circuits and typical usages, but I already had the summary I was looking for … and I’ll remember the duck analogy for the rest of my life. Whenever someone mentions ATM, the picture of the duck appears somewhere in the background.

If you’re trying to explain something very complex (like your new network design) to people who are not as embedded into the problem as you are, try to find the one core message, make it as simple as possible, and build around it.

Inter-VRF static routes

Swapnendu was trying to implement inter-VRF route leaking in multi-VRF environment without using route targets. He decided to use inter-VRF static routes, but got concerned after reading the following paragraph from Cisco’s documentation:

You can not configure two static routes to advertise each prefix between the VRFs, because this method is not supported. Packets will not be routed by the router. To achieve route leaking between VRFs, you must use the import functionality of route-target and enable Border Gateway Protocol (BGP) on the router. No BGP neighbor is required

There is no reason why inter-VRF static routes on point-to-point interfaces would not work. However … if Cisco's documentation states something is not supported, that's exactly what it is: not supported. It might work for you, it might not work on specific platforms and it might be broken in a future software release (like MPLS VPN on 1800 routers). You're using it at your own risk and if it stops working you can't even complain to the TAC (because they'll tell you it's unsupported).

Internet Socialism: All-I-can-eat mentality

Every few months, my good friend Jeremy finds a reason to write another post against bandwidth throttling and usage-based billing. Unfortunately, all the blog posts of this world will not change the basic fact (sometimes known as the first law of thermodynamics): there is no free lunch. Applied to this particular issue:

  • Any form of fixed Internet pricing is effectively an “all-you-can-eat” buffet. Such a buffet works as long as the visitors’ stomach sizes have comparable capacity. In the high-speed Internet world a torrent user can consume two or three orders of magnitude more resources than a regular user.
  • In an environment where a minority of users consumes most of the resources, you’re simply forced to treat the large consumers differently. Otherwise, you’re forcing the majority to pay for the excesses of the few and the majority will eventually revolt (which is why the big socialist experiments didn’t work).
  • Obviously you need to upgrade your network as the average use increases, but being forced to upgrade due to a few large consumers and distributing the costs across the whole customer base simply does not make sense (not to mention the fact that providing Internet connectivity is far away from being a lucrative business).

Unfortunately, the basic facts are usually obscured by controversies like companies choosing PR disasters over fixing their networks or Service Providers incompetent enough to call port scanning a DOS attack. It’s also highly unreasonable to expect the users to consume less than 5GB a month and charge for any over-the-quota traffic without any safeguards.

There are technical solutions (for example, Cisco’s SCE) that allow the Service Providers to give each user a fair share of the bandwidth (or even limit the number of TCP/UDP sessions in a time period). However, without end-users and bloggers adopting a realistic view of the world, we’re facing a lose-lose scenario. Whatever the Service Providers do, however much they might invest in their network (and charge everyone), however reasonable their throttling/capping decisions might be, the all-I-can-eat zealots will cry foul … or am I yet again completely wrong?

Recommendations for keepalive/hello timers

The “GRE keepalives or EIGRP hellos” discussion has triggered another interesting question:

Is there a good rule-of-thumb for setting hold-down timers in respect to the bandwidth/delay of a given link? Perhaps something based off of the SRTT?

Routing protocol hello packets or GRE keepalive packets are small compared to the bandwidths we have today and common RTT values are measured in milliseconds while the timers' granularity is usually in seconds.

OSPF and IS-IS support for fast hellos is an exception, but you wouldn’t want to use this feature on a hub router with tens or hundreds of small remote sites.
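For reference, OSPF fast hellos take a single interface command (the interface name is an example); the multiplier of 4 results in four hellos per second with a one-second dead interval:

interface GigabitEthernet0/0
 ip ospf dead-interval minimal hello-multiplier 4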

You should answer the above question by asking yourself: what are my business needs for a fast switchover and how can I get there? If you’re satisfied with a switchover that takes a few (up to ten) seconds, you can achieve it with keepalive/hello packets. If you need a faster switchover, you will have to do serious routing protocol tuning or use MPLS TE fast reroute.

Filter excessively prepended BGP paths

A few months ago, a small ISP was able to disrupt numerous BGP sessions in the Internet core by prepending over 250 copies of its AS number to the outbound BGP updates. While you should use the bgp maxas-limit command to limit the absolute length of AS-path in the inbound updates, you might also want to drop all excessively prepended BGP paths.
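The maxas-limit part is a one-liner (the AS number and the limit value are arbitrary examples):

router bgp 64500
 bgp maxas-limit 50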

The Filter excessively prepended BGP paths article in the CT3 wiki describes the AS-path access list you can use to drop any BGP prefix that has more than X consecutive copies of the same AS number.

GRE keepalives or EIGRP hellos?

It looks like everyone who’s not using DMVPN is running IPSec over GRE these days, resulting in interesting questions like »should I use EIGRP hellos or GRE keepalives to detect path loss?«

Any dedicated link/path loss detection protocol should be preferred over tweaking routing protocol timers (at least in theory), so the PC answer is »use GRE keepalives and keep EIGRP hellos at their default values«.

BFD would be the perfect solution, but it’s not working over GRE tunnels yet ... and based on its past deployment history in Cisco IOS, years will pass before we have it on the platforms we usually deploy at remote sites.

The reality is a bit different: although EIGRP hellos and GRE keepalives use small packets that are negligible compared to today's link bandwidths, enabling GRE keepalives introduces yet another overhead activity. On the other hand, the GRE keepalive overhead is local to the router on which you’ve configured them (the remote end performs simple packet switching), whereas both ends of the tunnel are burdened with frequent EIGRP hello packets.

If you need to detect the path loss on the remote sites (to trigger the backup link, for example), GRE keepalives are the perfect solution. EIGRP timers are left unchanged and the overhead on the central site is minimal.
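Enabling GRE keepalives takes a single interface command (the addresses below are made up; keepalive 10 3 sends a keepalive every 10 seconds and declares the tunnel down after three missed replies):

interface Tunnel0
 ip address
 keepalive 10 3
 tunnel source GigabitEthernet0/1
 tunnel destination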

If your routing design requires the central site to detect link loss, there’s not much difference between the two methods. However, due to the intricacies of the EIGRP hello protocol, improving neighbor loss detection on the central site requires hello timer tweaking on the remote sites. It’s probably easier to configure GRE keepalives on the central site routers than to reconfigure all remote sites.

Last but not least, do not forget that GRE keepalives do not work under all circumstances.

This article is part of You've asked for it series.

TFTP server protection with CBAC

I had an interesting debate with an engineer who wanted to use TFTP between a router and a server reachable through an outside interface. He realized that he needed to configure (application-level) TFTP packet inspection for router-generated traffic, but unfortunately Cisco IOS does not support this particular combination.

His query prompted me to read the TFTP RFC, which clearly documents that the data packets sent by the server are coming from a different UDP port number (thus the need for application-level inspection). The results of my tests are available in the TFTP server protection with Context-Based Access Control (CBAC) article.

Read the whole article in the CT3 wiki

This article is part of You've asked for it series.

New wireless DOS attacks? … Maybe not.

A few days ago, City College of New York hosted the “Cyber Infrastructure Protection Conference”, including a keynote speech by Krishnan Sabnani who described “new class of denial-of-service (DOS) attacks that threaten wireless data networks” … or so the Network World claims in its article.

The conference web site is only accessible through an IP-address-only URL (which immediately triggered suspicions in my browser) and the presentations are not available on-line, so I cannot comment on what Mr. Sabnani actually told the participants, but the summary provided by Network World is 80% hot air. Here’s their list of “five wireless data network threats outlined by Sabnani”:

  1. DOS attacks on Mobile IP. Possible. I don’t know enough about Mobile IP to comment on this item.
  2. Battery drain on mobile phones triggered by continuous stream of packets sent by an intruder. Hilarious.
  3. Peer-to-peer applications. Some Service Providers get real problems (and PR headaches) from them, but classifying them as a “new class of DOS attack” is creative.
  4. Malfunctioning cards. So 1990’s (OK, we were fighting low-cost Ethernet NICs then).
  5. Excessive port scanning. So what? This is news?

It looks like some 3G Service Providers have only now started to grasp the intricacies of the environment we had to live in for the last 15 years. Welcome to Internet. It’s fast, it’s cheap, it’s ubiquitous, but not always nice.

As for the source of this ingenious list, we’ll probably forever wonder: was it really presented at the conference or was it another journalistic success?

Quick tip: Matching default route in a standard ACL

I've got the following question from Matthew: »how would one go about matching the default route for filtering using standard ACLs?«

In all routing protocols but EIGRP (which can carry the »default candidate« flag on any IP prefix), the default route has the IP address and the subnet mask

To match the default route with a standard ACL, use access-list x permit To match it with an extended ACL (which matches the IP address and the subnet mask portions), you have to use access-list y permit ip host host And finally, to match the default route in a prefix list, use ip prefix-list z permit
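Collected as configuration commands (the ACL numbers and the prefix-list name are arbitrary):

access-list 10 permit
!
access-list 101 permit ip host host
!
ip prefix-list DefOnly permit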

This article is part of You've asked for it series.

EIGRP load and reliability metrics

Everyone studying the EIGRP details knows the “famous” composite metric formula, but the reasons behind the recommendation to keep the K values intact (or at least leave K2 and K5 at zero) and EIGRP’s inability to adapt to changing load conditions are rarely understood.

IGRP, EIGRP’s predecessor, had the same vector metric and a very similar composite metric formula, but it was a true distance vector protocol (like RIP), advertising its routing information at regular intervals. The interface load and reliability were thus regularly propagated throughout the network, so it made sense to include them in the composite metric calculation (although this practice could lead to unstable or oscillating networks).

EIGRP routing updates are triggered only by a change in network topology (interface up/down event, IP addressing change or configured bandwidth/delay change) and not by change in interface load or reliability. The load/reliability numbers are thus a snapshot taken at the moment of the topology change and should be ignored.

Sending EIGRP updates whenever there’s a significant change in load or reliability would be technically feasible, but would diminish the benefits of replacing distance vector behavior with DUAL.

You might be wondering why Cisco decided to include the load and reliability into the EIGRP vector metric. The total compatibility of IGRP and EIGRP vector metrics allowed them to implement smooth IGRP-to-EIGRP migration strategy with automatic propagation of vector metrics in redistribution points, including the IGRP-EIGRP-IGRP redistribution scenario used in IGRP-to-EIGRP core-to-edge migrations.

Router configuration management

A while ago I wrote a series of IP Corner articles describing the router configuration management features introduced in IOS releases 12.3(14)T and 12.4. To give you an overview of those features, I’ve prepared the Router configuration management tutorial in the CT3 wiki. It contains feature-by-feature introductions and links to relevant IP corner articles.

Multihomed IP hosts

A few weeks ago, a member of the NANOG mailing list asked an interesting question: is it OK for a host to have two physical interfaces in the same IP subnet (obviously with two different IP addresses)?

The follow-up discussion uncovered an interesting fact: although such a configuration is quite unusual in the modern IP world, it’s explicitly permitted by RFC 1122. The discussion also exposed numerous problems you might experience when trying to deploy this design on a Linux host as well as some misconceptions about the source IP addresses in TCP and UDP packets.

I’ve summarized the topologies defined in RFC 1122, IP addressing rules pertinent to TCP/UDP clients and servers as well as problems you might experience in the Multihomed IP hosts article in the CT3 wiki.

This article is part of You've asked for it series.

Avoid the prompts generated by the COPY command

An anonymous reader left an interesting comment on my post Sample configuration: periodic upload of router configuration. Instead of configuring file prompt quiet to avoid prompts generated by the copy running-config URL command, he recommended using show running-config | redirect URL.

The solution is almost perfect, but includes two extra lines in the router configuration …

Building configuration...
Current configuration : xxxxxx bytes

… that you’d better remove before using the configuration on another router. The more system:running-config | redirect URL command removes even this minor glitch and can be used in both kron commands or EEM applets.
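A kron sketch of the periodic upload (the policy names, the schedule and the TFTP server address are made up):

kron policy-list SaveConfig
 cli more system:running-config | redirect tftp://
!
kron occurrence SaveConfigDaily at 23:00 recurring
 policy-list SaveConfig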

Final correction: EIGRP next hop processing

Last week I incorrectly claimed that EIGRP cannot set the IP next-hop of an advertised route to a third-party router. That hasn’t been true since IOS release 12.3, which introduced the no ip next-hop-self eigrp as-number interface configuration command.

With the EIGRP next-hop processing enabled, the next-hop field is set in the outbound EIGRP updates if the next hop router belongs to the IP subnet of the outgoing interface. This functionality works well in NBMA networks (including DMVPN tunnels) as well as in route redistribution scenarios.
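For example (the EIGRP AS number and the interface name are hypothetical):

interface Tunnel0
 no ip next-hop-self eigrp 1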

The details are described in the EIGRP next-hop processing article in the CT3 wiki.

Thanks again to alvarezp for pointing out my mistake.