Your browser failed to load CSS style sheets. Your browser or web proxy might not support elliptic-curve TLS

Building network automation solutions

6 week online course

Start now!

Online presentations: technical details

Danny sent me an interesting question about my “Market trends in Service Provider networks” presentation: “I have never attended or seen this sort of webinar event, is this purely web based with video or will it seem like a WebEx type meeting presentation?

Obviously “webinar” was a wrong name. I hate “e-learning” (tells you about as much as “cloud computing”), “Internet-based lecture” sounds awful, and I obviously chose yet another meaningless buzzword.

Here’s what’s going to happen: it will be a live presentation delivered over the Internet using WebEx with one-way voice delivered through VoIP. You’ll see the slides and additional markup that I’ll put on them, but not my talking head … not (only) because I look ugly but (primarily) because the talking head just eats the bandwidth.

read more see 3 comments

ITU and Internet billing: It’s a déjà-vu all over again

My friend Stretch alerted me to an article published by BBC news which reports that “an EU cyber security expert” told “a House of Lords committee” (wow, that’s the perfect body to deal with Internet issues) that the proposal submitted by Chinese to an ITU-T study group required “modifications to BGP” which would “threaten the stability of the entire Internet”.

Regardless of whatever the original proposal has been, the information has been distorted, twisted, adapted, abstracted and misunderstood so many times before being published that it’s impossible to figure out what exactly has been going on. The account of an eyewitness (sitting in the Kampala talks in September) doesn’t tell much more.

read more see 2 comments

Send a SNMP trap from an EEM applet

The engineer who wanted to detect specific DoS attack (WAN link overload) with EEM applet asked for something more in his original question: he wanted to receive a SNMP trap on the NMS when the DoS attack is detected. Implementing this requirement with an EEM applet is simple; you just need to add the trap keyword to the event manager applet configuration command.


EEM-SNMP integration is described in the Embedded Event Manager (EEM) workshop. You can attend an online version of the workshop; we can also organize a dedicated event for your networking team.

read more see 1 comments

Next-generation IP services

A while ago I’ve created a short presentation describing modern IP- and web-based services. It describes the application-layer topics I’ve been focusing on in the last few years, from cloud computing to web-based applications. I've tried to keep it simple enough that someone without the prior knowledge of the field would not get lost after two slides, but still far away from high-level marketing nonsense (you can get plenty of that anywhere else). Today I finally found some time to spend on the paperwork and wrote the description of the Next-generation IP Services tutorial.

see 3 comments

IPv6 is not ready for residential deployment

The main driver for IPv6 deployment is the IPv4 address space exhaustion, caused primarily by fast growth of residential users.

Each residential user needs an IP address, a small company doesn’t need anything more and even a reasonably-sized company can survive with a few IP addresses.

One would expect the vendor readiness to follow this pattern, but the situation is just the opposite: while the enterprise networking devices have pretty good IPv6 support (Data Center components from some vendors are a notable exception), the vendors serving the residential market don’t care.


The Service Provider-related IPv6 challenges are covered in my Market trends in Service Provider networks workshop. You can attend a web-based tutorial version or we can organize a dedicated workshop event for your team.

read more see 6 comments

HQF: truly hierarchical queuing

After doing the initial tests of the HQF framework, I wanted to check how “hierarchical” it is. I’ve created a policy-map (as before) allocating various bandwidth percentages to individual TCP/UDP ports. One of the classes had a child service policy that allocated 70% of the bandwidth to TCP and 30% of the bandwidth to UDP (going to the same port#), with fair queuing being used in the TCP subclass.

Short summary: HQF worked brilliantly.

read more see 4 comments

Help appreciated: Best time for web-based sessions

I’m planning to deliver my workshops/tutorials as web-based sessions in the next year. One of the very important aspects (apart from having good content at a reasonable price) is the timing: I’m trying to adapt to the preferences of as many potential visitors as possible.

Could you help me figure out what the best time would be? Let me know what works for you by participating in this poll (but please do so only if you plan to attend one of the sessions).

Add comment

Book review: Cisco Routers for the Desperate

If you happen to be one of those “universal engineers” tasked with configuring a Cisco router just because you deployed a web site yesterday, you’re probably already searching for a book in the Dummies series. Once your desperation exceeds a certain threshold, you might consider the “Cisco Routers for the Desperate”.

The idea is great: give someone who hasn’t seen Cisco IOS CLI before enough knowledge to perform the basic tasks. The writing style is surprisingly good and the book is filled with well-explained printouts you might get from the router. Looks like a perfect book for the task … if only it wouldn’t be hopelessly outdated.

read more see 4 comments

The sorry state of our industry

One of the few benefits of having a Facebook page is the ability to view the fan statistics (Facebook calls that Insights). I’ve just looked at the gender demographics of my fans and I was astounded; nothing has changed in the last 20 years. What are we doing wrong?



see 19 comments

Load sharing 101 (with references)

It looks like my load sharing posts (as well as related IP corner and wiki articles) did not paint the whole picture; I’m always assuming the readers have a basic level of IP routing knowledge (somewhere around BSCI/CCNP) and jump into juicy details. Let’s try to fix this error and start from the beginning.

A router receives its routing information (reachability of IP prefixes) from various sources: connected IP prefixes, static routes and dynamic routing protocols. For every IP prefix, the best source (= one with the lowest administrative distance) is selected and only the route(s) from that source are included in the IP routing table.

If the best source offers multiple equal-cost routes, more than one (up to the value of maximum-paths router configuration command) can be installed in the IP routing table and used for load sharing.

Things can get tricky when you use BGP, read the Load balancing in BGP networks IP corner article for details.

read more see 6 comments

Ten steps of small LAN design

Every so often someone tries to apply the “let all be friends and love each other” mentality to LAN networks and designs a pure layer-2 switched LAN (because it’s simpler). Jay contributed a ten-step “what happens next” description in his comment to my “Lies, damned lies and product marketing” post. The steps are so hilarious I simply had to repost them:

  1. Build everything at layer 2 because "it's simpler".
  2. Scale a little.
  3. Things start breaking mysteriously. Run around in circles. Learn about packet sniffers and STP.
  4. Learn about layer 3 features in switches you already own. Start routing.
  5. Scale more.
  6. Things start breaking mysteriously. Learn about TCAMs. Start wishing for NetFlow.
  7. Redesign. Buy stuff.
  8. Scale more.
  9. VMWare jockeys start asking about bridging across the WAN.
  10. Enroll in hair loss program.
see 1 comments

Lies, damned lies and product marketing

Greg Ferro’s “Layer-3 routing” post successfully kicked my huge sore spot: the numerous ways technical terminology is abused by product marketing gurus.

Twenty years ago, before networking became a multi-billion dollar industry, things were clear, simple and consistent: layer-2 (data-link layer) frame forwarding was bridging and layer-3 (network layer) packet forwarding was routing. Everything was crystal clear until some overly smart people tried to turn bridges into something they were not: WAN extension devices. A few large WAN networks were built with bridges … and failed spectacularly. Router vendors quickly used the opportunity to push the “routing is good, bridging is bad” mantra.

read more see 17 comments

Certifications and the hiring process

My good friend Stretch wrote an interesting article about the usability of certifications in the hiring process. I can’t agree more with everything he wrote about certifications, it nicely summarizes the various topics Greg Ferro and myself wrote about during the last year (please note: I’m not claiming Stretch was in any way influenced by our thoughts, anyone seriously considering the current certification processes has to come to the same conclusions).

Regrettably, I have to disagree with most of his alternative approach (although some of the ideas are great). It would work in an ideal world, but faces too many real-life obstacles in this one.

read more see 9 comments

iSCSI and SAN: two decades of rigid mindsets

Greg Ferro asked a very valid question in his blog: “Why does iSCSI use TCP as the underlying transport protocol”? It’s possible to design storage-focused protocol that uses connectionless lower layers (NFS comes to mind), but the storage industry has been too focused on their past to see past the artificial obstacles they’ve set up themselves.


Just in case you’re wondering where the slide is coming from: iSCSI and SAN are covered in the Data Center 3.0 for Networking Engineers webinar (register here).

It all started in 1981 with SCSI: a standard way of connecting storage to hosts with ribbon cables. They’ve made the first mistake when they’ve decided to replace ribbon cables with fiber: instead of developing a network-oriented protocol, they replaced physical layer (cable) with another physical layer and retained whatever was running on top of the ribbon cable (SCSI command protocol). If they wanted this approach to work, the new transport infrastructure had to have similar characteristics to the old one: almost no errors, no lost frames and guaranteed bandwidth. And thus Fiber Channel was born.

read more see 5 comments

IPv6 in the campus but not in the Data Center?

Cisco has recently published two excellent design guides: Deploying IPv6 in Campus Networks and Deploying IPv6 in Branch Networks. As expected from the engineers writing Cisco’s design documents, these guides contain tons of useful information and good recommendations; they’re a highly recommended reading if you’ve started considering IPv6 deployment in your enterprise network. These design guides are part of Design Zone for Branch and Design Zone for Campus.

IPv6 deployment issues are just one of the topics covered in the Enterprise IPv6 Deployment workshop. You can attend an online version of the workshop or we can organize a dedicated event for your team.

read more Add comment

Let me help you with your budget

I don’t know how your financial year works out; in Central Europe most companies’ financial year coincides with the calendar year. November and December are thus crazy planning/budgeting months, with pointy-haired bosses (myself included) trying to reduce expenses. Training is frequently one of the most exposed items and getting training funds outside of the budget is usually close to Mission: Impossible. Lesson: plan well what you want to spend in the next year.

Have I mentioned that I’m creating a number of WebEx-delivered workshops (we can also arrange on-site events)? Most of them are designed as high-level overview presentations (for whatever reason people think I can create a presentation that’s not too technical but still stays far away from High-Level-Bulls**t) and I would love to do a few really in-depth ones (assuming I can get enough participants). If you want to attend them in the next year, use the following NTE budgetary pricing:

  • €30 ($45)/hour/participant (two attendees @ 2-hour presentation = € 120);
  • €200 ($300)/hour for a private on-line workshop (up to 10 participants).
Add comment

Telepresence across half the globe (and over the Internet)

A few days ago we ran an interesting test: a Telepresence session from Slovenia (somewhere in Central Europe for those of you that cannot keep track of all the emerging minicountries) to US West Coast across the Internet. The session crossed numerous Service Providers and at least one VPN tunnel, but the quality was still amazingly good … you can watch it in this YouTube video.

see 6 comments

Two Day IPv6 Technical Workshop

We’re running a two day IPv6 technical workshop next week. Its contents are equivalent to the IPv6 Fundamentals course with “a bit” longer days and without the lab exercises. It’s ideal for those that need to understand the IPv6 technology but don’t yet need the hands-on experience.

Quick links: Course contents and registration.

read more Add comment

Dual-stack PPP requires two separate sessions?

A while ago a senior Service Provider network designer told me that they have serious issues with IPv6 deployment as IPv6 requires a separate PPPoE session from the CPE devices which significantly increases their licensing costs. The statement really surprised me; PPP was designed to be a multi-protocol environment and it’s very easy to configure IPv4 and IPv6 over a single PPP session in Cisco IOS. I think I might have tracked down the source of this “information” to the 6deploy IPv6 and DSL presentation which states on Page 11 that “Separate PPP sessions are established between the Subscriber’s systems (or CPE) and the BBRAS for IPv6 and IPv4 traffic”.

You will find an overview of the Service Provider-related IPv6 challenges in my Market trends in Service Provider networks workshop (register for the online webinar). Advanced backbone designs and configurations are explained in the Building IPv6 Service Provider core workshop (register for the online webinar).

Being too Cisco-centric, I cannot figure out whether this claim has any merit, as it clearly does not apply to Cisco IOS. Are there really boxes out there that are so stupid that they cannot run two protocols across a single PPPoE session?

see 6 comments

“ip ospf mtu-ignore” is a dangerous command

Two years ago I wrote about the problems caused by MTU mismatch between OSPF neighbors, and warned that the ip ospf mtu-ignore interface configuration command that supposedly solves the problem could cause significant headaches. Last week’s challenge was a simple illustration of what could happen if you force OSPF neighbors to establish a session even though their interface MTUs don’t match (the very first comment correctly identified the issue).

read more see 9 comments

Who needs IPv6?

One of the most common questions asked by our enterprise customers is “Who needs IPv6?” Since IPv6 does not add any significant new functionality (apart from larger address space), you can’t gain much by deploying it in an enterprise network … unless you’re huge enough that the private IPv4 address space (RFC 1918) becomes too confining for you. A good case study is Halliburton; you’ll find the details in Global IPv6 Strategies: From Business Analysis to Operational Planning book (my review).


The need for IPv6 deployment is one of the topics discussed in the Enterprise IPv6 Deployment workshop. You can attend an online version of the workshop or we can organize a dedicated event for your team.

read more see 5 comments

Detect short bursts with EEM

Last week I’ve described how you can use EEM to detect long-term interface congestion which could indicate denial-of-service attack. The mechanism I’ve used (the averaged interface load) is pretty slow; using the lowest possible value for the load-interval (30 seconds) it takes almost a minute to detect a DOS attack (see below).

If you want to detect outbound bursts, you can do better: you can monitor the increase in the number of output drops over a short period of time.


Obviously you cannot use this mechanism to detect inbound (potentially DOS-related) floods as the drops occur on the Service Provider’s edge router.

Various polling and averaging options, including in-depth discussion of hysteresis, are covered in the Embedded Event Manager (EEM) workshop. You can attend an online version of the workshop or we can organize a dedicated event for your networking team.

read more see 3 comments

Challenge: OSPF neighbor changing state from DOWN to DOWN

I’ve had an interesting discussion with Nicolas who optimized my OSPF neighbor loss EEM applet assuming the OSPF-5-ADJCHG message reports only OSPF neighbor state transitions from DOWN to FULL and from FULL to DOWN. I knew I'd seen stranger messages in my lab and was able to produce these ones after fumbling with OSPF configurations of two routers connected with a serial link:

*19:34:42.765: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.1.1 on Serial1/0 from →
  EXSTART to DOWN, Neighbor Down: Too many retransmissions
*19:35:42.773: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.1.1 on Serial1/0 from →
  DOWN to DOWN, Neighbor Down: Ignore timer expired

The messages are repeated approximately every three minutes (using the default OSPF timers).

Here's the challenge: what was going on and how was I able to produce these messages?

see 10 comments

HQF: intra-class fair queuing

Continuing from my first excursion into the brave new world of HQF, I wanted to check how well the intra-class fair queuing works. I’ve started with the same testbed and router configurations as before and configured the following policy-map on the WAN interface:

policy-map WAN
 class P5001
    bandwidth percent 20
    fair-queue
 class P5003
    bandwidth percent 30
class class-default
    fair-queue

The test used this background load:

ClassBackground load
P500110 parallel TCP sessions
P50031500 kbps UDP flood
class-default1500 kbps UDP flood
read more see 3 comments

Detect DOS attacks with EEM

Someone sent me an interesting question a while ago: “is it possible to detect DOS flooding with an EEM applet?” Of course it is (assuming the DOS attack results in very high load on the Internet-facing interface) and the best option is the EEM interface event detector.


Interface event detector is just one of the many topics covered in the Embedded Event Manager (EEM) workshop. You can attend an online version of the workshop or we can organize a dedicated event for your networking team.

read more see 5 comments

IPv6 in the Data Center: is Cisco ready?

With the recent Cisco’s push into the Data Center environment and all the (not so very unreasonable) fuss around IPv4 address depletion and imminent need for IPv6, I wanted to check whether an all-Cisco shop could do the first step: deploy IPv6 on Internet-facing production servers. If you follow the various design guidelines, your setup will have at least the following elements (and I bet someone from Cisco has already told you that you also need XML firewall, Ironport and WAAS appliance):


Now let’s see how well these boxes support IPv6.

I’m describing the Data Center IPv6 deployment issues in the Enterprise IPv6 Deployment workshop. The diagram above was taken straight from the workshop materials.

read more see 10 comments

First HQF impressions: excellent job

Several readers told me that the Hierarchical Queuing Framework introduced in IOS releases 12.4(20)T and 15.0 (why do I always have the urge to write 12.5?) works much better than CB-WFQ. After spending several hours trying to break HQF, I have to concur with them: Cisco’s engineers did a splendid job. However, the HQF behavior might be slightly counterintuitive to those that became too familiar with CB-WFQ.

read more see 5 comments

Solution: Bandwidth+Police actions in CB-WFQ

Most of the respondents to my last week’s challenge got it almost right. The minor (common) error was the assumption that police rate percent 50 would result in a TCP session getting 50% of the bandwidth. Eyal got that right: the TCP throughput is always significantly lower than that due to frequent drops caused by low burst sizes assumed by the police command and resulting TCP restarts (the most I was able to push through was around 90 kbps; half of the bandwidth would be 128 kbps).

Many respondents got the third case (bandwidth class, police class and default-class all active at the same time) wrong. Vaidotas was guessing in the right direction and Petr knows the correct answer, but did not want to spoil the fun. Here’s the surprising result: the bandwidth class gets almost all the bandwidth. Sometimes the TCP sessions in other classes wouldn’t even start.

read more see 5 comments

ITU: Grabbing a piece of the IPv6 pie?

ITU (the organization formerly known as CCITT) is having a bit of a relevance problem these days: its flagship technological achievements, including X.25, ISDN, ATM and SDH are dead or headed toward oblivion … and a former pariah, a group of geeks, is stealing the show and rolling out the Internet. No wonder their bureaucrats are having a hard time figuring out how to justify their existence. For years they’ve been lamenting how much they’ve contributed to the Internet (highly recommended reading for Monty Python fans) and how their precious contributions were unacknowledged. Now they came forth with a “wonderful” idea: the history of IPv4 address allocation proves that the wealthy nations and early adopters managed to grab disproportionate parts of the IPv4 address space (well, that’s true), so they made it their mission to protect the poor and underdeveloped countries in the brave new IPv6 world. In short, they want to become an independent address allocation entity (RIR). It looks like another worldwide bureaucracy is exactly what we need on top of all the other problems we have with IPv6 deployment.

read more see 4 comments

Overloaded …

Last week I’ve published two posts that deserve a follow-up/summary. Don’t worry, it’s coming. I’ve been extremely busy working on a customized version of the “Market trends in Service Provider networks” presentation that I’m delivering tomorrow … and I’ve managed to stumble across two “interesting” topics involving ITU; the first one is described in the next post and the second one needs a bit more investigation.

see 4 comments

Challenge: CB-WFQ Bandwidth+Police behavior

I have to admit I was somewhat surprised by the lab test results I’ve published in my previous CB-WFQ post. It looks like we’ve been fed misleading information about (classic) CB-WFQ behavior for years.

Don’t tell me that things are completely different with HQF implemented in IOS releases 12.4(late)T and 15.0. I know that … but 95+% of the installed base do not use those releases.

Let’s see whether you can figure out what my next lab test results showed. I’ve been running three parallel TTCP sessions on ports 5001, 5002 and 5003 across a 256 kbit point-to-point link. Here’s the relevant part of my router configuration:

read more see 11 comments

Off-topic: universal engineers

Decades ago when I was still in high school and working on a programming project during the summer break, an IT old-timer gave me the following bit of advice: “Remember, God created professions so that everyone could do the job he’s qualified to do”. It took me years before I understood what he had been trying to tell me, but this seems to be an industry-wide disease. Judging by some of the e-mails I receive a lot of people who are proficient in other IT specialties think they can configure the routers with a little help of good uncle Google and free support from fellow bloggers.

It seems the “ability” of a “generic” IT employee to tackle any problem somewhat related to IT is also unique to our industry. Last week a woodworker was installing my kitchen and flatly refused to connect the electric cable of the ceramic cooktop to the wall outlet citing potential liabilities (please remember: I’m not living in US but in Central Europe). An HTML programmer asked to configure the enterprise firewall might not be so reluctant. Why do you think some people in our industry believe they are universal engineers?

see 18 comments

CB-WFQ misconceptions

Reading various documents describing Class-Based Weighted-Fair-Queueing (CB-WFQ) one gets the impression that the following configuration …

class-map match-all High
 match access-group name High
!
policy-map WAN
 class High
  bandwidth percent 50
!
interface Serial0/1/0
 bandwidth 256
 service-policy output WAN
!
ip access-list extended High
 permit ip any host 10.0.3.1
 permit ip host 10.0.3.1 any

… allocates 128 kbps to the traffic to/from IP host 10.0.3.1 and distributes the remaining 128 kbps fairly between conversations in the default class.

I am overly familiar with weighted fair queuing (I was developing QoS training for Cisco when WFQ just left the drawing board) and was thus always wondering how they manage to implement that behavior with WFQ structures. A comment made by Petr Lapukhov re-triggered my curiosity and prompted me to do some actual lab tests.

The answer is simple: CB-WFQ does not work as advertised.

read more see 29 comments

Another successful IPv6 presentation

Last week I’ve delivered another very successful IPv6 Deployment in Enterprise Networks presentation. It’s amazing how far into the interesting details we usually get even though the presentation is purposefully a high-level one. This time we’ve been discussing the liabilities the Service Providers might get exposed to when using Large Scale NAT and whether the enterprise networks using HTTP proxies for Internet access could avoid the migration to IPv6.

If you’re working in a large enterprise network and think your IT team could benefit from this presentation or associated IPv6 workshop, get in touch with me.

see 1 comments

Book review: Foundation of Green IT

If you want to understand the real impact of the recent Data Center hype without getting pulled into the technology morass the vendors so copiously spread in their white papers, read the Foundation of Green IT book published by Prentice Hall. Its author, Marty Poniatowski, uses two case studies to illustrate enormous savings that can be realized through server and storage consolidation. I loved the first half of the book: the author avoids the technology issues (I loved the introduction to RAID: “I do not cover RAID background … the Internet has a wealth of information on RAID”) and uses real-life data gathered in actual project to illustrate the savings. Each case study has several chapters, ranging from starting point discovery through implementation plans and ROI analysis; exactly what you need if you’re considering going down the data center redesign path. The “Desktop Virtualization” and “Data Replication and Disk Technology Advancements” chapters are thrown in for good measure.

The author makes the server and storage consolidation case studies even more interesting by describing actual products/solutions and inserting screenshots of actual reports throughout the text. Not surprisingly, he’s describing what he knows best: HP servers, EMC storage and VMware virtualization; a clear indication how far Cisco has to go to win the hearts and minds of the data center market.

read more see 2 comments

Hierarchical Queueing Framework: queue limits and output drops

The QoS behavior in Cisco IOS has changed significantly with the introduction of the Hierarchical QoS Framework (HQF) in IOS release 12.4(20)T. Cisco is slowly producing in-depth articles describing the changes; the first one I’ve found documents the old and new output queue limits and output drops.

I’ve also added the link to this article to the Further reading section of my Queuing principles in Cisco IOS article.

see 1 comments

IPv6 labs are no longer available on Cisco’s Partner Education Connection

Csaba asked me about the availability of our IPv6 remote labs on Partner Education Connection. He wrote:

I was happy when I found this information, but after a while I realized that IPv6 labs are no more available on PEC site. Do you know any details why this topic has been removed from PEC or it can be that IPv6 is still there just I couldn't find it?

Cisco has removed almost all external remote labs from PEC in October. To increase the confusion, some of them might still appear in the catalog, but you cannot start them. The only thing we could do was to inform the users landing on our web site that the labs are no longer available and that they could buy them from our product catalog.

Add comment

IPv6-capable or IPv6-ready: is it enough?

During the IPv6 summit in Slovenia I’ve participated in a roundtable organized by our Ministry of Higher Education, Science and Technology. One of my points was that the government should require true IPv6 support in all its IT procurement processes to promote IPv6 adoption (I have to admit I’ve borrowed a few ideas from Geoff Huston’s “Is the Transition to IPv6 a Market Failure?” article) … and one of the participants (coming from the Service Provider industry) answered that “that’s common hygiene”. I’m not so sure.

Topics like this are covered in my Enterprise IPv6 Deployment workshop. Learn more about my workshops from my web site.

read more see 5 comments

Followup: What’s wrong with the Zone-Based Firewalls book

I’d like to thank all the readers that took time and responded to my question about the failure of my Deploying Zone-Based Firewalls book. The sad short conclusion is: while everyone would love to have an electronic copy of the book, the technology and the mindsets are simply not ready yet. Here are the details:

read more see 1 comments

What’s your take on Alcatel-Lucent IP+Optical integration?

Approximately a month ago Alcatel-Lucent launched Converged Backbone Transformation (are they sharing marketing wizards with Cisco … or is the excessive hype an industry-wide phenomenon?): a visionary(?) convergence of IP and optical technologies. If you haven’t heard about it yet, you could try to start with the IDC report published on Alcatel-Lucent’s web site (I’m always amazed how some people manage to tell so little in so many words).

Once you get past the fluff to the details, it could be that they're implementing a lot of common-sense. For example, it looks like the lambda-level grooming replaces GBIC/SFP transceivers with something that can generate multiple lambdas on the router and feed these lambdas directly into the DWDM gear. In my understanding, it replaces the GE port-GBIC-fiber-GBIC-GE port-lambda generation-DWDM chain with the shorter and cheaper GE port-lambda GBIC-fiber-lambda port-DWDM chain (obviously, I might be completely wrong; it’s hard to deduce the details from a press release).

Anyhow, I’d really appreciate your thoughts on this launch. Does it make sense? How does it compare to what Cisco and Juniper are doing? Is this a move in the right direction … or is Alcatel-Lucent playing a catch-up and trying to cover it with a grand marketecture?

see 7 comments

IOS packaging: Moore’s Law Won

Great news: Cisco launched a new series of midrange routers on Tuesday. They're very probably great products (I wouldn't expect less from Cisco). Also as expected, their marketing department couldn’t help itself (yet again) and had to position the launch as a universe-changing event: this time they Revealed the Borderless Network and spent loads of money producing “viral videos”. OK, maybe their average customer is stupid enough to fall for those tricks; I’m positive you’re not … so let’s see what’s really new (here's what Cisco admits is new after you've got past all the marketing fluff):

read more see 8 comments

Report interface loss based on OSPF neighbor loss

Nicolas sent me an interesting problem: he has numerous point-to-point GRE-over-IPSec tunnels on his core router and detects remote site failure with OSPF neighbor loss events. He would like to receive an e-mail when an OSPF neighbor goes down (quite easy to do with EEM), but would also like to receive interface description in the e-mail subject to simplify his troubleshooting.

With the regular expressions available in EEM 3.0 you can extract interface name from syslog message, execute show interface command and extract the interface description from it.

The EEM applet source code is available in the CT3 wiki

This article is part of You've asked for it series.

Add comment

Help appreciated: what’s wrong with my Zone-based Firewalls book?

A quick question for you: in two years since my Deploying Zone-based Firewalls digital short cut (marketese for downloadable PDF) was published, we’ve sold around 200 copies of it. Obviously we’re doing something wrong and I’d appreciate your opinion: is it the topic (are you using ZB firewall on Cisco IOS?), the format (would you prefer paper copy?), the platform (Cisco IOS as a firewall), pricing ($14.99 for 112 pages) or something else?

see 24 comments

What went wrong: TCP lives in the dial-up world

As expected, my “the socket API is broken” post generated numerous comments, many of them missing the point (for example, someone scolded me for quoting Wikipedia and not the official Linux documentation). I did not want to discuss the intricate technical details of the various incarnations of the API but the generic stupidity of having to deal with low-level networking details in the application.

Fabio was kind enough to provide the recommended method of using the Socket API from man getaddrinfo, effectively proving my point: why should every application use a convoluted function when all we want to do (in most cases) is connect to the server.

Patryk went even further and claimed that the socket API provides “basic functionality” and that libc is not the right place for anything more. Well, that mentality caused most of the IPv4-to-IPv6 application-related issues: obviously the applications developed before IPv6 was a serious consideration had to be rewritten because all the low-level code was embedded in the applications, not isolated in the library. A similar problem has effectively stalled SCTP deployment.

However, these are not the only problems we’re facing today. Even if the application properly implements the “try connecting to multiple addresses returned by DNS” function, the response time becomes unacceptable due to the default TCP timeout values coded in various operating systems’ TCP stacks.

For example, it takes up to three minutes for a TCP connect call to timeout on a Fedora-11 Linux distribution (the connect call aborts immediately if an intermediate router sends back an ICMP unreachable reply and the ARP timeout causes an abort in three seconds). Windows XP is slightly better; the default timeout is set at 20 seconds.

You might wonder what prompted the TCP designers to choose these exceedingly large values. TCP was designed more than 20 years ago when the analog dialup modems were commonly used to connect to the Internet. These modems could take a minute (or longer) to establish the connection and if you wanted to have a reliable TCP session setup, you had to wait significantly longer before aborting the session setup system call. The Internet has changed dramatically in the meantime, but nobody ever bothered changing the defaults.

If you want to rush and write a comment how the default can be changed, you’re yet again missing the point: we cannot implement multihomed IP hosts using more than one IP address due to the crazy default TCP timeout values. As soon as the first address becomes unreachable, the session establishment time (for an average user using out-of-box software) becomes unacceptable.

see 2 comments

IPv6 summit in Slovenia

The Slovenian IPv6 summit where I presented my views on IPv6 deployment in enterprise networks was a huge success, with over 130 attendees (not bad for a country with only 2M people) and Martin Levy from Hurricane Electric as the keynote speaker. The event (the program as translated by Google) was organized by fantastic people from the Go6 Institute. Unfortunately the go6 web site is only in Slovenian (try Google translator if you want to be highly amused or totally confused), but you can view my presentation, read the Hurricane Electric press release or browse the photos (I’m in the right-hand picture in the third row).

Someone took notes and posted them on the go6 web site. The notes for Martin’s keynote and my presentation are in English (just search for our names).

Last but not least, if you’re based in Europe and you’d like me to deliver my presentation at your local IPv6 event, send me a message.

see 1 comments

Fantastic DDoS protection: it’s getting worse

Last week I described the “beauty” I’d discovered through the NetworkWorld site: a solution that supposedly rejects DoS frames in 6 nanoseconds. Without having more details, I’ve tried hard to be objective and justify that you cannot get that performance in a best-case scenario (at least without having really expensive hardware and optimized architecture). In the meantime, one of the readers provided the name of the author of this discovery and I was able to find the original publication that was published in the Proceedings of the 2007 spring simulation multiconference by Society for Computer Simulation International.

read more see 1 comments

Follow-up: Interface default route

Judging by your comments, some of you have already faced a stupidity similar to the one I’ve described on Friday (BTW, I’ve remembered this particular debacle when receiving a Pingsta case invitation with very similar symptoms). The symptoms are well described in the comments: the CPU utilization of the ARP process increases, packet forwarding becomes sluggish and the router runs out of memory, potentially resulting in a router crash. Now let’s analyze what’s going on.

read more see 5 comments

My stupid moments: Interface default route

Years ago I was faced with an interesting challenge: an Internet customer was connected to our PE router with an Ethernet link and I did not want to include the PE router’s IP address in the default route on the CE router.

The latest IOS release in those days was probably somewhere around 11.x; none of the DHCP goodies were available.

After pondering the problem for a while, I got a brilliant idea: if I would use an interface default route, proxy-ARP would solve all my problems. This is the configuration I’ve deployed on the CE-router:

interface Ethernet 0
 description Uplink to the ISP
 ip address 10.0.1.2 255.255.255.0
!
ip route 0.0.0.0 0.0.0.0 ethernet 0

We tested this configuration in the middle of the night and it worked as expected. What do you think happened in the morning?

see 20 comments

DHCP client address change detector

In a previous post I’ve described how useless DHCP logging is when you try to detect change in DHCP-assigned IP address. Fortunately the removal of the old IP address (triggered by the DHCPNAK server response) and configuration of the new IP address (sent in the DHCPACK response) triggers a change in the IP routing table that can be detected with the new IP routing table event detector introduced in EEM 3.0 (available from Cisco IOS release 12.4(22)T).

The EEM applet that detects the changed DHCP-assigned IP address (including the initial address assignment which also triggers the syslog message) is described in the CT3 wiki.

see 2 comments

Will they ever start using their brains?

This morning I’ve discovered yet another journalistic gem. It started innocently enough: someone has announced prototype security software that blocks DDoS attacks. The fundamental idea (as explained in the article) sounds mushy: they’ve started with one-time user ID and introduced extra fields in the data packets. How can that ever scale in public deployment (which is where you’d be most concerned about a DDoS attack)?

But the true “revelation” came at the beginning of page 2: this software can filter bogus packets in 6 nanoseconds on a Pentium-class processor. Now let’s try to put this in perspective.

read more see 5 comments

SSH RSA authentication works in IOS release 15.0M

The feature we’ve begged, prayed, sobbed, yelled, screamed for has finally been implemented in Cisco IOS: public key SSH authentication works in IOS release 15.0M (and is surprisingly easy to use).

After configuring SSH server on IOS (see also comments to this post), you have to configure the ssh pubkey-chain, where you can enter the key string (from your SSH public key file) or the key’s hash (which is displayed by the ssh-keygen command).

It’s probably easier to copy/paste the public key from your id_rsa.pub file into the terminal window …

R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#ip ssh pubkey-chain
R2(conf-ssh-pubkey)#username pipi
R2(conf-ssh-pubkey-user)#key-string
R2(conf-ssh-pubkey-data)#$AAQEA6jYlf9MBskhkWov+ZOUDKun0ExQIRj1zfWA/YciO02VS  
R2(conf-ssh-pubkey-data)#$XsxM7SqNkRSQOR7y7HBMoxTHV7o+R/uS6A8/mF0A3P/ScRjct  
R2(conf-ssh-pubkey-data)#$JrNGACGaFy1njD9PrrvrU4o4hx6XDr6xVXF4sP4OCSXIn+Cp8  
R2(conf-ssh-pubkey-data)#$bCnZLmv908AeDb1Ac4nPdsn1OhCPIg6fxZjB7DvAMB8Dbr+7Y  
R2(conf-ssh-pubkey-data)#$apEbGE94luIqnBc61HsMd6JCWbQ== [email protected]     
R2(conf-ssh-pubkey-data)#exit
R2(conf-ssh-pubkey-user)#^Z

… and let the router convert it into the key hash, which is stored in the configuration:

R2#show run | section ssh
ip ssh rsa keypair-name SSH
ip ssh version 2
ip ssh pubkey-chain
 username pipi
  key-hash ssh-rsa C20B739F2645D6850C591C6A11780CB5 [email protected]

After this simple step, you can log into your router without typing the password. Finally we have a manageable way of secure remote command execution.

see 18 comments

IOS release 15.0

This is not an April 1st post: I’ve just realized that Cisco quietly released IOS 15.0M (mainstream). Haven’t tested it yet, but the images for a large variety of platforms are already available on CCO. The new features listed in the documentation include:

  • Full BFD support, including static routes, BFD-in-VRF and BFD-over-Frame Relay (next step: test it on a 2800-series router);
  • DHCP authentication;
  • DMVPN tunnel health monitoring;
  • EEM 3.1 (whatever that is, the EEM documentation hasn’t been updated yet);
  • Interaction between IS-IS and LDP;
  • BGP local convergence in MPLS VPN networks (the feature has already been available in 12.2 SRC, now it’s available on more platforms);
  • OSPF graceful shutdown and OSPF TTL security check features are available on more platforms;
  • Intra-zone traffic inspection in zone-based firewall.

It looks like (as expected) the 15.0 release is a grand merge of all previous IOS trains (with a few extra features). Good job; finally we have something new to play with :)

see 18 comments

IOS fossils: Classful OSPF redistribution

In the classful days of the Internet it made sense to limit the amount of information redistributed between the routing protocols. OSPF was always classless, but RIPv1 wasn’t … and you could get all sorts of crazy routes from RIP that would mess up the rest of your network if they would ever get redistributed into OSPF. To prevent that, Cisco’s engineers introduced the subnets option in the OSPF redistribute command.

Either the OSPF redistribute command is really old (before the distribute-list command started accepting extended ACL which could filter on the subnet mask) or someone was too dumb to use the extended ACL and Cisco had to provide an obvious workaround.

By the time Cisco implemented EIGRP and BGPv4 (IOS release 9.21, 15+ years ago), the absurdity of the classful redistribution was already obvious. These routing protocols accept whatever routes you want to redistribute and their variants of the redistribute command don’t have the subnets keyword. However, nobody ever took steps to remove this fossil from the IOS code.

I wouldn’t care if we would be talking about an obscure option like the OSPF-to-BGP redistribution tags, but the OSPF redistribute command is one of the most confusingly harmful IOS routing protocol commands (the only one getting close in my opinion is the auto-summary in EIGRP). One of the IOS releases introduced a warning that would be printed whenever you’re configuring redistribution into OSPF without the subnets keyword (great job: wouldn’t it be better to fix the problem?) …

rtr(config)#router ospf 999
rtr(config-router)#redistribute static
% Only classful networks will be redistributed

… but the warning is incorrect. Yuri Selivanov sent me an interesting observation: while the subnets don’t get redistributed without the subnets keyword, supernets do. The “classful” filter is not even working correctly (or, at the very least, it’s not doing what the IOS claims).

I am positive Cisco will never fix this problem and we’ll have to cope with this command until the last bits of IPv4 code are erased from Cisco IOS, but here are a few things they could have done:

  • Remove the classful redistribution filter from the code and make the subnets keyword obsolete. I sincerely doubt any reasonably-sized network is running classful redistribution with recent IOS code (and who cares about the network cores running on AGS+).
  • Add another option to the OSPF process that disables the filter. This option would have to be explicitly stored in the router configuration like the log-adjacency-changes OSPF configuration command (to ensure the old configurations work) but should be turned on for all new instances of OSPF routing protocols.

Undoubtedly a few people (mostly old-timers) would get confused once or twice, but the number of OSPF support cases might be “slightly” reduced.

see 1 comments

Follow-up: P-to-P router encryption

The “P-to-P router encryption” post has generated numerous comments. One of the readers suggested using dedicated Ethernet encryption devices, which is probably the best option if you’ve realized you need encryption in the network acquisition phase when there’s still some budget left (too bad the vendor recommended in the comments does not want to admit how expensive the boxes are).

However, assuming you have high-speed IPSec encryption modules and you have to implement P-to-P encryption in existing network, the only option left to you is GRE tunnel. Here’s why:

IP and MPLS are distinct protocols at the boundary between layer-2 and layer-3 (let’s forget the question whether MPLS is a layer-3 or layer-2.5 protocol at the moment). If you want to transport both of them across a shared connection, you have to include the layer-2 protocol ID (L2-PID) in the payload to ensure the receiving router is able to distinguish between incoming IP packets (which need FIB lookup) and MPLS packets (which need LFIB lookup).

As described in the previous post, IPSec encryption works only on IP packets. If you want to transport multiprotocol data (IP and non-IP traffic) across an IPSec-encrypted session, you need to encapsulate them into IP datagrams using an encapsulation method that includes L2-PID. The only such method supported in Cisco IOS is GRE.

One of the readers suggested using Virtual Tunnel Interface (VTI). The VTI is just another conceptual implementation of IPSec (using point-to-point links instead of crypto maps). It does not introduce non-IPSec encapsulation headers and thus cannot transport L2-PID and multiprotocol data. Although you can configure mpls ip on the VTI interface, it doesn’t work.

GET VPN is completely unusable for our task; it’s a scalable way of key distribution and crypto map creation. It offers no encapsulation mechanism and works only for IP traffic.

Last but not least, someone started walking down the Rube Goldberg path and proposed adding extra boxes between the P-routers and using MPSL-over-GE-over-L2TPv3-over-IPSec-over-GE-(over-VPLS). The extra devices (your Cisco AM would love this option) and different encapsulation method (L2TPv3 instead of GRE) add unnecessary complexity … unless, of course, you cannot implement hardware encryption on your P-router, in which case it’s comparable to the “dedicated Ethernet encryption devices” proposal.

see 8 comments

DHCP logging in Cisco IOS is a nightmare

One of the readers sent me an interesting question: he’d like to know the IP address of his home router (to be able to connect to it from the office), but its IP address is assigned through DHCP and changes occasionally.

I wanted to solve the problem by hooking an EEM applet onto the DHCP-6-ADDRESS_ASSIGN syslog message. No good; as it turns out, Cisco IOS generates the logging message only when a DHCP-acquired IP address is assigned to an interface without one. If the IP address is changed via DHCP, the change is not logged.

One could understand the faulty programmers’ logic if the initial address assignment would be different from the address change, but DHCP is such a simple protocol that any change in client’s IP address requires the client to enter the INIT state, so acquiring a new IP address is no different from changing an existing one. I guess they had to take special precautions not to log the address change (and ensure we have another interesting challenge to chew on).

Fortunately, the IP routing table changes after every change in interface IP address … more about that in a few days.

see 6 comments

Deploying IPv6 in Enterprise Networks

I was invited to present my views on the IPv6 deployment in enterprise networks during the local IPv6 summit. Instead of joining the cheering few or the dubious crowds, I’m trying to present a realistic view answering questions like “what do I have to do”, “when should I start” and “where should I focus my efforts”.

Here’s the outline of my presentation, any feedback, additional thoughts or insightful critique is most welcome.

Background information

Scenario: Enterprise network connected to the Internet. No need for internal IPv6 (RFC 1918 is good enough).

Question: where shall I focus my IPv6 efforts?

Facts of life:

  • IPv6 is a reality, get used to it.
  • Migration is supposed to be easy, but you will get stuck on details.
  • Start small, but start now.

Phases of public IPv6 deployment:

Phase#1: Dual stack content (starting now)

Phase#2: IPv6-only Internet clients (in a few years)

Phase#3: IPv6-only major content providers (10+ years from now)

Obviously this is just my perception of the critical milestones, as they apply to enterprise network deployment.

Proposed action plan

Phase#1 has already started, get ready for it:

  • Establish IPv6 connectivity with all the upstream providers
  • Deploy IPv6 on your public servers. Start with small, non-critical applications to get hands-on experience.
  • Change your whole DMZ into dual-stack DMZ.

As an enterprise network, you don't care about Phase #2:

  • Your content is reachable over IPv4 and IPv6
  • Interesting content is reachable over IPv4.
  • Use this time to plan your internal IPv6 deployment.

When the public content becomes available only over IPv6 (phase #3) you might be in a morass if your internal network is not yet dual-stack (you’ll have to face ugly 4to6 NAT). Deploy dual-stack throughout your network:

  • RFC1918 + 4to4 NAT
  • Public IPv6 address space
see 2 comments

Encrypting P-to-P-router traffic

Rob sent me a really good question:

I have an enterprise MPLS network. Two P routers are connected via carrier point-to-point Gigabit Ethernet and I would like to encrypt the MPLS traffic traversing the GE link. The PE-routers don't have hardware crypto accelerators, so I would like to keep the MPLS within the buildings running in cleartext and only encrypt the inter-site (P-to-P) MPLS traffic.

The only solution I could imagine would nicely fit the motto of one of our engineers: »Any time you have a problem, use more GRE tunnels« (if you have a better solution, please post it in the comments).

As far as I understand Cisco's IPSEC implementation, IOS can encrypt only IP traffic. If you want to encrypt MPLS frames forwarded by a P-router, you have to convert them back to IP ... and the only way to convert MPLS frames back into IP without losing the whole label stack which is vital to proper operation of MPLS VPN is to encapsulate them in a GRE tunnel and encrypting the MPLS-over-GRE traffic.

The "obvious" problem of this setup (apart from performance hit) is the MTU issue: the MTU on the GE link has to be high enough to support all the extra overhead on top of the 1500-byte payload. If that's not the case, you have to lower the MTU on the tunnel interface to ensure the GRE packets are not fragmented - that would kill the receiving router.

There are also "a few" intricacies of using MPLS-over-GRE/IPSec when the P-router is a 6500 or 7600. Don't ask me what they are, I'm a conceptual person and try to stay away from the specific hardware implementations, but our engineers built several large networks using this architecture and got a lot of hands-on experience. If you feel you could benefit from their expertise, get in touch with our Professional Services team.

see 11 comments

The tunneling Kool-Aid

In a comment to my Data Center Interconnect article my friend Ronald (who once certified me as a Bovine Professional) wrote:

I don't drink this Cisco Kool Aid about interconnecting data centres using an IP backbone. Rather use FC directly over DWDM instead of FCIP on MPLS.

This time I could agree with him wholeheartedly ... assuming you already have DWDM gear (or infinite budget to buy some) and you can get dark fiber when and where you need it. Unfortunately not everyone is so lucky and/or rich, so we have to compromise.

His comment also reminded me of the comments I’ve heard from SNA gurus more than 15 years ago when Cisco introduced SDLC-over-IP and later LLC2-over-IP tunneling. We’ve spent years trying to persuade them to drink the DLSw Kool-Aid … and in the end they all gave up their precious SDLC leased lines and Front-End Processors as the market realities forced them to consolidate their infrastructure and migrate to routed networks. Not to mention that most of the gear they supported (including ATM machines) got replaced by newer models supporting IP (or even having IP as the only connectivity option).

A similar story played out in the Service Provider world, where the idea of offering Frame Relay or ATM service on top of an IP backbone was a heresy … until it became cheaper to migrate the legacy services to a router-based network than to pay the maintenance fees for the Frame Relay gear.

Moral of the story: whenever you propose to tunnel a well-established protocol using a new entrant in a particular technology area, people will tell you that it’s all a marketing plot and that they prefer to stick with the tested scenario … until the business pressures force them to change their mind. The tricky part is figuring out the right time to jump: if you’re too early, you could get burnt; if you’re too late, you might have lost a fortune.

Last but not least, don’t forget that not all tunneling scenarios were worth pursuing: IP-over-APPN never really took hold.

see 7 comments

Fishing for free information: the ultimate experience

A while ago the amount of queries I’ve been receiving has reached a threshold where I felt the need to be very honest about the type of questions I will answer (after all, we’re in business of providing networking-related services and if I want to continue blogging there has to be some revenue to pay the bills). Some people don’t mind and still send me requests for free information they need to implement the projects they’re paid to do. Recently I’ve got this shopping list …

My customer is asking me to do a reverse DNS entry in the DNS for his domain. But I do not have any DNS.

  1. What is the suggested DNS server hardware for an ISP?
  2. What is the procedure to be followed to inform the rest of the Internet that I own this pool?
  3. Do I need to register my company name and the allocated pool with any of IANA accredited registrars?
  4. Shall we use a single IP for registration or will our entire IP address pool be used?
  5. If I ask the registrar to make forward and reverse entry in their root server, do I also need to make the same entries in my DNS server?
  6. Is it mandatory to maintain master and slave DNS servers from the beginning? Do I need to allocate one of the allocated public IP to these servers and inform the register to make entries for this DNS server?

This is precisely the type of work that we’re best at: helping our clients plan and design their implementation, so I’ve politely offered our professional services. This is the reply I’ve received:

I appreciate your business mind. But I do feel, you need to spare some knowledge for free of cost like this.

Go figure …

see 10 comments

Carrier Ethernet service from customer’s perspective

As the Carrier Ethernet services are becoming more popular, people are starting to wonder how to use it in a router-based network. I’ve got the following question from one of my readers:

I was wondering if it was possible to design a redundant network where the core uses L2 MPLS, the provider edge uses L2 for access but the customer edge equipment uses L3 Routers. We don't want to customer to see any STP at their routers.

Of course you can do that. There are two scenarios to consider:

(A) The Service Provider is offering point-to-point Ethernet service (pseudowire). In this case, two of the customer routers would be connected with what looks like a point-to-point Ethernet link. Usually the remote site would have just one "outside" Ethernet connection while the hub site would have numerous links bundled in a trunked (VLAN) link.

(B) The SP is offering VPLS service. In this case, all customer routers appear as being connected to the same Ethernet segment.

In both cases, the customer edge (CE) routers should treat the SP Ethernet link as a simple LAN segment, in case (A) connecting two routers, in case (B) connecting many routers.

see 2 comments

Expired DHCP lease bounces the interface

You would think that an expired DHCP lease is not a big deal for a DHCP client. Although the interface IP address is lost, you can always try to get a new address from the DHCP server.

IOS has a different opinion: when the DHCP lease expires on a router configured with ip address dhcp interface configuration command, the interface is administratively shut down and re-enabled. Here’s a sample printout taken from a router running 12.4(24)T software:

%DHCP-5-RESTART: Interface FastEthernet0/1 is being restarted by DHCP
%LINK-5-CHANGED: Interface FastEthernet0/1, changed state to administratively down
%LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to down
%LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
%LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to up
%DHCP-6-ADDRESS_ASSIGN: Interface FastEthernet0/1 assigned DHCP address 10.17.1.15, mask 255.255.255.0, hostname Client

You might wonder how you could ever end up with an expired lease when there’s a working DHCP server on the network. It’s simple if your DHCP server runs on IOS: when you clear the DHCP bindings on a router running a DHCP server, it stops responding to lease extension requests (DHCPREQUEST packets) from unknown clients.

see 2 comments

Back from vacation

Some of you might have noticed that I was somewhat quieter than usual the last few days … and I had an excellent reason: finally I’ve managed to sneak a week of climbing vacations into my schedule. There’s a fantastic eco lodge (chalet) in a nice valley between Chamonix and Geneva and they’re offering combined climbing-yoga classes.

If you’re interested in one or the other, you simply have to try them … and in the meantime you can enjoy a few pictures I’ve taken during the week.

Add comment

Display the rejected BGP routes

Jernej sent me an interesting question: “does Cisco IOS have an equivalent to the Extremeware’s show bgp neighbor a.b.c.d rejected-routes command which displays all routes rejected by inbound filters?”

Short answer: it doesn’t.

Somewhat longer explanation

If you want to display routes rejected by an inbound BGP filter, you have to store every route you ever received from the BGP neighbor, increasing the memory consumption of your BGP process. You can do that in Cisco IOS if you configure neighbor a.b.c.d soft-reconfiguration in.

Configuring per-neighbor inbound soft reconfiguration significantly increases BGP memory consumption. On top of BGP routes inserted in the main BGP table, the BGP process has to store every route received from the BGP neighbor in a neighbor-specific table (two copies of the accepted routes are needed because an inbound route-map might have changed the route attributes).

Workaround

Use this procedure to find rejected routes:

  • Within the BGP process configuration, enter neighbor a.b.c.d soft-reconfiguration in (this command might clear the BGP session).
  • Populate the per-neighbor table with the clear ip bgp a.b.c.d soft in.
  • Display the routes received from the neighbor with the show ip bgp neighbor a.b.c.d received-routes.
  • Display the routes sent by the neighbor and accepted by the inbound BGP filters with the show ip bgp neighbor a.b.c.d routes.
  • Do a diff of the two printouts (if you’ll write a short Tclsh script that does that, please feel free to send it to me or submit it in the comments).
see 3 comments

Cisco IOS is not an authoritative DHCP server

Imagine you have to move your DHCP clients to a different range within the same IP subnet. Can you do it if you run the DHCP server on a router running Cisco IOS? Sure, there’s the ip dhcp excluded-address command.

Not so fast … the ip dhcp excluded-address command does not affect the existing bindings. Sounds weird? It is. You tell the router to avoid some of the IP addresses in the DHCP pool and it will happily extend the leases for those IP addresses, it just won’t allocate new IP addresses from the excluded range. To change the IP address assignment of the existing clients, you have to clear the DHCP bindings with the clear ip dhcp binding command.

OK, so you clear the bindings. The next time the clients try to extend the lease, their requests will be rejected. Wrong – Cisco IOS is not an authoritative DHCP server. It ignores DHCPREQUEST messages coming from unknown clients in a correct subnet.

Read the DHCP client address change article in the CT3 wiki

see 1 comments

What went wrong: SCTP

Someone really wants to hear my opinion on SCTP (RFC 4960); he’s added a “what about SCTP” comment to several Internet-related posts I wrote in the last weeks. So, here are my totally unqualified (I have no hands-on experience) thoughts about SCTP.

Let me reiterate: I’m taking a 30,000-foot perspective here and whatever I’m writing could be completely wrong. If that’s the case, please point out my mistakes in your comments.

From the distance, the protocol looks promising. It provides datagram (unreliable messages), reliable message (record) and stream transport. Even more important, each connection can run across multiple IP addresses on each endpoint, providing native support for scalable IP multihoming (where each multihomed host resides in multiple PA address blocks from various Service Providers).

Second-hand evidence points to the viability of SCTP: it’s used in complex real-life signaling applications (SS7-over-IP), it’s implemented in Cisco IOS and IOS uses it for a variety of its transport needs (including high-availability solutions and reliable export of Netflow data).

However, SCTP will not solve the current IP multihoming issues (unless we’ll experience a world-wide Internet crash first). Here are just a few non-technical reasons why (if you have links to more in-depth information, please add them in the comments):

  • It was designed by the wrong working group. SCTP was a byproduct of the SIGTRAN working group which was focused on transport of PSTN signaling over IP networks.
  • It was never properly promoted. The SIGTRAN working group solved their problem and moved on.
  • It’s not shipped with Windows, which is a major showstopper as most clients that would benefit from SCTP’s IP multihoming support run on Windows.
  • Although it’s bundled in recent Linux kernels, the support files are not included in out-of-the box Linux distributions. To get it on my Fedora 11 distribution, I had to install lksctp-tools.
  • You can get libraries, source code, kernel patches and even commercial implementations of SCTP for most operating systems, but in most cases you have to do some installation and integration work. This is a great option if you want to play with the protocol, but not if you want to deploy generic applications over it.
  • Since it’s rarely used, there’s no support for it in the networking equipment. You can’t even match SCTP by name with extended access lists in Cisco IOS (you have to use its numeric protocol number). Obviously you cannot perform matches on SCTP port numbers. Passing SCTP across a firewall is a nightmare, as there’s no stateful inspection of SCTP traffic.

However, by far the biggest showstopper to SCTP adoption is the lack of session layer in TCP/IP and the broken Socket API. If you want to use SCTP with the Socket API, you have to indicate the protocol to use in the socket call, which means that every application that would benefit from SCTP support must be changed, recompiled and tested. There is no way that you could take existing applications, add SCTP support in the operating system and have a better-performing Internet as the result.

see 15 comments

Disable flapping BGP neighbors

It looks like we’re bound to experience a widespread BGP failure once every few months. They all follow the same pattern:

  • A “somewhat” undertested BGP implementation starts advertising paths with “unexpected” set of attributes.
  • A specific downstream BGP implementation (and it could be a different implementation every time) a few hops down the road hiccups and sends a BGP notification message to its upstream neighbor.
  • BGP session must be reset following a notification message; the routes advertised over it are lost and withdrawn, causing widespread ripples across the Internet.
  • The offending session is reestablished seconds later and the same set of routes is sent again, causing the same failure and a session reset. If the session stays up long enough (because the routers have to exchange approximately 300,000 routes), some of the newly received routes might get propagated and will flap again when the session is reset.
  • The cyclical behavior continues until a manual intervention.

I don’t understand why Cisco IOS allows the cyclical BGP session behavior (as it’s pretty obvious things will not get better by themselves once the same session is reset a few times), but there’s at least something we can do with Embedded Event Manager: shut down the offending neighbor after seeing the BGP-3-NOTIFICATION syslog message often enough in a short period of time.

You’ll find the EEM applet that shuts down a flapping BGP neighbor in the CT3 wiki.

see 8 comments

Do not EVER run OSPF or IS-IS with your Internet customers

Clue started an interesting discussion on the NANOG mailing list. He’s inherited a network that extended its internal OSPF to its multihomed customers and wondered whether he should leave the network as it is, change OSPF to IS-IS or deploy BGP. Here are a few thoughts from my reply.

Please remember that we were discussing running global OSPF with the customer routers. Running OSPF in a VRF is a different story, as the customer cannot impact another customer’s routing (they can only burn your CPU cycles).

Do not ever run an SPF routing protocol (OSPF or IS-IS) with your customer. They can insert anything they want into it, be it due to configuration mistake, malicious intent or third-party hijacking, and your whole network (or at least the other customers) will be affected.

Just to give you a few examples:

  • They could hijack the host route to your DNS server and spoof every other customer of yours that uses your DNS (I haven’t seen this one yet, but it’s feasible).
  • They could hijack the host route to your POP3 server and collect the usernames and passwords of your residential users (I’ve seen this in a production network, but the attack vector was not OSPF but another routing protocol).
  • Company A could hijack the host route to the web server of company B.
  • They could insert a better default route than you do and at least some of your routers will listen to them (I’ve seen this done with OSPF).
  • If they ever make a total mess and start flapping their LSAs, your whole network will be affected and all your routers will burn CPU running SPF algorithm.

If you absolutely insist on not using BGP (but then BGP is the only currently available routing protocol designed to handle routing in scenarios where the two parties don't necessarily trust each other), use RIP. It's safer than OSPF; at least you can filter the incoming updates.

I’ve also seen a Service Provider running RIP with their customer … but they were not using any filters when redistributing RIP routes into their IGP.

Numerous other respondents shared my feelings and the best summary was provided by Steve Bertrand: “if in the same sentence you read ‘my network’ and ‘customer network’, use BGP.”

see 6 comments

What went wrong: the Socket API

You might think that the lack of a decent session layer in the TCP/IP protocol suite is the main culprit for our reliance on IP multihoming and related explosion of the IP routing tables. Unfortunately, we have an even bigger problem: the Berkeley Socket API, which is over 25 years old and used in almost all TCP/IP software implementations (including the high-level scripting languages like PERL).

read more see 18 comments

Turn a switch into a hub … the Microsoft way

If you’ve ever tried to get advanced Cisco certifications, you’ve probably encountered questions dealing with the mismatch between the end device ARP timeouts and the L2 switch CAM (MAC address cache) timeouts. If you’re still wondering what the underlying problem is (it took me a while to figure it out), read the Unicast Flooding in Switched Campus Networks document from Cisco.

In all scenarios, traffic sent to unknown unicast MAC address causes layer-2 flooding, which can significantly reduce switch performance. Microsoft took this problem to a completely new level with its Network Load Balancing implementation: Windows servers send ARP replies containing MAC address X from MAC address Y, causing all the traffic toward the servers to be flooded – effectively turning an Ethernet switch into a hub.

see 3 comments

Regular expressions in EEM applets

EEM 3.0 (available in IOS release 12.4(22)T) includes support for variables, programming logic and regular expressions. Not surprisingly, the IOS documentation of these exciting features is a bit sparse, so I’ve decided to write a step-by-step EEM regular expression tutorial.

Read the Regular expressions in Embedded Event Manager applets article in the CT3 wiki

Add comment

Questions on BGP AS-path filters in non-transit networks

I’ve sent a link to my “Filter excessively prepended AS-paths” wiki article as an answer to a BGP route-map question to the NANOG mailing list and got several interesting questions from Dylan a few hours later. As they are pretty common, you might be interested in them as well.

In my environment, we are not doing full routes. We have partial routes from AS X and then fail to AS Y. Is their any advantage for someone like me to do this, as we are not providing any IP transit so we are not passing the route table to anyone else?

You could use inbound AS-path filters to protect your BGP table, but as you're not a transit AS, it does not make much sense. You might also use them to reduce the size of your BGP table, but obviously you’ve already taken care of that.

When I run the "sh ip bgp quote-regexp "_([0-9]+)_\1_\1_\1_\1_ \1_" | begin Network" I am seeing many paths that would be filtered by this access-list.

Obviously a lot of people are prepend-happy.

As I’ve stated in the original article, if prepending your AS number a few times doesn’t help, nothing will.

What happens to those networks, are they unreachable from my AS, or do I just route those networks to my upstream provider and let them deal with it?

If I understood correctly, you're using a default route toward AS Y, which means that anything that is not in your BGP table (and consequently your IP routing table) will be sent out of your AS via the default route. If you're getting the paths you're filtering from AS X that means that more traffic will go to AS Y.

This last question is a little off-topic, but relates to your access list. In the event of some kind of DOS attack coming from one of a few AS numbers (let's assume it's AS Z), what is the feasibility of using

ip as-path access-list 100 deny _([0-9]+)_\1_\1_\1_\1_
ip as-path access-list 100 deny Z
ip as-path access-list 100 permit .*

Would this have any affect at all, or would my pipe from my upstream still be congested with garbage traffic?

No. You cannot influence the inbound traffic apart from not advertising some of your prefixes to some of your neighbors or giving them hints with BGP communities or AS-path prepending. Whatever you do with BGP on your routers influences only the paths the outbound traffic is taking. What you'd actually need is remote-triggered black hole. Frank Bulk provided a number of links in this message.

see 2 comments

What went wrong: TCP/IP lacks a session layer

One of the biggest challenges facing the Internet core today is the explosion of the IP routing and forwarding tables, which is caused primarily by traffic engineering and multihoming requirements. Things were supposed to get better when IPv6 introduced strict hierarchical addressing (similar to the phone number addressing, where the first few digits always denote the country code). Unfortunately, the hierarchical IPv6 addressing idea relied on incredible belief that the world will shape itself according to the wills of the IETF workgroup members. Not surprisingly, that didn’t happen and the hierarchical IPv6 addressing idea was quietly scrapped, giving us plenty more prefixes to play with when trying to pollute the global IPv6 routing tables.

If we take a step back and ask ourselves: “why do we need IP multihoming in the first place?” the answer is simple: because the client application (layer-7 entity) connects to the IP address (layer-3 entity) of the server. If we want to have persistent sessions, the server IP address must remain reachable – the headaches are caused by tight coupling across numerous protocol layers.

The session persistence, session restarts, checkpoints and the ability to connect to multiple L3 addresses to reach the same service are the jobs of the OSI session layer … only there is none in the TCP/IP protocol stack. Twenty years ago the IP zealots were quick to point out that “if you know what you’re doing, four layers are enough, if you don’t, even seven layers won’t help you”. Guess who’s laughing now (although the laugh is somewhat bitter and utterly irrelevant).

The worst part of the story is that the IETF community had the chance to fix their design problems when they realized that IPv4 will have to be replaced by something else. Unfortunately, the only thing they could agree upon was to replace IPv4 addresses with longer IPv6 addresses while retaining the rest of the (then) 15-year old design. Some people realized years later that the whole approach is broken and tried to fix it with SHIM6, but it was too little done way too late. The SHIM6 protocols are still in drafts, there are no commercial implementations and we’re already running out of IPv4 addresses.

see 10 comments

IOS access list numbering scheme

Shane sent me a really interesting question: he was wondering why there's such a huge gap between the (numbered) extended IP ACL (100 – 199) and the extended range of standard IP ACL (1300 – 1999).

Some of you might be old enough to know that Cisco IOS supports (or used to support) around 10 different layer-3 protocols (IP being the most popular these days) and each one of them (if it was added to IOS early enough when the parser was still somewhat immature) required its own range of numbered ACL. I’ve summarized all of them in the “IOS Access List numbering scheme” article in the CT3 wiki.

Continue reading the article …

see 3 comments

Broadband traffic management

The discussions following my “All-I-can-eat mentality” post have helped me get a much better understanding of the broadband access business issues. I’ve already shared some of them in a follow-up post. A few weeks later (just before leaving for my summer vacation) I’ve tried to provide as balanced perspective as I could manage in the “Broadband traffic management: Finding rational solutions to ease congestion” article I wrote for SearchTelecom.

Read the SearchTelecom article

Add comment
Sidebar