GRE tunnel keepalives

IP-over-IP tunnels (usually GRE, commonly combined with IPSec to provide security) are frequently used when you want to transport private IP traffic over a public IP network that does not support layer 3 VPNs. If you use GRE tunnels in combination with default routing (or route summarization), you can get serious routing issues when the tunnel destination disappears but a default (or summary) route in the IP routing table still covers it: the router still sees a route toward the tunnel destination, so the tunnel interface stays up and the traffic sent into it is lost. You could work around this issue by deploying a routing protocol over the GRE tunnel (which could lead to hard-to-diagnose routing loops if you're not careful) or by using GRE keepalives, introduced in IOS release 12.2(8)T.

The implementation of GRE keepalives is amazing: the router sending the keepalive packet constructs a GRE packet that would be sent from the remote end back to itself (effectively building a GRE reply), sets the GRE protocol type to zero (to mark it as a keepalive packet) and sends the whole packet through the tunnel (effectively encapsulating the GRE reply into another GRE envelope). The receiving router strips the GRE envelope and routes the inside packet … which is the properly formatted GRE keepalive reply.
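The nesting is easier to see in bytes. Here is a minimal Python sketch of the packet layout described above, assuming basic 4-byte GRE headers with no optional fields and 20-byte IPv4 headers without options. The function names, the TTL value and the documentation-range addresses are my own illustrative choices, not anything taken from IOS.

```python
import socket
import struct

GRE_PROTO_IPV4 = 0x0800   # EtherType for IPv4, used in the outer GRE header
GRE_PROTO_KEEPALIVE = 0   # protocol type zero marks the keepalive reply
IP_PROTO_GRE = 47         # IANA protocol number for GRE


def ip_checksum(header: bytes) -> int:
    """Internet checksum (RFC 1071) over an even-length IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int, ttl: int = 255) -> bytes:
    """Minimal 20-byte IPv4 header carrying GRE (no options, no fragmentation)."""
    fields = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,               # version 4, header length 5 words
        0,                  # DSCP/ECN
        20 + payload_len,   # total length
        0,                  # identification
        0,                  # flags / fragment offset
        ttl,                # illustrative value (assumption)
        IP_PROTO_GRE,
        0,                  # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    checksum = ip_checksum(fields)
    return fields[:10] + struct.pack("!H", checksum) + fields[12:]


def gre_header(protocol_type: int) -> bytes:
    """Basic 4-byte GRE header: no flags, version 0, then the protocol type."""
    return struct.pack("!HH", 0, protocol_type)


def build_keepalive(local: str, remote: str) -> bytes:
    """Return the IP packet the local router would send toward the remote end."""
    # Inner packet: the reply, addressed as if the remote end had sent it to us.
    inner_gre = gre_header(GRE_PROTO_KEEPALIVE)
    inner = ipv4_header(remote, local, len(inner_gre)) + inner_gre
    # Outer envelope: ordinary GRE-in-IP from local to remote carrying the inner packet.
    outer_gre = gre_header(GRE_PROTO_IPV4)
    return ipv4_header(local, remote, len(outer_gre) + len(inner)) + outer_gre + inner


def decapsulate(packet: bytes) -> bytes:
    """What the receiving router does: strip the outer IP and GRE headers."""
    return packet[20 + 4:]


if __name__ == "__main__":
    packet = build_keepalive("192.0.2.1", "198.51.100.1")
    reply = decapsulate(packet)
    print("reply source:     ", socket.inet_ntoa(reply[12:16]))
    print("reply destination:", socket.inet_ntoa(reply[16:20]))
```

After decapsulation the remaining packet is sourced from the remote end and destined for the sender, so plain IP forwarding on the remote router is all it takes to deliver the reply.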

This trick allows you to implement different GRE keepalive timers on each end of the link, as each router generates and times its own keepalives while the other end merely routes the replies back. For example, the remote site might use fast keepalive timers to detect the loss of the primary link and switch over to a backup link, while the central site would use less frequent keepalive tests to detect a failed remote site (if there is a single path to the remote site, you don't care too much how quickly you detect that it's down).
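To put rough numbers on that trade-off, here is a small Python sketch. It assumes a simplified model in which a router declares the tunnel down after a configured number of consecutive unanswered keepalives, so the detection time is roughly the period multiplied by the retry count. The class name and the timer values are hypothetical; check the actual defaults and semantics on your platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepaliveTimers:
    """Keepalive settings on one end of the tunnel (hypothetical model)."""
    period_s: int   # seconds between keepalives
    retries: int    # consecutive unanswered keepalives before the tunnel goes down

    def approx_detection_s(self) -> int:
        # Simplified assumption: the path fails right after an answered keepalive,
        # and the tunnel is declared down once `retries` keepalives go unanswered.
        return self.period_s * self.retries


# Illustrative values only: fast timers at the remote site, relaxed ones centrally.
remote_site = KeepaliveTimers(period_s=3, retries=3)
central_site = KeepaliveTimers(period_s=30, retries=3)

if __name__ == "__main__":
    print("remote site detects failure in about", remote_site.approx_detection_s(), "s")
    print("central site detects failure in about", central_site.approx_detection_s(), "s")
```

The two ends can use completely unrelated values because neither router needs to know how often the other one sends its keepalives.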

Every ingenious solution has its drawbacks and this one is no exception: if the receiving router protects its own IP addresses (to stop spoofing attacks), it will drop the incoming GRE keepalive packet, because the decapsulated reply claims to come from the receiving router itself. Furthermore, a document available on Cisco's web site describes the issues of using GRE keepalives in an IPSec environment.
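The anti-spoofing clash can be sketched in a few lines of Python. The filter below is a hypothetical stand-in for whatever ACL or address-protection feature the receiving router uses, and the addresses come from the documentation ranges.

```python
import ipaddress
from typing import Iterable


def dropped_by_antispoofing(packet_src: str, own_addresses: Iterable[str]) -> bool:
    """Hypothetical filter: drop arriving packets that claim one of our own addresses."""
    src = ipaddress.ip_address(packet_src)
    return any(src == ipaddress.ip_address(addr) for addr in own_addresses)


# Addresses owned by the receiving router (illustrative).
receiver_addresses = ["198.51.100.1", "10.0.0.1"]

# The decapsulated keepalive reply is sourced from the receiver's tunnel endpoint.
keepalive_reply_src = "198.51.100.1"
# Ordinary user traffic arriving through the tunnel comes from somewhere else.
user_traffic_src = "10.1.1.20"

if __name__ == "__main__":
    print("keepalive reply dropped:", dropped_by_antispoofing(keepalive_reply_src, receiver_addresses))
    print("user traffic dropped:   ", dropped_by_antispoofing(user_traffic_src, receiver_addresses))
```

In this model the user traffic passes while the keepalive reply is discarded, so the sending router never sees an answer and eventually declares the tunnel down even though the path is fine.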

12 comments:

  1. A good use of GRE keepalives is to monitor a Metro Ethernet link between two routers. You set up a GRE tunnel with keepalives between the two Ethernet endpoints to monitor true end-to-end connectivity over the Metro Ethernet link. Keep in mind, though, that you are not sending user traffic through the GRE tunnel; you are merely using the GRE keepalive as a health indicator of the Metro Ethernet connection. Of course this will not be needed once Ethernet OAM, E-LMI, etc. have become widely available, but for the time being I find that GRE keepalives have other good uses besides tunneling traffic.
  2. That's definitely an interesting suggestion. But when you know that the end-to-end link is down, what do you do with that information? I have a few crazy ideas, but would like to hear from you first.
  3. We use standard NMS (HP OpenView, CA Spectrum, etc.) to monitor customer devices. The GRE tunnel is just another interface to these NMS systems, so if the tunnel goes down, the interface turns red and an alarm is triggered. Without this "indicator" tunnel interface we would have no way of knowing that the end-to-end path was actually down somewhere along the way. We have thought about using traps, monitoring routing neighbor logging, etc., but nothing beats the reliable tunnel interface up/down state. This method has allowed us to proactively open tickets with the Metro Ethernet provider to resolve the issue. Keep in mind that the physical Ethernet interface itself could be up/up on the customer router, so it isn't a reliable indicator.
  4. Hi ... very nice point. But how do we identify where in the path the problem is when we find that the tunnel is flapping, but all interfaces along the circuit are up (never down)? Thanks!
  5. The only tool that comes to my mind is the "traceroute" command.
  6. Hi Ivan. The problem is that the circuit is L2-based and consists of many physical hops. I have checked all the logs and found no physical problems. Thanks.
  7. If you have L2 devices in the path that you don't control, there's no way to figure out where the problem is (in a few years, you might be able to use Ethernet OAM :).
  8. Not to mention VRFs where the keepalive is inside the VRF and not in the transit VRF (or default table if that be the case)
  9. Thanks, Ivan, for the good explanation. Using an inbound ACL on the tunnel interface can defeat the keepalive if the tunneled packet is GRE and you only allow IP. Also, troubleshooting is more difficult since the tunnel interface is up/up with the ACL. For example ...

    Side A with keepalives and ACL ... up/up.
    Side B with keepalives and no ACL ... up/down.

    Neither side can ping the remote end. Remove the keepalives and everything works.

    In response to William using GRE to monitor Metro Ethernet, you could use BFD which will take down the Ethernet interface if the end-to-end connectivity fails. Thanks, Tom.
  10. I have a similar problem. The GRE tunnel was working fine until the Internet link failed. When the link came back, the GRE tunnel did not re-establish. What could be the problem?
  11. And along came IOS XE 3.1.0S which says in the release notes:

    http://www.cisco.com/c/en/us/td/docs/ios/ios_xe/3/release/notes/asr1k_rn_3s_rel_notes/asr1k_rn_3s_restrictions.html#wp3021511

    "GRE Keepalive with Tunnel Protection

    The Cisco ASR 1000 Series Router supports GRE keepalive with tunnel protection. However, the keepalive packet that is returned is not encrypted."

    A friendly TAC engineer dug this up from the ASR1K release notes when I opened a case for a pair of ISR 4451-X routers (3.10.01S), which surprisingly worked very well with GRE keepalives on a protected GRE-over-IPSec tunnel.