The tale of the three MTUs

An IOS device configured for IP+MPLS routing uses three different Maximum Transmission Unit (MTU) values:

  • The hardware MTU configured with the mtu interface configuration command
  • The IP MTU configured with the ip mtu interface configuration command
  • The MPLS MTU configured with the mpls mtu interface configuration command

The hardware MTU specifies the maximum packet length the interface can support … or at least that's the theory behind it. In reality, longer packets can be sent (assuming the hardware interface chipset doesn't complain); therefore you can configure MPLS MTU to be larger than the interface MTU and still have a working network. Oversized packets might not be received correctly if the interface uses fixed-length buffers; platforms with scatter/gather architecture (also called particle buffers) usually survive incoming oversized packets.

IP MTU is used to determine whether a non-labeled IP packet forwarded through an interface has to be fragmented (the IP MTU has no impact on labeled IP packets). It has to be lower than or equal to the hardware MTU (and this limitation is enforced). If it equals the HW MTU, its value does not appear in the running configuration and it tracks the changes in HW MTU. For example, if you configure ip mtu 1300 on a Serial interface, it will appear in the running configuration as long as the hardware MTU is not equal to 1300 (and will not change as the HW MTU changes). However, as soon as mtu 1300 is configured, the ip mtu 1300 command disappears from the configuration and the IP MTU yet again tracks the HW MTU.
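
The IP MTU tracking rules can be restated as a toy model. This is only a sketch of the behavior described in the paragraph above, not IOS code: the Interface class and its method names are made up for illustration, the default hardware MTU of 1500 is an assumption, and the post does not say what happens when the hardware MTU drops below a configured IP MTU (the sketch falls back to tracking in that case).

```python
class Interface:
    """Toy model of the hardware MTU / IP MTU relationship (hypothetical)."""

    def __init__(self, default_mtu=1500):
        self.default_mtu = default_mtu  # assumed platform default
        self.hw_mtu = default_mtu
        self._ip_mtu = None             # None means "track the hardware MTU"

    @property
    def ip_mtu(self):
        return self.hw_mtu if self._ip_mtu is None else self._ip_mtu

    def set_ip_mtu(self, value):
        # The IP MTU must not exceed the hardware MTU (enforced).
        if value > self.hw_mtu:
            raise ValueError("IP MTU cannot exceed the hardware MTU")
        # A value equal to the hardware MTU is not stored; it just tracks.
        self._ip_mtu = None if value == self.hw_mtu else value

    def set_hw_mtu(self, value):
        self.hw_mtu = value
        # Equal values: the ip mtu line disappears and tracking resumes.
        # The "greater than" case is an assumption, not covered by the post.
        if self._ip_mtu is not None and self._ip_mtu >= value:
            self._ip_mtu = None

    def running_config(self):
        lines = []
        if self.hw_mtu != self.default_mtu:
            lines.append(f"mtu {self.hw_mtu}")
        if self._ip_mtu is not None:
            lines.append(f"ip mtu {self._ip_mtu}")
        return lines


serial = Interface()
serial.set_ip_mtu(1300)
print(serial.running_config())  # ip mtu 1300 is shown
serial.set_hw_mtu(1300)
print(serial.running_config())  # only mtu 1300 remains
```

After the second step the IP MTU follows the hardware MTU again, so a later change of the hardware MTU changes the IP MTU as well.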

The MPLS MTU determines the maximum size of a labeled IP packet (MPLS shim header + IP payload size). If the overall length of the labeled packet (including the shim header) is greater than the MPLS MTU, the packet is fragmented. The MPLS MTU can be greater than the HW MTU assuming the hardware architecture and interface chipset support that (and the router will warn you that you might be getting into trouble). Similar to the ip mtu command, the mpls mtu command will only appear in the running configuration if the MPLS MTU is different from the HW MTU. However, contrary to the behavior of the IP MTU, any change in HW MTU with the mtu configuration command also resets the MPLS MTU to HW MTU.
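
The split between the IP MTU and the MPLS MTU comes down to a simple size check. The sketch below assumes the standard 4-byte MPLS label stack entry; the function name and parameters are hypothetical, and details such as the DF bit are deliberately not modeled.

```python
LABEL_BYTES = 4  # size of one MPLS label stack entry (shim header)


def needs_fragmentation(ip_length, label_count, ip_mtu, mpls_mtu):
    """Return True if a packet of ip_length bytes exceeds the relevant MTU.

    Unlabeled packets are checked against the IP MTU; labeled packets are
    checked against the MPLS MTU, with the label stack counted in the length.
    """
    if label_count == 0:
        return ip_length > ip_mtu
    return ip_length + label_count * LABEL_BYTES > mpls_mtu


# A 1500-byte IP packet with two labels (for example MPLS/VPN) is 1508 bytes.
print(needs_fragmentation(1500, 2, ip_mtu=1500, mpls_mtu=1500))  # True
print(needs_fragmentation(1500, 2, ip_mtu=1500, mpls_mtu=1508))  # False
print(needs_fragmentation(1500, 0, ip_mtu=1500, mpls_mtu=1500))  # False
```

With a two-label stack, the MPLS MTU has to be at least 8 bytes larger than the largest IP packet you want to carry unfragmented.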

The behavior described above was tested on a 3725 router running IOS release 12.4(15)T1. Although the MPLS MTU Command Changes document claims that you cannot set the MPLS MTU larger than the interface MTU from IOS release 12.4(11)T onwards, I was still able to do it in 12.4(15)T1.

11 comments:

  1. I think some of the trickiest MTU issues arise when you have a GRE tunnel with or without IPSec. This document from Cisco describes the potential issues and how you can work around them.

    http://www.cisco.com/en/US/tech/tk827/tk369/technologies_white_paper09186a00800d6979.shtml

    With businesses going global, one can expect to see more IPSec+GRE tunnels deployed everywhere.

  2. ... and that's where my investigation into various MTU issues originally started, only I had a scenario running MPLS/VPN across GRE/IPSec.

  3. I found that tcp mss adjust is a must-use command to avoid fragmentation and reassembly by the routers if IPSec and/or GRE is involved. The only time I had to set the IP MTU to 1500 inside a GRE tunnel was when my customer had a home-grown app that needed to see the entire 1500-byte packet intact.

  4. There is some additional interesting information that I summarised to the cisco-nsp list last year regarding the h/w MTU on the Cisco 7200 (with PA-FE DEC chipsets), the MPLS MTU and general MTU requirements for AToM circuits from the point of view of CE->PE and PE->P links.

    Archive here:

    http://puck.nether.net/pipermail/cisco-nsp/2006-June/031765.html

  5. I had a major issue when I tried to change the IP MTU size of a router running IPSec tunnels. The router (Cisco 2851 with 12.4(15).T4) also changed the interface MTU to 1400 (with my IP MTU 1400 command). With a firewall in between blocking ICMP, PMTUD also broke. Is this a known "feature"? I know that changing the interface MTU value would automatically change the IP MTU, but not the other way around. ???

  6. @Nasir: This is weird. Changing the IP MTU should not change the interface MTU. BTW, you might want to read the article The never-ending story of IP fragmentation.

  7. Hi,

    I'm still confused about the types of MTUs.

    Could someone please elaborate a bit more on this:

    1- Where in the packet is the IP MTU counted from?
    2- Does the interface MTU include the layer 2 header?
    If not, is the IP MTU equal to the interface MTU? If so, why do we need to change any of those values?
    3- How do I set the MPLS MTU? If possible, kindly provide an example.


    thx

  8. This was very helpful, but I'm still not entirely sure what MTU I should be using in this situation. I am configuring a pair of interfaces for EoMPLS. So in this case the payload is an Ethernet frame, and the 'outer' Ethernet frame has two MPLS labels. If my IP MTU is normally 1500, I would be encapsulating a 1514-byte frame in a 14+4+4 byte header, giving (I think) 1536 bytes on the wire. What are the correct MTU commands for this scenario?

  9. Diego Zamberlan, 10 May 2011, 16:53

    Thank you for taking the time to share your knowledge and experience.

  10. A great post. I agree that the tcp mss adjust command is particularly useful for sorting out MTU issues. Having spent a few years running MPLS, I'd also recommend setting an interface to 802.1q on FE interfaces to squeeze an extra four bytes of overhead out of the Ethernet frame. Most Cisco FE ports are fixed at 1500 bytes, so it's a usual workaround if you want to run MPLS on an FE port. If in addition you want to run MPLS VPNs on top of this (again on an FE port) you will need an additional four bytes, so you'll need to set the MPLS MTU to 1504 (or above). The problem with this is that your effective interface MTU now becomes 1496, meaning you will need good old tcp mss adjust for general traffic. This is less of a problem these days with GE ports and the ability to set the interface MTU over 1500 bytes.

  11. Hi. We've been troubleshooting some file transfer performance issues in our network and have narrowed them down, through various tests, to what we believe are the MPLS L3 links between our data center Core and Distribution switches. Currently, the core has two 10G links, each a /30 L3 link with MPLS on it and an MTU size of 1508. Looking at the stats, we see hundreds of millions of giants on the switch interfaces. These are 6509s with SUP-720 3Bs and 8-port 10G blades, by the way. All 6509s, cards and IOS versions match between cores and dists. Application traffic is not affected, but any type of file transfer does appear to be. We are thinking that the 1508 MTU size for MPLS (no specific IP or MPLS MTU settings, just HW) is creating performance issues for these transfers, for example WinSCP (tcp 22), NetApp, Oracle, CIFS, etc. Has anyone run into this or have any recommendations?



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.