What Happened to “Be Conservative in What You Do”?

A comment by Pieter E. Smit on my vSphere Does Not Need LAG Bandaids post opened yet another can of worms: vSphere behavior on uplink recovery.

Short summary: vSphere starts using an uplink as soon as its physical layer becomes operational, which might happen while the ToR switch is still starting up, or before the ToR switch port enters the forwarding state.

Many devices verify higher-layer availability before they start to use a point-to-point physical network link. These mechanisms could operate at layer-2 (example: LACP or UDLD) or above it – from DHCP-based address allocation to the blocking, listening, learning, and forwarding state transitions of STP and the explicit adjacencies formed by most modern routing protocols. Using a link just because your hardware detects a carrier signal is often too risky, particularly for a device that claims to be a switch with tens of servers behind it.
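To make the gap between "carrier detected" and "port forwarding" concrete, here is a minimal sketch of the classic STP timeline. It assumes legacy 802.1D timers with the default 15-second forward delay, a port that enters the listening state as soon as the carrier comes up, and no edge-port (PortFast-style) or rapid spanning tree optimizations. The function names are made up for illustration.

```python
from enum import Enum


class PortState(Enum):
    LISTENING = "listening"
    LEARNING = "learning"
    FORWARDING = "forwarding"


# Classic 802.1D default forward delay, in seconds (assumption for this sketch;
# real switches may use different timers or rapid spanning tree).
FORWARD_DELAY_S = 15.0


def stp_state(seconds_since_carrier_up: float,
              forward_delay: float = FORWARD_DELAY_S) -> PortState:
    """Return the simplified STP state of a switch port after carrier-up."""
    if seconds_since_carrier_up < forward_delay:
        return PortState.LISTENING
    if seconds_since_carrier_up < 2 * forward_delay:
        return PortState.LEARNING
    return PortState.FORWARDING


def frames_dropped(seconds_since_carrier_up: float) -> bool:
    """User frames sent to the port are dropped until it is forwarding."""
    return stp_state(seconds_since_carrier_up) is not PortState.FORWARDING


if __name__ == "__main__":
    for t in (0, 10, 20, 30):
        print(t, stp_state(t).value, "dropped" if frames_dropped(t) else "forwarded")
```

Under these assumptions, a host that starts sending traffic the moment its NIC reports link-up would have its frames dropped for roughly two forward-delay intervals.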

Obviously different rules apply to x86-based networking (or at least some simplistic variants of it) – you’re free to use a link for user traffic as soon as you can start sending packets, and when users start complaining, they get a recommendation to use manual failback and change the switch configuration. One must wonder when the “be conservative in what you do” principle got lost.

I’m picking on VMware, but I’m positive other operating systems aren’t much better. If they are, please write a comment.

VMware finally implemented a reasonable mechanism in ESXi 5.0 – a configurable link-up delay. Too bad its default value is too low, it’s hidden somewhere in the Advanced Settings tab, and the only documentation is an arcane knowledge base article. Having an active CFM probe between uplinks belonging to the same port groups (not constant beaconing, just a link-ready-for-forwarding test) would be even better, but it’s probably too much to hope for … or not?
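The idea behind a link-up delay can be sketched in a few lines. This is not VMware's implementation, just a generic hold-down timer: the uplink becomes eligible for user traffic only after the carrier has stayed up continuously for the configured delay, and any carrier loss restarts the clock. The class name, method names, and the 40-second value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Uplink:
    """Hypothetical uplink with a link-up hold-down timer."""
    link_up_delay_s: float
    carrier_up_since: Optional[float] = None  # None means carrier is down

    def on_carrier(self, up: bool, now: float) -> None:
        """Record a carrier change reported by the NIC at time `now`."""
        if up:
            # Keep the earliest timestamp of the current up period.
            if self.carrier_up_since is None:
                self.carrier_up_since = now
        else:
            # Any carrier loss resets the hold-down timer.
            self.carrier_up_since = None

    def usable(self, now: float) -> bool:
        """True once the carrier has been up for the whole delay."""
        if self.carrier_up_since is None:
            return False
        return now - self.carrier_up_since >= self.link_up_delay_s


if __name__ == "__main__":
    uplink = Uplink(link_up_delay_s=40.0)  # arbitrary value for illustration
    uplink.on_carrier(True, now=0.0)
    print(uplink.usable(now=10.0))  # still in hold-down
    print(uplink.usable(now=40.0))  # delay elapsed
```

A fixed delay is still a guess about how long the upstream switch needs, which is why an explicit link-ready-for-forwarding probe would be the more robust option.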

3 comments:

  1. I think that you are referring to Jon Postel's Internet engineering maxim, which is:

    "an implementation should be conservative in its sending behavior, and liberal in its receiving behavior" (reworded in RFC 1122 as "Be liberal in what you accept, and conservative in what you send"). https://en.wikipedia.org/wiki/Jon_Postel

    You are changing the quote a bit, with "be conservative in what you do." However, that's still good advice.

    1. That's exactly where I started ;) ... and yes, I changed it a bit, but as I'm discussing sending packets over a link, it's functionally equivalent to the original :D

  2. Canonical links for VMware KB articles use shorter and simpler syntax:

    http://kb.vmware.com/kb/2014075

    Behind the scenes that triggers a URL rewrite/redirect to the long form of the URL which you quoted above.

    -VirtualJMills ... the one who put the kb.../kb proxy in originally :-)
