You don’t need OpenFlow to solve every age-old problem

I read two great blog posts on Sunday: evergreen Fallacies of Distributed Computing from Bob Plankers and forward-looking Understanding Hadoop Clusters and the Network from Brad Hedlund. Read them both before continuing (they are both great reads) and try to figure out why I’m mentioning them in the same sentence (no, it’s not the fact that Hadoop uses distributed computing).

OK, here’s the quote that ties them together. While describing rack awareness Brad wrote:

What is NOT cool about Rack Awareness at this point is the manual work required to define it the first time, continually update it, and keep the information accurate. If the rack switch could auto-magically provide the Name Node with the list of Data Nodes it has, that would be cool. Or vice versa, if the Data Nodes could auto-magically tell the Name Node what switch they’re connected to, that would be cool too. Even more interesting would be a OpenFlow network, where the Name Node could query the OpenFlow controller about a Node’s location in the topology.

The “only” problem with Brad’s reasoning is that we already have the tools to do exactly what he’s looking for. The magic acronym is LLDP (802.1AB).

LLDP was standardized years ago and is available on numerous platforms, including Catalyst and Nexus switches and Linux (for example, lldpad is part of the standard Fedora distribution). Not to mention that every DCB-compliant switch must support LLDP, as the DCBX protocol uses LLDP to advertise DCB settings between adjacent nodes.

The LLDP MIB is standard and allows anyone with SNMP read access to discover the exact local LAN topology – the connected port names, adjacent nodes (and their names), and their management addresses (IPv4 or IPv6). The management addresses that should be present in LLDP advertisements can then be used to expand the topology discovery beyond the initial set of nodes (assuming your switches do include them in LLDP advertisements; for example, NX-OS does but Force10 doesn't).
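As a rough illustration of how little is needed, the sketch below turns the raw results of an SNMP walk of the LLDP remote-systems table into a per-port neighbor list. The OID prefix and column numbers are how I read the LLDP-MIB (verify them against your platform), and the sample walk, host names and port names are made up. Fetching the data with your SNMP library of choice is left out on purpose.

```python
# lldpRemEntry in LLDP-MIB, as I read the MIB; verify on your platform.
LLDP_REM_ENTRY = "1.0.8802.1.1.2.1.4.1.1"

# Columns of interest within lldpRemEntry (assumed numbering).
COLUMNS = {
    7: "remote_port",   # lldpRemPortId
    9: "remote_name",   # lldpRemSysName
}


def parse_lldp_rem_table(walk):
    """Group (oid, value) pairs from a walk of lldpRemTable by neighbor.

    Rows are indexed by lldpRemTimeMark.lldpRemLocalPortNum.lldpRemIndex,
    so the OID suffix after the entry prefix is
    column.timemark.localport.index.
    Returns {(local_port_num, rem_index): {"remote_port": ..., ...}}.
    """
    prefix = LLDP_REM_ENTRY + "."
    neighbors = {}
    for oid, value in walk:
        if not oid.startswith(prefix):
            continue  # not part of lldpRemTable
        parts = oid[len(prefix):].split(".")
        if len(parts) != 4:
            continue  # unexpected index shape, skip it
        column, _time_mark, local_port, rem_index = (int(p) for p in parts)
        field = COLUMNS.get(column)
        if field is not None:
            neighbors.setdefault((local_port, rem_index), {})[field] = value
    return neighbors


# Hypothetical walk results from one top-of-rack switch.
SAMPLE_WALK = [
    ("1.0.8802.1.1.2.1.4.1.1.7.0.1.1", "eth0"),
    ("1.0.8802.1.1.2.1.4.1.1.9.0.1.1", "datanode-01"),
    ("1.0.8802.1.1.2.1.4.1.1.7.0.2.1", "eth0"),
    ("1.0.8802.1.1.2.1.4.1.1.9.0.2.1", "datanode-02"),
]

if __name__ == "__main__":
    table = parse_lldp_rem_table(SAMPLE_WALK)
    for (port, _index), info in sorted(table.items()):
        print(port, info.get("remote_name"), info.get("remote_port"))
```

The local port number in the index still has to be translated into a port name through the local port table of the same MIB, and the management address lives in a separate table with a more awkward index, which is why neither appears here.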

Building the exact network topology from the LLDP MIB is a trivial exercise. Even a somewhat reasonable API is available (yeah, having an API returning a network topology graph would be even cooler). Mapping the Hadoop Data Nodes to TOR switches and Name Nodes can thus be done on existing gear using existing protocols ... or maybe someone already did it? Tell me in the comments.
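To make the Hadoop side a bit more concrete, here is a minimal sketch of the mapping step, assuming the LLDP neighbor lists have already been collected from every switch. As far as I know, Hadoop's rack awareness hook is an external script that receives node names or addresses as arguments and prints one rack path per argument; the switch names, host names, and the idea of naming a rack after its TOR switch are assumptions made for this example.

```python
import sys

# Rack path used for nodes we know nothing about (an assumed fallback).
DEFAULT_RACK = "/default-rack"


def build_rack_map(neighbors_by_switch):
    """Map each LLDP-visible host name to a rack path named after its switch.

    neighbors_by_switch: {switch_name: [neighbor_name, ...]} as collected
    from each switch's LLDP neighbor table. Neighbors that are themselves
    switches are skipped, since they are uplinks rather than Data Nodes.
    """
    switches = set(neighbors_by_switch)
    rack_map = {}
    for switch, neighbors in neighbors_by_switch.items():
        for name in neighbors:
            if name not in switches:
                rack_map[name] = "/" + switch
    return rack_map


def resolve(nodes, rack_map):
    """Return one rack path per requested node, in the same order."""
    return [rack_map.get(node, DEFAULT_RACK) for node in nodes]


# Hypothetical neighbor data for two racks and one core switch.
SAMPLE_NEIGHBORS = {
    "tor-a": ["core-1", "datanode-01", "datanode-02"],
    "tor-b": ["core-1", "datanode-03"],
    "core-1": ["tor-a", "tor-b"],
}

if __name__ == "__main__":
    rack_map = build_rack_map(SAMPLE_NEIGHBORS)
    print(" ".join(resolve(sys.argv[1:], rack_map)))
```

A real script would probably have to translate the IP addresses it is given into the system names seen in LLDP (or use the LLDP management addresses directly), and would read a periodically refreshed cache instead of hard-coded sample data.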

Would OpenFlow bring anything to the table? Not really: it also needs packets exchanged between adjacent devices to discover the topology, and the easiest thing for OpenFlow controllers to use is ... ta-da ... LLDP ... oops, OFDP, because LLDP just wasn’t good enough. The “only” difference is that in the traditional network the devices would send LLDP packets themselves, whereas in the OpenFlow world the controller would use Packet-Out messages of the OpenFlow control session to send LLDP packets from individual controlled devices and wait for Packet-In messages from other devices to discover which device received them.
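The controller-driven discovery described above can be mimicked in a few lines. This is a toy simulation, not code for any real controller: there is no OpenFlow session, the `WIRING` table stands in for the physical cables, and all switch IDs and port numbers are invented.

```python
def discover_links(switch_ports, wiring):
    """Simulate controller-driven LLDP discovery.

    switch_ports: {switch_id: [port, ...]} for every controlled switch.
    wiring: {(switch_id, port): (switch_id, port)}, a stand-in for the
            cables; ports missing from it lead to hosts or nowhere.
    Returns a set of directed links ((src_switch, src_port),
    (dst_switch, dst_port)), one per probe that came back.
    """
    links = set()
    for switch_id, ports in switch_ports.items():
        for port in ports:
            # "Packet-Out": the controller asks the switch to send a probe
            # that names the sending switch and port.
            probe = {"src_switch": switch_id, "src_port": port}
            receiver = wiring.get((switch_id, port))
            if receiver is None:
                continue  # no controlled switch on the far end, no Packet-In
            # "Packet-In": the receiving switch hands the probe back to the
            # controller, which now knows both ends of the link.
            src = (probe["src_switch"], probe["src_port"])
            links.add((src, receiver))
    return links


# Hypothetical two-switch network: s1 port 1 is cabled to s2 port 1,
# the remaining ports face hosts.
SWITCH_PORTS = {"s1": [1, 2], "s2": [1, 2]}
WIRING = {("s1", 1): ("s2", 1), ("s2", 1): ("s1", 1)}

if __name__ == "__main__":
    for src, dst in sorted(discover_links(SWITCH_PORTS, WIRING)):
        print(src, "->", dst)
```

Note that the host-facing ports produce nothing in this model, which is the same limitation the traditional network has: the hosts stay invisible unless they send LLDP themselves.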

The Linux configuration wouldn’t change much. If you want the switches to see the hosts, you still have to run an LLDP (or OFDP or whatever you call it) daemon on the hosts.

Last but definitely not least, you could use the well-defined SNMP protocol with a number of readily available Linux or Windows libraries to read the LLDP results available in the SNMP MIB in the “old world” devices. I’m still waiting to see the high-level SDN/OpenFlow API; everything I’ve seen so far is either an OpenFlow virtualization attempt (multiple controllers accessing the same devices) or a discussion indicating that a standard API isn’t necessarily a good idea. Really? Haven’t you learned anything from the database world?

So, why did I mention the two posts at the beginning of this article? Because Bob pointed out that “those who cannot remember the past are condemned to fulfill it.” At the moment, OpenFlow seems to fit the bill perfectly.

10 comments:

  1. Juniper EX-Series switches support LLDP too.

  2. Alexander Hartmaier (abraxxa), 13 September, 2011 07:03

    Great to see that someone outside the Perl community knows SNMP::Info!
    We recently switched to git, and development has been going ahead faster than ever before since.
    You might want to use metacpan these days for nicer links: https://metacpan.org/module/SNMP::Info::LLDP

  3. Oh, it was easy - started with "there must be an LLDP API somewhere" ... "probably in PERL" ... "let's ask Google" ... "Ah, thought so ;)"

    Great job. Nice & concise API. You might want to consider adding lldpad support (not sure lldpad publishes its information in SNMP format).

  4. Including "management address TLV"?

  5. Also worth noting: HP and Extreme both do LLDP and support the management address TLV. With Extreme, though, the default is for LLDP not to send the address, and it takes an extra config command to enable it.

  6. hi Ivan,

    A cursory glance at our EX4200 that's on 10.0 code suggests it supports the Management Address TLV:
    {master:1}[edit protocols lldp]
    admin@SWITCH# set ?
    …output truncated...
    management-address LLDP management address
    …output truncated...

  7. From an initial peruse of Open Flow it appears to be just an Object Oriented language or API to the network's protocols. But then again just use IPv6 and code your flow label fields to provide the space for the same result ;)

  8. LLDP will give you more information, but you really don't need to use it for this. The regular MAC address table MIBs will tell you what nodes are connected to what ports. Combined with the ARP table, you can map IP/MAC to a switch port.

    SNMP::Info has modules for doing this as well, but I use netdisco (which uses SNMP::Info), which dumps everything into a SQL database. This makes doing things like generating your network topology a piece of cake.

  9. BTW, Microsoft includes an LLDP driver and enables it by default in Windows 8, at least in the developer preview.

  10. Windows 8 RTM has it disabled by default. :(


Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.