Intelligent Resilient Framework (IRF) – Stacking as usual

When I listened to HP's Intelligent Resilient Framework (IRF) presentation during Tech Field Day 2010 and read the HP/H3C IRF 2.0 whitepaper afterwards, IRF looked like a technology sent straight from Data Center heaven: you could build a single unified fabric with optimal L2 and L3 forwarding that spans the whole data center (I was somewhat skeptical about their multi-DC vision) and behaves like a single managed entity.

No wonder I started drawing the following highly optimistic diagram when I was updating my Data Center 3.0 webinar, which now includes information on Multi-Chassis Link Aggregation (MLAG) technologies from numerous vendors (as always, attendees of past webinars get free access to updated materials and recordings).

However, a worm of doubt kept nagging somewhere deep in my subconscious, so I decided to check the configuration guides of various HP switches (kudos to HP for the free and unrestricted access to their very good documentation). I selected a core switch (S12508) and an access-layer switch (S5820X-28S) from the HP IRF technology page, downloaded their configuration guides and studied the IRF chapters. What a disappointment:

  • Only devices of the same series can form an IRF. How is that different from any other stackable switch vendor?
  • Only two core switches can form an IRF. How is that different from Cisco’s VSS or Juniper’s XRE200?
  • One device in the IRF is the master, others are slaves. Same as Cisco’s VSS.
  • Numerous stackable switches can form an IRF. Everyone else is calling that a stack.
  • IRF partition is detected through proprietarily modified LACP or BFD. Same as Cisco’s VSS.
  • After IRF partition, the loser devices block their ports. The white paper is curiously mum about the consequences of IRF partition. No wonder, IRF does the same thing as any other vendor – the losing part of the cluster blocks its ports following a partition.
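
The partition behavior described in the last bullet can be sketched as a toy model. This is a minimal illustration, not HP's implementation: the `Partition` class, the port names and the tie-break rule (the fragment whose master has the lowest member ID stays active) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Partition:
    """One fragment of a split fabric (hypothetical model)."""
    master_id: int                                   # member ID of this fragment's master
    ports: List[str]                                 # front-panel ports in this fragment
    blocked: Set[str] = field(default_factory=set)   # ports shut down after losing

    @property
    def forwarding_ports(self) -> List[str]:
        return [p for p in self.ports if p not in self.blocked]


def resolve_split(partitions: List[Partition]) -> Partition:
    """Keep one fragment active and block every port on the others.

    Assumed tie-break: the fragment with the lowest master member ID wins.
    """
    winner = min(partitions, key=lambda p: p.master_id)
    for part in partitions:
        if part is not winner:
            part.blocked = set(part.ports)           # the loser blocks its ports
    return winner


a = Partition(master_id=1, ports=["1/0/1", "1/0/2"])
b = Partition(master_id=2, ports=["2/0/1", "2/0/2"])
winner = resolve_split([a, b])
print(winner.master_id, a.forwarding_ports, b.forwarding_ports)
# prints: 1 ['1/0/1', '1/0/2'] []
```

Whatever the real election rule is, the outcome has the same shape: devices attached only to the losing fragment lose connectivity until the fabric is healed.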

There might be something novel in the IRF technology that I’ve missed and that truly sets it apart from other vendors' solutions. If that’s the case, please chime in with your comments. For the moment, IRF looks like stacking-as-usual to me.

More information

Numerous MLAG technologies, including Cisco’s VSS and vPC, Juniper’s XRE200 and HP’s IRF are described in my Data Center 3.0 for Networking Engineers webinar (buy a recording or yearly subscription).

15 comments:

  1. Chris.young@HP.com, 24 January, 2011 13:10

    Hi Ivan,

    There are a few small things which do differentiate IRF.

    1) IRF in a 2-chassis "stack" shares state across all 4 management modules, unlike RPR-Warm in the VSS solution. Cisco's approach is to reboot the chassis if the in-chassis master fails. HP just drops to half speed (which is still usually faster than Cisco's full speed!)
    2) IRF does require specific hardware within the same family, but that's about where it ends. There is no restriction to a specific series of line cards (67xx only), and no situations where the line cards will not be given power. And we actually have the ability to have PoE AND IRF in the same chassis.

    3) Consistency across the portfolio. Operationally, this means we have consistent behaviour at each tier of the network, compared to the various options (StackWise, VSS, vPC, FabricPath, etc.) that each fill some of the functionality of IRF.

    4) IRF is hardened in the field. IRF is based on 3Com's XRN from the late 90's. Without letting the cat out of the bag, this means that we have already solved a lot of the problems that go along with this type of technology.

    Hope this helps,

    Chris

  2. Thanks for the comment. Absolutely valid points!

    Need to check #1 though - my understanding is that the RSP in the second chassis takes over immediately ... but you're right, the secondary RSP (if present) needs to be reloaded.

  4. Chris.young@HP.com, 26 January, 2011 01:02

    Hey Ivan,

    You're right on that. Sorry if I wasn't clear; the failover to the master chassis takes about 400 ms from my reading (compared to approx. 50 ms with IRF). My point was that the in-chassis failover is abysmal. Reboot the whole chassis? In this scenario there's also no guidance on how long the VSS pair will take to reconverge, because the time the Cat6k takes to reboot varies depending on which modules are in the box.

  5. Doesn't VSS avoid the reload scenario by simply failing back to the active supervisor? I need to check the figures for failover to the master chassis, but I think the failover time was fairly swift using MECs upstream and downstream.

    P.S: I am currently testing an enterprise level network. This would be a good test to try.

  6. If you have two SUP modules in a single chassis (4 SUPs per VSS), the secondary SUP in the master chassis stays dormant. If the primary SUP in the master chassis fails, the whole chassis has to reboot (including going through the power-up tests) before the secondary SUP can take over (or at least that's my understanding of the documentation - could be way wrong).

  7. Chris - could you please describe your software upgrade process for a pair of A12500s using IRF? From what I understand, both chassis have to run the same code in order to run IRF, so you can't upgrade your core without downtime?

  8. In general, an IRF stack can be upgraded while maintaining service if the proper upgrade procedure is followed. With proper split-brain detection (which they call Multi-Active Detection - MAD) in place, one unit can be reloaded with new code, a split brain can then be forced, and MAD will reload the other node (after the first unit has come online). The in-service upgrade basically relies on the link-aggregation failover on the attached devices. Based on tests I have done, you get about 2-3 seconds of failover time (tested using fping between end stations) for the full firmware upgrade using this procedure. This procedure currently applies only to a 2-node stack, however, so if you need this 'always on' functionality it seems best to stick to 2 members in the stack at present.
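
    A failover figure like the one above can be derived from probe results. The following is a minimal sketch, assuming probes are sent at a fixed interval and each is recorded as answered or lost; the function name and the sample data are made up for illustration and are not taken from the test described in this comment.

```python
def longest_outage_seconds(replies, interval_s):
    """Estimate the longest outage from a sequence of probe results.

    replies    -- iterable of booleans, True if the probe was answered
    interval_s -- time between consecutive probes, in seconds
    """
    longest = current = 0
    for answered in replies:
        current = 0 if answered else current + 1   # count consecutive losses
        longest = max(longest, current)
    return longest * interval_s


# Hypothetical sample: one probe every 0.5 s, five lost during the switchover
sample = [True] * 4 + [False] * 5 + [True] * 3
print(longest_outage_seconds(sample, 0.5))   # prints: 2.5
```

    The estimate is only as precise as the probe interval, so a shorter interval gives a tighter bound on the real outage.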

  9. Hi Ivan,

    You are right that IRF is similar in function to VSS. For me the key difference would be that VSS is platform restricted (65xx), while the same distributed forwarding technology (each IRF member can perform full local forwarding, no need to consult the master) is available in the form of IRF on basically 'all' switches in the H3C portfolio, from the low end to the high end.
    This means, for instance, that the 5800 (top-of-rack switch) in a stack has distributed L2 forwarding, distributed L3 IPv4 forwarding, distributed L3 IPv6 forwarding and distributed MPLS/VPLS forwarding. Traffic never needs to pass through or consult the master for the actual forwarding.
    I do not know if it will be useful, but I heard that IRF support for 4 chassis switches will be released in the near future.

  10. Hi Ivan,
    I read that IRF runs in active-active mode and performs much better during failover than VSS. Its latency is also better than that of VSS or vPC.

  11. Not sure if HP has fixed this, but last year I saw data from several lab demonstrations showing that IRF does not support QoS between switches in the stack. This is HORRIBLE for voice deployments because you have to architect your network to avoid sending voice/video traffic across IRF links. To me this is unforgivable in a switching architecture... you may as well not even bother and just tell everyone to use 10GbE links to connect the switches, because at least then you can use QoS. But wait, the QoS priority queue doesn't work on a lot of E-Series switches either :)

    Replies
    1. Interesting finding. Do you have any documentation about those labs?

  12. Does IRF allow for multihoming at the edge? E.g. a vanilla 802.3ad-capable switch connecting to two distinct 'leafs' that are part of an IRF?

    Replies
    1. Yes. However, keep in mind that IRF (like all other stacking solutions) works best if you connect the outside switch (or LACP-capable server) to every member of the stack.

  13. Hi
    I'm afraid you didn't understand the guides you read very well. First, obviously no vendor can build a stack with other vendors' switches, and neither can HP. Second, there is no master/slave dependency: if the "master" switch fails, the IRF array stays up. Third, imagine IRF and Cisco VSS are the same thing; HP IRF is available on switches starting at $2,000, while Cisco requires Nexus models or the Catalyst 6500. How much do those switches cost?



Ivan Pepelnjak, CCIE#1354, is the chief technology advisor for NIL Data Communications. He has been designing and implementing large-scale data communications networks as well as teaching and writing books about advanced technologies since 1990. See his full profile, contact him or follow @ioshints on Twitter.