Use VRFs to Solve Routing-on-Hosts Challenges

One of my readers sent me interesting feedback after reading my explanation of why I’d try not to use OSPF as a routing protocol between hosts and ToR switches. He said:

Unfortunately we can’t use BGP because IBM mainframes support only OSPF or RIP, so we decided to use VRFs instead.

Here’s what they did:

  • They run the data center network infrastructure in the global routing table and all customer services in VRFs;
  • The mainframe is in a separate VRF;
  • ToR switches run OSPF with the mainframe and advertise a default route to it;
  • Routes collected in the mainframe VRF are imported into other VRFs (alternative: exported with proper route targets) using strict prefix lists and route maps.
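
To make the last step more concrete, here is a minimal Python sketch of filtered route leaking. It is a toy model, not a device configuration: the prefixes, the `MAINFRAME_ALLOWED` list, and the `permitted` and `leak_routes` helpers are all made up for illustration, and the real thing would be done with prefix lists and route maps on the ToR switch.

```python
from ipaddress import ip_network

# Hypothetical prefix list: (permitted supernet, longest prefix length accepted).
MAINFRAME_ALLOWED = [
    (ip_network("10.10.0.0/16"), 24),
]

def permitted(prefix, allowed=MAINFRAME_ALLOWED):
    """True if the prefix sits inside an allowed supernet and is not too specific."""
    return any(
        prefix.subnet_of(net) and prefix.prefixlen <= max_len
        for net, max_len in allowed
    )

def leak_routes(source_vrf, target_vrf, allowed=MAINFRAME_ALLOWED):
    """Copy only the permitted prefixes from source_vrf into target_vrf."""
    for prefix, next_hop in source_vrf.items():
        if permitted(prefix, allowed):
            target_vrf[prefix] = next_hop
    return target_vrf

# Routes learned from the mainframe over OSPF (made-up values).
mainframe_vrf = {
    ip_network("10.10.1.0/24"): "mainframe",    # expected service prefix
    ip_network("10.10.2.0/24"): "mainframe",    # expected service prefix
    ip_network("0.0.0.0/0"): "mainframe",       # bogus default route
    ip_network("192.168.50.0/24"): "mainframe", # transit route, must not leak
}

customer_vrf = leak_routes(mainframe_vrf, {})
print(sorted(str(p) for p in customer_vrf))
```

Only the two expected service prefixes end up in `customer_vrf`; the bogus default route and the transit route stay in the mainframe VRF.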

End result:

  • Misconfigured OSPF routing on the mainframe doesn’t impact any other device in the network (apart from CPU on ToR switches);
  • Even if the mainframe becomes a transit router, no traffic ever passes through it (because the transit routes are not leaked into other VRFs);
  • Whatever routes the mainframe announces are irrelevant to everyone else – they get installed into the mainframe VRF and only the expected subset is leaked into other VRFs.

You would get similar results by running a separate OSPF process with the mainframe and redistributing routes from that process into the core routing protocol (be it BGP or OSPF). However, as you’d be using a single routing table, incorrect prefixes advertised by the mainframe could still impact packet forwarding for all devices connected to the ToR switch. The exception is a ToR switch that supports filters between OSPF SPF results and the RIB/FIB, like Cisco IOS does with the distance 255 command.
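
The difference between a shared routing table and separate VRFs comes down to longest-prefix matching. The following Python sketch illustrates the idea under simplified assumptions: the `lookup` helper, the prefixes, and the next-hop labels are hypothetical, and a real RIB also considers administrative distance and metrics.

```python
from ipaddress import ip_address, ip_network

def lookup(table, destination):
    """Longest-prefix match: next hop of the most specific matching route, or None."""
    dest = ip_address(destination)
    matches = [prefix for prefix in table if dest in prefix]
    if not matches:
        return None
    return table[max(matches, key=lambda prefix: prefix.prefixlen)]

# Legitimate route from the core routing protocol (made-up values).
core_routes = {ip_network("172.16.0.0/16"): "core"}

# More-specific prefix the mainframe should never have advertised.
bogus_from_mainframe = {ip_network("172.16.5.0/24"): "mainframe"}

# Single routing table: the bogus prefix sits next to the legitimate one.
shared_table = {**core_routes, **bogus_from_mainframe}

# Separate VRFs: the bogus prefix stays in the mainframe VRF and is never leaked.
customer_vrf = dict(core_routes)

print(lookup(shared_table, "172.16.5.10"))
print(lookup(customer_vrf, "172.16.5.10"))
```

In the shared table the more-specific bogus prefix wins the lookup and pulls traffic toward the mainframe, while the lookup in the customer VRF still follows the core route.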

Interested in this solution but having no idea what I’m talking about or where to start? Watch the Enterprise MPLS/VPN webinar; I’m also available for short consulting sessions (that you can now bundle with the subscription to make it easier to get an approval from your boss).

3 comments:

  1. Actually you could do that without VRFs too. Just make an OSPF stub area for customer services with strict filtering on ABRs.

    You can't filter router/network LSAs in one area, but you can filter the translation from router/network LSA to a summary LSA on ABR before sending them to another area.
  2. It seems Mainframes are more common than I once thought. Having to deal with the same situation myself, we moved away from OSPF to a much simpler advertised-static-routes solution. Simplicity tends to be good for everyone!

    We realized that the Mainframe "internal" network only needed to advertise about 20 routes - which could be summarized into even fewer routes - and new routes need to be added barely once a year. I worked with the Mainframe team to set up static routes instead of using OSPF, and I advertised the few static routes that needed reachability via OSPF redistribution myself. I'd take a static route over OSPF-with-Mainframe any day! :)

    Redundancy is often cited as a reason for using dynamic routing (OSPF or RIP). However, we used HSRP VIPs for network-level redundancy, and that setup was tested successfully.

    (I don't know why, but, I was told that many Mainframe "internal" routes don't actually need reachability outside of Mainframe. They just tend to get advertised due to some poor best practices mentioned in the IBM guides. This further allowed us to control exactly what routes were advertised potentially leading to better security posture.)
  3. I often do routing to "customers" with a separate OSPF process and IOS distribute-list command, which filters OSPF to RIB/FIB.

    In IOS "redistribute" starts from the RIB (not from the source protocol), so this is enough for not accepting rogue prefixes and not distributing them as well.

    Of course, we are still prone to DoS by memory consumption, and a separate OSPF process per customer is definitely not scalable on some older platforms.