The Future of Multicast and QoS
A. Friend sent me a long list of questions after listening to the excellent Future of Networking podcast with Martin Casado because (as he said) he prefers “having a technical discussion with arguments and not just throwing statements out there.”
He started with “Martin's view seems to be that network is all plumbing and all the intelligence should be in the applications.”
Which I totally agree with. Complexity belongs at the network edge (ideally in the end-hosts), and every time we managed to move it from the network to the end-hosts, we got cheaper and faster networks (examples: moving from X.28 to Telnet, and from FC to NFS/iSCSI).
They make statements that multicast and QoS are going away.
Also somewhat agree with. Not unconditionally, but yes (and their discussion was more nuanced than that anyway, so go and listen to it). Let’s go into details.
While I can agree with some statements, multicast is there for a reason.
There are a few well-known use cases for multicast (live 1-to-N video streaming, stock exchange feeds), and a few others where the developers wanted to push the problem of service discovery to someone else (yet again proving RFC 1925 section 2.6). Apart from that, multicast has been a zombie for decades.
They mention the application keeping track of who requested the stream instead. OK, so we moved the state to the application. That's fine and dandy when the app is hosted on one device. Now our service is popular and needs to be hosted on multiple devices: who keeps the state? How do we sync the state? Hasn't the amount of state increased compared to having it in the network? And that's not even considering the bandwidth we are wasting.
First, state is cheap when it’s implemented in low-speed software. Low-speed RAM is cheap; high-speed packet forwarding engines using TCAM (because a walk down a data structure would be too slow) are expensive. Moving state to low-speed components makes perfect sense.
Also, how many times have you seen endpoints requiring the exact same content, which could then be effectively replicated on-the-fly within the network (apart from the two use cases I mentioned above)?
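To make "the application keeps track of who requested the stream" a bit more concrete, here is a minimal sketch in Python. The class name `StreamRegistry`, its methods, and the `send` callback are all made up for illustration; they don't come from the podcast or from any particular product. The whole "multicast state" is a dictionary in ordinary RAM, and replication is a loop sending one unicast copy per receiver.

```python
from collections import defaultdict


class StreamRegistry:
    """Hypothetical application-level record of who asked for which stream."""

    def __init__(self):
        # stream name -> set of receiver addresses, kept in ordinary RAM
        self._subscribers = defaultdict(set)

    def subscribe(self, stream, receiver):
        self._subscribers[stream].add(receiver)

    def unsubscribe(self, stream, receiver):
        self._subscribers[stream].discard(receiver)

    def fan_out(self, stream, payload, send):
        """Send one unicast copy per subscriber; return the number of copies."""
        # .get() avoids creating an empty entry for an unknown stream
        receivers = sorted(self._subscribers.get(stream, ()))
        for receiver in receivers:
            send(receiver, payload)
        return len(receivers)
```

The trade-off the question points at is visible in `fan_out`: the sender pays for N copies instead of one. The sketch only shows that the state itself is trivial to keep in software.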
On a somewhat tangential topic, creating unnecessary state and/or syncing state across multiple devices is best avoided if you want to scale your solution. You can start with something as simple as not keeping session state in local files on your web server (so you don’t have to use session stickiness on your load balancer), or go as far as Facebook did.
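The "simple" end of that spectrum could look something like the sketch below. `SharedSessionStore` is a hypothetical in-memory stand-in for whatever external key-value store you'd really use, and `WebServer.handle` is an invented request handler. The point is only that no session state lives on the individual web server, so any server can handle any request.

```python
class SharedSessionStore:
    """Hypothetical stand-in for an external key-value store shared by all servers."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, state):
        self._data[session_id] = dict(state)


class WebServer:
    """A web server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id, item=None):
        # Read the session from the shared store, update it, and write it back.
        state = self.store.get(session_id)
        cart = list(state.get("cart", []))
        if item is not None:
            cart.append(item)
        self.store.put(session_id, {"cart": cart})
        return cart
```

With two `WebServer` instances sharing one store, a load balancer can send consecutive requests of the same session to different servers without losing the cart, which is exactly why stickiness is no longer needed.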
Regarding QoS, it's just a tool to differentiate traffic based on business needs. I can agree that throwing more bandwidth at things is often the best solution. However, that is not always an option and as long as we consider a certain type of traffic to be more important than other traffic we must have some form of QoS or queuing.
Generic widespread end-to-end QoS was dead even before it was born ;) One would hope that we’ve learned that lesson from ATM. Read what Geoff Huston wrote on the topic of Internet-wide QoS, and listen to the Packet Pushers podcast with Douglas Comer.
In the data centers, it’s easier to throw more bandwidth at the problem, and solve the potential remaining 5% in the application stack. MP-TCP is one of the solutions addressing the hashing problems of elephant flows, as is MPIO. Other tricks like FlowBender modify hashing fields (IPv6 flow label or TTL) until the TCP session hits a non-congested link (they already have a patch for the Linux kernel).
We’ll talk about these issues in the next Leaf-and-Spine Fabric Designs webinar sometime in autumn.
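The "modify hashing fields until you hit a non-congested link" idea can be illustrated with a toy model. This is not the FlowBender code or the Linux patch: the hash function (SHA-256 over the header fields), the function names, and the `congested_links` parameter are all assumptions made for the sketch. In particular, a real end-host cannot see which links are congested and has to infer it from whatever congestion signals its TCP stack gets.

```python
import hashlib


def ecmp_link(five_tuple, flow_label, num_links):
    """Toy ECMP: hash the header fields and take the result modulo the link count.

    Real switches use vendor-specific hash functions; SHA-256 is an assumption.
    """
    key = repr((five_tuple, flow_label)).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links


def pick_flow_label(five_tuple, congested_links, num_links, max_tries=64):
    """Try successive flow labels until the flow hashes onto a non-congested link.

    Returns (flow_label, link), or None if no label within max_tries works.
    """
    for flow_label in range(max_tries):
        link = ecmp_link(five_tuple, flow_label, num_links)
        if link not in congested_links:
            return flow_label, link
    return None
```

The 5-tuple never changes, so the TCP session stays intact; only the extra field fed into the hash is changed, which moves the flow to a different equal-cost path.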
As for QoS on WAN, the real question is “can you control the congestion point?”, and in most cases, the answer is NO (because the congestion happens in the DSLAM or within the SP access/core network), which leaves the end-to-end congestion tracking offered by some SD-WAN vendors as the only alternative. Yes, it’s QoS (shaping + subsequent queuing) but it’s done at the very edge of the congestion domain, not everywhere in the network.
However, have you considered what happens once people stop watching TV and turn to Netflix? All those video streams will become unicast streams.
I still think there is a need for multicast, and network engineers need to master it. It's a great technology that is often not implemented correctly because network engineers are afraid of it.
However, it doesn't mean that the plumbing in the middle does not require BGP, let alone a routing protocol.
In this case, the end-host is participating but not actually taking over.
I cannot find a real-life example (probably I haven't lived that long) of how multicast control can be pushed to the end-host. Session control/record is required on each hop.
The idea is good; you both are futurists.
At a small scale (100 Mbps links, Cisco ISR routers) I'm sure multicast works fine. At a large scale (10 Gbps links, Cisco 6500/7600 routers), where multicast should be the most beneficial, multicast is a troubleshooting nightmare.
As for QoS, even in long-haul networks there can be bottlenecks requiring QoS. There is also a movement towards more agile optical restoration to replace L3 capacity in long-haul/metro networks. During the restoration events you need QoS in order to keep higher priority traffic flowing while best effort traffic is dropped.
https://uqcs.org.au/2014/03/23/seminar-pervade-and-hypervade.html
On the DC side, for applications that are high volume and require efficient delivery, multicast will still play a key role. A perfect example is high-volume market data distribution. On the WAN side, the use cases are mainly video streaming (e.g. IPTV), broadcast, or applications that require fair delivery to recipients.
More and more applications are now relying on unicast delivery and doing multicast in a much more localized fashion. Between the application layer and the network layer, it's getting more common to have a transport layer which is responsible for delivering data to the end points. This layer, running in the servers, can provide more flexibility and customization of data delivery. And that could also simply be a message bus in a pub-sub fashion.
Multicast has always been difficult to conquer. If you are involved, for example, in deploying multicast to MPLS WAN, then you should remember the fun debate related to NG-MVPN, mLDP vs. RSVP-TE, etc. Why bother if you don't have to? While multicast won't die, its future is dim. And this is not necessarily a bad thing.