Repost: LISP Is a False Economy

Minh Ha left this comment on the Packet Forwarding 101 blog post. As is usually the case, it’s fun reading and it would be a shame not to repost it as a standalone blog post (even though I don’t necessarily agree with all his conclusions).


I always enjoy Bela’s great insights, esp. on hardware and transport networks, but this time I beg to differ. LISP is a false economy. It was twisted from the start, unscalable right from the get-go. In Networking and OS, to name (ID) something is to locate it, and vice versa. So the name LISP itself reflects a false distinction. Due to this misconception, LISP proponents are unable to establish the right boundary conditions, leading to the size of xTRs’ RIB diverging (growing without bound). In a word, it has come full circle back to BGP, an exemplary manifestation of RFC 1925 rule 6.

As always, misunderstanding the fundamentals leads to exploding complexity and dead ends, so LISP has problems with path liveness checks and state synchronization as well [1].

All 3 problems severely limit scalability, so LISP essentially fails in its original goal. One can argue that LISP works fine for small networks, but small networks need no design. There, a brute-force sequential search (flat) method for routing is good enough; in a word, anything goes. It’s the big networks that need hierarchy, and LISP can’t enforce the hierarchy essential for scaling, because it can’t impose the right boundary conditions.
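The flat-versus-hierarchical contrast can be sketched in a few lines of Python. The addresses and next-hop names (`edge-a`, `edge-b`) are hypothetical and purely for illustration. The sketch only shows that a flat table needs one entry per destination, while a hierarchical table needs one entry per aggregate and resolves destinations by longest-prefix match.

```python
import ipaddress

# Flat routing: one entry per host, resolved by brute-force sequential search.
# (Hypothetical data: 256 hosts split between two made-up next hops.)
flat_table = [
    (ipaddress.ip_address(f"192.0.2.{i}"), "edge-a" if i < 128 else "edge-b")
    for i in range(256)
]

def flat_lookup(dst):
    """Scan every host entry until an exact match is found."""
    dst = ipaddress.ip_address(dst)
    for addr, next_hop in flat_table:
        if addr == dst:
            return next_hop
    return None

# Hierarchical routing: the same reachability expressed as two aggregates.
hier_table = [
    (ipaddress.ip_network("192.0.2.0/25"), "edge-a"),
    (ipaddress.ip_network("192.0.2.128/25"), "edge-b"),
]

def hier_lookup(dst):
    """Return the next hop of the longest prefix that contains dst."""
    dst = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in hier_table if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]
```

Both lookups return the same next hop for any of the 256 hosts, but the flat table grows with the number of endpoints while the hierarchical one grows with the number of aggregates. That second behavior is only available when the addressing allows such aggregates to exist.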

LISP is in a pretty similar situation to TCP congestion control, wherein people, due to a lack of understanding, naively think it can be solved by “careful tuning” of parameters. It cannot, because end-to-end CC is a dead end. So as long as you keep clinging to it, all you do is put lipstick on a pig.

The key with CC is understanding that end-to-end CC attempts to solve the problem at the wrong layer, so all end-to-end transport protocols, try as they may, are fundamentally incapable of resolving it; what has to be done is a renormalization of CC’s length scale and layering. In the same way, the key to understanding why LISP is unworkable at large scale is realizing that people have been asking the wrong question. Routing explosion is an addressing problem; it has to be solved based on an understanding of how addressing should be structured.

As it stands, IP is missing more than half the structure, with IP and MAC redundantly naming the same thing (the interface). This, coupled with provider-based addressing, plus one global address space for the entire Internet, will always lead to unbounded RIB sizes and routing update explosion. Topological addressing can deal with that, and when we think in terms of topology, new structures start to emerge, including the understanding that provider-independent address assignment is the right way to go. Topological PI addressing will help set up the proper boundary conditions, and RIB size can shrink severalfold and potentially be bounded too.
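To make the aggregation argument concrete, here is a minimal Python sketch using the standard-library `ipaddress` module. The prefixes and the two-region layout are invented for illustration and do not come from any real routing table. It only demonstrates that prefixes assigned along the topology collapse into a few aggregates, while the same number of prefixes scattered across the address space do not collapse at all.

```python
import ipaddress

def rib_size(prefixes):
    """Count the routes left after merging adjacent and contained prefixes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return len(list(ipaddress.collapse_addresses(nets)))

# Hypothetical topological assignment: each of two regions owns one
# contiguous block, so its eight /24s merge into a single /21.
topological = (
    [f"10.0.{i}.0/24" for i in range(8)]
    + [f"10.1.{i}.0/24" for i in range(8)]
)

# Hypothetical scattered assignment: sixteen /24s with no two adjacent,
# so nothing can be merged.
scattered = [f"10.{i * 3}.{i * 7}.0/24" for i in range(16)]

topo_routes = rib_size(topological)      # 2 aggregates
scattered_routes = rib_size(scattered)   # 16 separate routes
```

Sixteen assignments cost two routes in the first case and sixteen in the second. Real tables are far messier than this toy, but the direction of the effect is the one the paragraph above describes.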

Tony P’s comments on valley-free routing are essentially a description of topology-based (resilient) addressing, where distortion of the network graph has no impact on the underlying addressing structure; it’s topologically protected.

Since this scaling metric is universal enough, the same solution applies equally to DC networks, which essentially are just one type of SP network. It can lead to better simplification and less pain.


  1. Dave Meyer identified numerous shortcomings of Locator/ID separation in early 2009. As is usually the case, nobody listened. ↩︎

4 comments:

  1. Let me be perfectly clear: LISP got half the picture right, as in it sees the need to ID the router by a node address, but it fails to go balls to the wall and identify the endpoint by a node address as well. Routing on the node will remove the DFZ, cut down the routing table size, slow down its growth, and make multihoming and mobility quite a bit easier.

    Right now LISP is only done half-way, so I'm not in its camp, but if one day LISP can go all the way to routing on the node, then I'm all for it. I actually agree with some of the points in Victor Moreno's comment, esp. this part: "the impact of mobility events in a BGP network is unbound." Yes, because the scope of BGP is the whole Internet. Routing on the interface address and having a flat address space whose length scale extends across the globe leads to complexity, tight coupling and RIB + Update explosion. Having one global address space whose visibility is the whole Internet is essentially a form of centralization. LISP has the potential to change all this, but unfortunately, so far, it's still missing half the structure.

    Also, let's say our Internet addressing structure is now too deep-rooted to ask for renumbering into a topologically-protected addressing scheme, so it will stay like this forever. Even then, if we adopt routing on the node instead of the interface, all of what I said above will still apply. Its effects will be to a lesser degree, but they're positive effects all the same: reduction in the global RIB, slow-down in the increase of RIB entries, massive reduction in routing updates, much lighter-weight multihoming and mobility, and much less TCP session breakage due to interface IP change on either end. Adding anycast entries won't lead to a nonlinear increase in the number of RIB entries anymore, AFAIK.

    Henk is right: the hard part in Computer Science is caching and naming things. If we can name the right things, we've gone a long way toward scalability. The OS guys got it right with Virtual Memory and High-Level Languages. Imagine how badly current OSes would scale if all they had was physical memory, or how tough programming would be if all you had was machine language.

    So yeah, all in all, LISP has the first part right, but it's missing the 2nd half of the picture. It has potential, but so far is not fully developed. If that can be improved, LISP can become the new routing architecture. BGP doesn't have to be thrown out then; it can still be incorporated as the routing protocol, but it'll be much simplified. The AS, for one, is no longer needed in it, AFAIK.

    Also, a lot of the techniques we've developed over the decades are as good for the new routing architecture as they are for the old ones, so they won't be obsolete either. But complexities like NAT are no longer needed. NAT, after all, is just a flattening of hierarchy, kinda like putting all VM to physical memory mappings in one flat table instead of using a hierarchical page directory. When we have layered addressing, NAT can disappear.

    Also, TCP and IP should not have been split back in the 80s, as they're tightly coupled. Merge them back, and problems like fragmentation can be dealt with more easily. As for CC, think about why DCTCP reasonably succeeds in its environment. That's what I meant by length scale renormalization. Also, a network problem like CC is best resolved in the network layer, esp. when it's now well known that Internet traffic, or all traffic for that matter, experiences some form of fractal self-similarity.

    Replies
    1. Talking about naming and locating things, could you kindly provide your contact details such that we may connect?

    2. Hi Jeroen, Ivan has my contact, so you can grab that off him. Feel free to email me anytime :)) .

  2. In my view, LISP is not a solution for the global public Internet. In this respect I agree that it has a lot of issues. I see potential in LISP as a private overlay replacing MPLS VPN solutions with something that performs better and has built-in support for multi-link mobility. In that environment LISP has much better performance and scalability than BGP. Just ask Victor Moreno about his experiments and the results. It was measured in a large-scale test; it is not just a paper tiger. He has published the results, and they are publicly available. I agree that in some respects LISP is a kind of next-generation BGP when PubSub is used with reliable transport. It is best used in a flat logical topology like a private overlay. There it is a big advantage that there is no best path selection, so it is very fast for mobility. Actually, you can draw a lot of analogies between current LISP usage and BGP. The LISP MS/MR is analogous to the BGP RR. It has similar design considerations for availability and scalability.

  3. Lovely comment ;-) In a wider sense, basically no one should be allowed to touch networking at scale until they have read and internalized Shoch's seminal '78 Xerox PARC paper "Inter-Network Naming, Addressing and Routing" AFAIS ;-) But we are in the day and age where using addresses as routes or sub-service access points is sold as the next "architecture". AFAIS we could just as well suggest making the whole planet a flat cross-connect of wires, with the whole ball of yarn being maintained by an infinitely wise, benevolent and fast controller ;-) With the exception that it seems anybody with some smarts can inject traffic into the endpoint of any wire in such technology unless we have perfect filtering on every network edge, in terms of a cross-matrix of networks being able to talk to such address-route-service endpoints. This seems to pass as "security architecture" in that world ...

    Replies
    1. There is a solution for that too: In fully decentralized web3 style, require every packet to be added to a global blockchain (a "DLT network")

      Eliminates plausible deniability and creates a global networked economy based on personal reputation.

      Until someone figures out that this scheme doesn't scale - and so we apply it to every flow (TCP SYN packet) instead of every packet.

    2. Thanks for letting me know about "Inter-Network Naming, Addressing and Routing" (@ http://www.postel.org/ien/txt/ien19.txt for those who are looking for it)

  4. Radia Perlman's recent keynote "Do the wrong thing!" at NANOG84 is relevant in this context: https://www.youtube.com/watch?v=5D1v42nw25E
