You might think that the lack of a decent session layer in the TCP/IP protocol suite is the main culprit behind our reliance on IP multihoming and the related explosion of the IP routing tables. Unfortunately, we have an even bigger problem: the Berkeley Socket API, which is over 25 years old and used in almost all TCP/IP software implementations (including high-level scripting languages like Perl).
Update 2016-07-08: gethostbyname() is obsolete. Also added a reference to Happy Eyeballs, which became popular after this blog post was written.
To establish a client-to-server connection using the Socket API, you have to perform these calls:
- Create a socket with the socket() call.
- Convert a hostname into an L3 address (IPv4 or IPv6) with the getaddrinfo() or (obsolete) gethostbyname() call.
- Connect to the remote L3 address with the connect() call.
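The three calls above can be sketched in a few lines of Python, whose socket module is a thin wrapper around the same API. This is a minimal illustration, not production code: naive_connect is a hypothetical name, and the resolution step comes first only because getaddrinfo() tells us which address family the socket needs.

```python
import socket

def naive_connect(hostname, port):
    """Resolve a name, then connect to the FIRST returned address only."""
    # Convert the hostname into a list of candidate L3 addresses.
    candidates = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    family, socktype, proto, _canonname, sockaddr = candidates[0]

    # Create a socket matching the address family of the chosen address.
    sock = socket.socket(family, socktype, proto)
    try:
        # Connect to the L3 address; the hostname is long forgotten by now.
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock
```

Note that everything after the first line of the function deals with raw addresses; if candidates[0] happens to be unreachable, the call fails even when the other candidates would have worked.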
The set of calls you have to perform is not surprising; the Socket API is older than DNS. However, the reliance on L3 addresses passed around inside the application and the total disconnect between name resolution and session establishment are a disaster.
Just to give you an example: you might have a server farm offering a service (for example, scs.msg.yahoo.com or www.X.google.com) properly set up in DNS with numerous A records for the same name. However, most applications call getaddrinfo(), which returns the list of addresses regardless of whether they are reachable, and then usually pass the first address (or all of them in sequence) to the connect() call; Happy Eyeballs implementations are an obvious exception. If the DNS lookup returned a temporarily unreachable IP address and the application tries nothing else, you’re doomed.
When properly implemented, the getaddrinfo() call could return more than one address associated with the hostname … but that’s not always the case.
Obviously you could write better application code. You could make DNS calls yourself using the resolver library (or parse the information returned by getaddrinfo()), collect all IP addresses and try to connect to more than one of them. Telnet clients usually do that quite well.
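As a rough sketch of the "try more than one address" approach, the following Python function walks the whole list returned by getaddrinfo() and stops at the first address that accepts the connection. The name connect_any and the three-second default timeout are assumptions made for this example.

```python
import socket

def connect_any(hostname, port, timeout=3.0):
    """Try every address returned for the name until one connects."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
            hostname, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)  # do not wait minutes for a dead address
        try:
            sock.connect(sockaddr)
            return sock           # first reachable address wins
        except OSError as exc:
            last_error = exc      # remember the failure, try the next one
            sock.close()
    raise last_error or OSError(f"no addresses found for {hostname}")
```

The addresses are tried strictly in sequence, so a few unreachable addresses at the top of the list still cost one timeout each; Happy Eyeballs implementations reduce that delay by racing connection attempts.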
You could even implement a connection-failure cache listing those addresses that were recently unreachable to speed up future session setups. But let’s be realistic: how many application programmers do you know who really understand the intricacies of TCP/IP (let’s lower the bar: how many of them could use the resolver library)? Most of them want to get their job done and end up using recipes from sources like Network Programming with Perl.
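A connection-failure cache does not have to be complicated. The sketch below is one possible shape, assuming a hypothetical FailureCache class with a 30-second default lifetime: it remembers when an address last failed and moves recently failed addresses to the end of the candidate list instead of dropping them.

```python
import time

class FailureCache:
    """Remembers addresses that failed recently, for a limited time."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl          # seconds an address stays suspect
        self.clock = clock      # injectable clock, handy for testing
        self._failed_at = {}    # sockaddr -> time of the last failure

    def mark_failed(self, sockaddr):
        self._failed_at[sockaddr] = self.clock()

    def is_suspect(self, sockaddr):
        failed = self._failed_at.get(sockaddr)
        if failed is None:
            return False
        if self.clock() - failed >= self.ttl:
            del self._failed_at[sockaddr]   # entry expired, forget it
            return False
        return True

    def order(self, sockaddrs):
        """Healthy addresses first, recently failed ones last."""
        # sorted() is stable, so the resolver's order is kept in each group.
        return sorted(sockaddrs, key=self.is_suspect)
```

An application would call mark_failed() whenever connect() fails and pass the resolved addresses through order() before the next attempt. Suspect addresses are kept at the end of the list so that a service whose addresses all failed recently remains reachable once it recovers.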
It looks like people writing Yahoo Messenger knew what they were doing; otherwise it wouldn’t make sense to have numerous A records for their IM servers.
The name-to-address mapping problem should have been abstracted into the OS kernel (or system library) decades ago (at the latest when DNS became widespread) and the applications should have been kept blissfully unaware of the complexities; the connect() call should accept a hostname and do the rest behind the scenes. Even Microsoft got that right with the NetBIOS API. But then, what could you expect: the Socket API is a direct mapping to the TCP/IP protocol stack (where DNS is just one of the applications).
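Some language runtimes have added this kind of wrapper in their libraries, even though connect() itself remains unchanged. For example, Python's standard library has socket.create_connection(), which accepts a (hostname, port) pair, resolves the name, and tries the returned addresses in turn. The open_session name and the five-second timeout below are assumptions made for this sketch.

```python
import socket

def open_session(hostname, port, timeout=5.0):
    """Connect by name; the library hides resolution and address retries."""
    # create_connection() calls getaddrinfo() internally and attempts each
    # returned address until one succeeds or all of them have failed.
    return socket.create_connection((hostname, port), timeout=timeout)
```

The application never sees an L3 address unless it asks for one, but this works only for programs that use the wrapper; code calling connect() directly still has to handle the address list on its own.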
With the sorry state of the Socket API, the best you can do if your service is reachable through multiple IP addresses is to randomize the DNS responses (this will give you some limited load sharing), adjust the list of A records in the DNS responses based on server availability (while hoping that the intermediate DNS servers or the clients will not ignore the TTL settings in the DNS responses) … and, as a last resort, make sure all the IP addresses are always reachable, which brings us back to where we started: IP multihoming. You could also use a load balancer and a single (obviously multihomed) IP address.