Category: cloud
Azure Route Server: Behind the Scenes
Last week I described the challenges Azure Route Server is supposed to solve. Now let’s dive deeper into how it’s implemented and what those implementation details mean for your design.
The whole thing looks relatively simple:
Azure Route Server: The Challenge
Imagine you decided to deploy an SD-WAN (or DMVPN) network and make an Azure region one of the sites in the new network because you already deployed some workloads in that region and would like to replace the VPN connectivity you’re using today with the new shiny expensive gadget.
Everyone told you to deploy two SD-WAN instances in the public cloud virtual network to be redundant, so this is what you deploy:

Implementing Layer-2 Networks in a Public Cloud
A few weeks ago I got an excited tweet from someone working at Oracle Cloud Infrastructure: they launched full-blown layer-2 virtual networks in their public cloud to support customers migrating existing enterprise spaghetti mess into the cloud.
Let’s skip the usual does everyone using the applications now have to pay for Oracle licenses and I wonder what the lock in might be when I migrate my workloads into an Oracle cloud jokes and focus on the technical aspects of what they claim they implemented. Here’s my immediate reaction (limited to the usual 280 characters, because that’s the absolute upper limit of consumable content these days):
Impact of Azure Subnets on High Availability Designs
Now that you know all about regions and availability zones (AZ) and the ways AWS and Azure implement subnets, let’s get to the crux of the original question Daniel Dib sent me:
As I understand it, subnets in Azure span availability zones. Do you see any drawback to this? You mentioned that it’s difficult to create application swimlanes that way. But does subnet matter if your VMs are in different AZs?
It’s time I explain the concepts of application swimlanes and how they apply to availability zones in public clouds.
Virtual Networks and Subnets in AWS, Azure, and GCP
Now that we know what regions and availability zones are, let’s go back to Daniel Dib’s question:
As I understand it, subnets in Azure span availability zones. Do you see any drawback to this? Does subnet matter if your VMs are in different AZs?
Wait, what? A subnet is stretched across multiple failure domains? Didn’t Ivan claim that’s ridiculous?
TL&DR: What I claimed was that a single layer-2 network is a single failure domain. Things are a bit more complex in public clouds. Keep reading and you’ll find out why.
Availability Zones and Regions in AWS, Azure and GCP
My friend Daniel Dib sent me this interesting question:
As I understand it, subnets in Azure span availability zones. Do you see any drawback to this? Does subnet matter if your VMs are in different AZs?
I’m positive I don’t have to tell you what networks, subnets, and VRFs are, but you might not have worked with public cloud availability zones before. Before going into the details of Daniel’s question (and it will take us three blog posts to get to the end), let’s introduce regions and availability zones (you’ll find more details in AWS Networking and Azure Networking webinars).
Deploying Advanced AWS Networking Features
Miha Markočič created sample automation scripts (mostly Terraform configuration files + AWS CLI commands where needed) deploying these features described in AWS Networking webinar:
- IP multicast deployment (video)
- Web Application Firewall deployment (video)
- Network Load Balancer deployment (video)
- Inter-region VPC Peering deployment (video)
To recreate them, clone the GitHub repository and follow the instructions.
Worth Reading: Cloud Complexity Lies
Anyone who spent some time reading cloud providers’ documentation instead of watching slide decks or vendor keynotes knows that setting up infrastructure in a public cloud is not much simpler than doing it on-premises. You will outsource hardware management (installations, upgrades, replacements…) and might deal with an orchestration system provisioning services instead of configuring individual devices, but you still have to make the same decisions, and take the same set of responsibilities.
Obviously that doesn’t look good in a vendor slide deck, so don’t expect them to tell you the gory details (and when they start talking about the power of declarative API you know you have a winner)… but every now and then someone decides to point out the state of emperor’s clothes, this time Gerben Wierda in his The many lies about reducing complexity part 2: Cloud.
For public cloud networking details, check out our cloud webinars and online course.
Podcast: IPv6 in the Cloud
In December 2020 Ed Horley invited me to a chat about IPv6 in the public cloud. While I usually don’t want to think about a protocol that’s old enough to buy its own beer in US, we nonetheless had interesting discussions (including the need for frequent RA messages in AWS VPC).
Why Is Public Cloud Networking So Different?
A while ago (eons before AWS introduced Gateway Load Balancer) I discussed the intricacies of AWS and Azure networking with a very smart engineer working for a security appliance vendor, and he said something along the lines of “it shows these things were designed by software developers – they have no idea how networks should work.”
In reality, at least some aspects of public cloud networking come closer to the original ideas of how IP and data-link layers should fit together than today’s flat earth theories, so he probably wanted to say “they make it so hard for me to insert my virtual appliance into their network.”
Renumbering Public Cloud Address Space
Got this question from one of the networking engineers “blessed” with rampant clueless-rush-to-the-cloud.
I plan to peer multiple VNet from different regions. The problem is that there is not any consistent deployment in regards to the private IP subnets used on each VNet to the point I found several of them using public IP blocks as private IP ranges. As far as I recall, in Azure we can’t re-ip the VNets as the resource will be deleted so I don’t see any other option than use NAT from offending VNet subnets to use my internal RFC1918 IPv4 range. Do you have a better idea?
The way I understand Azure, while you COULD have any address range configured as VNet CIDR block, you MUST have non-overlapping address ranges for VNet peering.
Don't Lift-and-Shift Your Enterprise Spaghetti into a Public Cloud
Jon Kadis spent most of his life working on enterprise networks, and sadly found out that even changing jobs and moving into a public cloud environment can’t save you from people trying to lift-and-shift enterprise IT kludges into a greenfield environment.
Here’s what he sent me:
Worth Reading: The Shared Irresponsibility Model in the Cloud
A long while ago I wrote a blog post along the lines of “it’s ridiculous to allow developers to deploy directly to a public cloud while burdening them with all sorts of crazy barriers when deploying to an on-premises infrastructure,” effectively arguing for self-service approach to on-premises deployments.
Not surprisingly, the reality is grimmer than I expected (I’m appalled at how optimistic my predictions are even though I always come across as a die-hard grumpy pessimist), as explained in The Shared Irresponsibility Model in the Cloud by Dan Hubbard.
For more technical details, watch cloud-focused ipSpace.net webinars, in particular the Cloud Security one.
Podcast: State of Multi-Cloud Networking
In mid-September Ethan Banks invited me to chat about multi-cloud networking in the Day Two Cloud podcast. It was just a few weeks after Corey Quinn published a fantastic Multi-Cloud is the Worst Practice rant, which perfectly matched my observations, so I came well prepared ;)
Automation Win: Recreating Cisco ACI Tenants in Public Cloud
This blog post was initially sent to the subscribers of our SDN and Network Automation mailing list. Subscribe here.
Most automation projects are gradual improvements of existing manual processes, but every now and then the stars align and you get a perfect storm, like what Adrian Giacommetti encountered during one of his automation projects.
The customer had well-defined security policies implemented in Cisco ACI environment with tenants, endpoint groups, and contracts. They wanted to recreate those tenants in a public cloud, but it took way too long as the only migration tool they had was an engineer chasing GUI screens on both platforms.