One of the first hands-on exercises in our Networking in Public Cloud Deployments asks the attendees to automate something. They can choose the cloud provider they want to work with and the automation tool they prefer… but whatever they do has to be automated.
Most solutions include a simple CloudFormation, Azure Resource Manager, or Terraform template with a line or two in README.md, but Erik Auerswald totally astonished me with a detailed and precise writeup. Enjoy!
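To put the baseline in perspective: the core of such a submission doesn't have to be more than a few lines. Here's a minimal Python/boto3 sketch of launching a single VM on AWS (the region, AMI ID, and tag values are placeholders I made up, not anyone's actual solution):

```python
# Minimal sketch: launch one VM on AWS with boto3.
# The region and AMI ID are placeholders -- substitute your own values.
import boto3

ec2 = boto3.resource("ec2", region_name="eu-west-1")

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "automation-exercise"}],
    }],
)
print("launched", instances[0].id)
```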
Every now and then someone tries to justify the “wisdom” of migrating VMs from an on-premises data center into a public cloud (without renumbering them) with the idea of “scaling out into the public cloud” aka “cloud bursting”. My usual response: this is another vendor marketing myth that works only in PowerPoint.
To be honest, that statement is too harsh. You can easily scale your application into a public cloud assuming that:
Considering VMware’s infatuation with vMotion, the following news (reported by Salman Naqvi in a comment to my blog post) was clearly inevitable:
I was surprised to learn that LIVE vMotion is supported between on-premise and VMware on AWS Cloud
What’s more interesting: how did they manage to do it?
Unless you’re working for a cloud-only startup, you’ll always have to connect applications running in a public cloud with existing systems or databases running in a more traditional environment, or connect your users to public cloud workloads.
Public cloud providers love stable and robust solutions, and they took the same approach when implementing connectivity with legacy environments: you can use routed Ethernet connections or IPsec VPN, and run BGP across them, turning hybrid connectivity into a well-understood routing problem.
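To illustrate how little magic is involved, here's a hedged boto3 sketch of creating the AWS side of a BGP-over-IPsec connection (the on-premises ASN and public IP are made-up placeholders):

```python
# Sketch: create the AWS side of a BGP-over-IPsec site-to-site VPN.
# The on-premises ASN and endpoint IP are placeholders.
import boto3

ec2 = boto3.client("ec2")

cgw = ec2.create_customer_gateway(
    BgpAsn=65001,               # placeholder: your on-premises ASN
    PublicIp="198.51.100.10",   # placeholder: your VPN device's public IP
    Type="ipsec.1",
)
vgw = ec2.create_vpn_gateway(Type="ipsec.1")

vpn = ec2.create_vpn_connection(
    CustomerGatewayId=cgw["CustomerGateway"]["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGateway"]["VpnGatewayId"],
    Type="ipsec.1",
    Options={"StaticRoutesOnly": False},  # False = run BGP over the tunnels
)
print(vpn["VpnConnection"]["VpnConnectionId"])
```

The sketch stops short of attaching the gateway to a VPC and enabling route propagation; the point is that the whole construct boils down to a handful of well-understood routing primitives.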
Enterprise environments usually implement “mission-critical” applications by pushing high-availability requirements down the stack until they hit networking… and then blame the networking team when the whole house of cards collapses.
Most public cloud providers are not willing to play the same stupid blame-shifting game - they live or die by their reputation, and maintaining a stable service is their highest priority. They will do their best to implement a robust and resilient infrastructure, but will not do anything that could impact its stability or scalability… including the snake oil the virtualization and networking vendors love to sell to their gullible customers. When you deploy your application workloads into a public cloud, you become responsible for the resiliency of your own application, and there’s no magic button that could allow you to push the problems down the stack.
Doing the same thing and hoping for a different result is supposedly a definition of insanity… and managing public cloud deployments with an unrepeatable sequence of GUI clicks comes pretty close to it.
Engineers who mastered the art of public cloud deployments realized long ago that the only way forward is to treat infrastructure in the same way as any other source code: describe it in machine-readable files, keep those files in a version control system, and deploy them with a repeatable, automated process.
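Here's a tiny example of what that looks like in practice - a sketch assuming boto3 and configured AWS credentials, with a made-up stack name and a trivial template that would normally live in your version control system:

```python
# Sketch: deploy a version-controlled CloudFormation template via API --
# a repeatable alternative to a sequence of GUI clicks.
import boto3

TEMPLATE = """
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  DemoVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
"""

cfn = boto3.client("cloudformation")
cfn.create_stack(StackName="demo-stack", TemplateBody=TEMPLATE)
cfn.get_waiter("stack_create_complete").wait(StackName="demo-stack")
print("stack deployed")
```

Run it twice and the second run fails loudly instead of silently drifting - exactly the behavior you want from infrastructure code.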
Listening to public cloud evangelists and marketing departments of vendors selling over-the-cloud networking solutions or multi-cloud orchestration systems, you could start to believe that migrating your workload to a public cloud would solve all your problems… and if you’re gullible enough to listen to them, you’ll get the results you deserve.
Unfortunately, nothing can change the fundamental laws of physics, networking, or application architectures:
If you’re running a typical (somewhat outdated) enterprise data center, you’re using tons of VLANs and firewalls: VLANs serve as security zones, and inter-VLAN traffic is pushed through firewalls for inspection. Security vendors love that approach - when the firewalls have to inspect traffic they can add no value to (like database or backup sessions), they quickly become choke points that have to be upgraded.
Here’s an interesting tidbit from “Last Week in AWS” blog:
From a philosophical point of view, AWS fundamentally considers an API to be a promise. Services that aren’t promoted anymore are still available […] Think about that for a second - a service launched 13 years ago is still actively supported to the point where you can use it today.
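If you want to test that promise yourself, SimpleDB - launched in late 2007 and long gone from the AWS console's front page - makes a fine guinea pig. A sketch assuming boto3, configured credentials, and an account that still has SimpleDB access (the domain name is made up):

```python
# Sketch: exercise SimpleDB, an AWS service from 2007, with today's boto3.
# Assumes your account can still access SimpleDB.
import boto3

sdb = boto3.client("sdb", region_name="us-east-1")
sdb.create_domain(DomainName="api-promise-test")
print(sdb.list_domains().get("DomainNames", []))
```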
The number of layer-2 tricks we use to make enterprise networking work never ceases to amaze me - from shared IP addresses used by various clustering solutions (because it’s too hard to read the manuals and configure DNS) to shared MAC addresses used by first-hop router redundancy protocols (because it would be really hard to send a Gratuitous ARP message on failover) and all sorts of shenanigans we’re forced to engage in so that running servers can be moved willy-nilly around the Earth.
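For the record, here's how “hard” it is to send a gratuitous ARP - a Scapy sketch of the single packet involved (the interface name, IP, and MAC address are placeholders; first-hop redundancy protocols do the equivalent in their failover logic):

```python
# Sketch: broadcast a gratuitous ARP reply announcing that an IP address
# now lives behind a new MAC address. Requires root; values are placeholders.
from scapy.all import ARP, Ether, sendp

VIP = "192.0.2.10"              # the address that just moved
NEW_MAC = "02:00:de:ad:be:ef"   # its new owner
IFACE = "eth0"

garp = Ether(dst="ff:ff:ff:ff:ff:ff", src=NEW_MAC) / ARP(
    op=2,                        # ARP reply ("is-at")
    hwsrc=NEW_MAC,
    psrc=VIP,                    # sender IP == target IP => gratuitous
    hwdst="ff:ff:ff:ff:ff:ff",
    pdst=VIP,
)
sendp(garp, iface=IFACE, verbose=False)
```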
Design assignments and hands-on exercises were always a big part of ipSpace.net online courses, and our new Networking in Public Cloud Deployments course is no different.
You’ll start with a simple scenario: deploy a virtual machine running a web server. Don’t worry about your Linux skills - you’ll get the necessary (CCIE-level) instructions and the source code for the web server. Building on that, you’ll create another subnet and deploy another virtual machine acting as a back-end application server.
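To give you a feel for the scale of the exercise - this illustrative stand-in (my own sketch, not the actual course source code) is all a front-end web server needs to be:

```python
# Illustrative stand-in for the exercise's front-end web server
# (not the actual course material).
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello from the front-end VM\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```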
And then we’ll get to the fun part:
You need a Free ipSpace.net Subscription to watch the video, and a Standard ipSpace.net Subscription to register for a deeper dive into cloud security with Matthias Luft (next live session on December 10th: Identity and Access Management).
One of the responses to my Disaster Recovery Faking blog post focused on failure domains:
What is the difference between supporting L2 stretched between two pods in your DC (which everyone does for seamless vMotion), and having a 30ms link between these two pods because they happen to be in different buildings?
I hope you agree that a single broadcast domain is a single failure domain. If not, let’s agree to disagree and move on - my life is too short to argue about obvious stuff.
You’ve probably heard cloudy evangelists telling CIOs how they won’t need the infrastructure engineers once they move their workloads into a public cloud. As always, whatever sounds too good to be true usually is. Compute resources in public clouds still need to be managed, someone still needs to measure application performance, and backups won’t happen by themselves.
Even more important (for networking engineers), network requirements don’t change just because you decided to use someone else’s computers:
- Joep Piscaer will dive into what changes public clouds bring and what these changes mean for you, as well as what developers and other consumers of cloud resources expect from you in the new public cloud, DevOps and Infrastructure-as-Code world.
- Ned Bellavance will review the principles of Infrastructure as Code (IaC) and how they apply to public cloud solutions. Then he will take a look at the landscape of IaC tools that exist and examine their pros and cons.
- Howard Marks will review the types of storage available across public clouds, how they differ between cloud providers, and the applications and pitfalls associated with each of them.
- Connecting on-premises data centers or office locations to a public cloud has some unique challenges. Ed Horley will help you create a framework and a checklist to make sure you have the required redundancy, throughput, routing, and security all baked in from day one.
- Matthias Luft will cover the aspects of securing your public cloud deployments.
- Justin Warren will explain how to make good tradeoffs between resilient hardware and resilient software.
Sounds interesting? The first Networking in Public Cloud Deployments course will start on February 11th, 2020, but the minute you register you'll be able to start studying the materials (over 100 hours of content). There’s just one thing you have to do: click the Register button.