Migrating ipSpace.net Infrastructure to AWS

I’m too stupid to unwind and relax over the summer - there’s always some janitorial task to be done, and I simply cannot leave it alone. This summer, I decided to migrate our server infrastructure to AWS.

TL&DR: It went more smoothly than I expected (figuring out how AWS virtual networks, public IP addresses, and security groups work while creating the AWS Networking webinar definitely helped), but it also took way longer than I’d planned.

Automate Everything

AWS has nice migration tools. In theory you could:

  • Create your environment with the AWS GUI;
  • Migrate your VM image to AWS;
  • Start your VM and go for a beer.

For whatever reason, I decided to do “the right thing”: create the AWS environment with a repeatable (or idempotent, if you’re trying to win REST API bingo) script, and rebuild the server from scratch using Ansible playbooks.
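To give you an idea of what that looks like in practice, here’s a minimal sketch of the environment-creation part of such a script (region, names, and CIDR ranges are hypothetical, and the real script also has to deal with route tables, security groups, key pairs, and the VM itself):

```bash
#!/bin/bash
# Minimal sketch: create a VPC, a subnet, and an Internet gateway
# with AWS CLI (region, names, and CIDR ranges are hypothetical)
export AWS_DEFAULT_REGION=us-east-1

# Create the VPC and capture its ID for use in subsequent commands
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
  --query 'Vpc.VpcId' --output text)
aws ec2 create-tags --resources "$VPC_ID" --tags Key=Name,Value=ipspace-www

# Add a subnet for the server...
SUBNET_ID=$(aws ec2 create-subnet --vpc-id "$VPC_ID" \
  --cidr-block 10.0.1.0/24 --query 'Subnet.SubnetId' --output text)

# ...and an Internet gateway so it's reachable from the outside world
IGW_ID=$(aws ec2 create-internet-gateway \
  --query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --vpc-id "$VPC_ID" \
  --internet-gateway-id "$IGW_ID"
```

Run the script twice, though, and you get two copies of everything - more about that particular can of worms in the Lessons Learned section.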

It took me weeks to get the job done (in parallel with a gazillion other things I have to do), but it was worth it - now I have tested recipes I can use to recreate the whole infrastructure from scratch, and I’ve already used the server-provisioning playbooks to create a development copy of the server on my laptop.

Automation Win: the development server is created with the same playbooks as the production server and is thus guaranteed to have the exact same software environment. Once I’ve tested the changes on the development server, it’s reasonably safe to deploy them in production.

Keep in mind: if you want to keep the two environments in sync, you should never ever install a package or a Python/Perl/PHP/whatever module manually. The only way to get new third-party software into the development environment (and later into production) is by modifying and executing Ansible playbooks.
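For example, instead of running pip install on the server, the new module goes into a playbook, and the playbook is executed against both environments. A hypothetical workflow (inventory and playbook names are made up):

```bash
# After adding the new package or module to the provisioning playbook,
# test the change on the development server first...
ansible-playbook -i inventory/development site.yml
# ...and only then roll it out to production
ansible-playbook -i inventory/production site.yml
```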

Lessons Learned

Ansible is awfully slow when used against a remote server. I decided to deploy the server in the US East region (pricing might have something to do with that), resulting in a bit over 100 msec RTT… and doing even the simplest tasks (like collecting server facts) with an Ansible playbook took forever. It would have been better to copy the playbooks to the server as the first provisioning step and execute them locally.
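Something along these lines would have done the trick - copy the playbooks to the server once, then run them with a local connection (host name and paths are hypothetical):

```bash
# One bulk file transfer instead of hundreds of per-task SSH round trips
rsync -az playbooks/ ec2-user@www.example.com:playbooks/

# Run the playbook on the server itself; "-i localhost," defines an
# inline one-host inventory, and the local connection skips SSH entirely
ssh ec2-user@www.example.com \
  'cd playbooks && ansible-playbook -i localhost, --connection local site.yml'
```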

AWS CLI is rudimentary, at least when compared to Azure CLI (the one thing I really like about Azure). As I explained in an infrastructure-as-code blog post, a REST API can quickly turn into CRUD hell, requiring an adaptation layer on top of it if you want to think in terms of desired end-state.

That adaptation layer is built into Azure CLI (more about that in an upcoming post) but is totally missing from AWS CLI. I should have realized that and used a dedicated infrastructure-as-code tool like Terraform, but I was too lazy for that and managed to wing it with Bash scripts. Never again.
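To illustrate the CRUD hell: AWS CLI happily creates a second copy of whatever you ask for, so every object in my scripts needed its own check-then-create wrapper - a poor man’s version of what a desired-state tool gives you for free. A hypothetical example:

```bash
# Look up a VPC by its Name tag; AWS CLI prints "None" when nothing matches
get_vpc_id() {
  aws ec2 describe-vpcs \
    --filters "Name=tag:Name,Values=$1" \
    --query 'Vpcs[0].VpcId' --output text
}

# Create the VPC only if it doesn't exist yet - the adaptation layer
# you have to reinvent for every single object type
VPC_ID=$(get_vpc_id ipspace-www)
if [ "$VPC_ID" = "None" ]; then
  VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
    --query 'Vpc.VpcId' --output text)
  aws ec2 create-tags --resources "$VPC_ID" --tags Key=Name,Value=ipspace-www
fi
```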

8 comments:

  1. You're right, Ansible is slow, particularly over high-latency connections, but maybe you can speed up your deployment with https://github.com/dw/mitogen .

    Replies
    1. Thanks a million! This one looks really interesting!
  2. I think you really missed a chance here. You should use both Terraform and Ansible: Terraform for infrastructure deployment (declarative) and Ansible for configuring your VMs (imperative). You'd get the most out of both, and you could treat your Terraform configuration files as code, with all the added benefits, like version control.
    Replies
    1. Yeah, I should replace my AWS CLI Bash scripts with Terraform. Lesson learned ;)
  3. Also, Ivan, you might want to migrate from AWS to GCP now!
    Replies
    1. Yeah, this https://killedbygoogle.com/ really inspires a lot of confidence...
  4. Better yet, use immutable deployments and deploy AMIs as artifacts, instead of deploying code onto an existing set of EC2 instances.

    Major win, and you can drop Ansible altogether, since you don't need Ansible to deploy to a single server.

    Major major win: serverless all the things and then you don't need anything beyond SAM. :)
    Replies
    1. While I agree that it would be great to use immutable deployments, it's probably overkill for my single-VM environment (I don't need anything more sophisticated for the current workload). Also, rebuilding and restarting the VM every time I deploy new code is a bit tedious, not to mention that I'd have to add a back-end database and a front-end load balancer to get reasonable availability.

      As for "to Ansible or not to Ansible" - yeah, I could do everything with Bash scripts as most things Ansible modules use (like yum, pip, cpan...) are idempotent anyway, but when you have a shiny new hammer lying on a bench nearby...

      Finally, I'd have to look into Lambda details, but I have a nasty feeling that moving to serverless would trigger a major code rewrite, making it more hassle than it's worth. For new stuff, however...