Scaling the Cloud Security Groups

Most overlay virtual networking and cloud orchestration products support security groups – more-or-less-statefulish ACLs inserted between a VM NIC and the virtual switch.

The lure of security groups is obvious: if you’re willing to change your network security paradigm, you can stop thinking in subnets and focus on specifying who can exchange what traffic (usually specified as TCP/UDP port#) with whom.

Getting rid of subnets? How?

If you’re not familiar with how security groups typically get implemented, you might wonder why I wrote that you can stop thinking in subnets. Here’s the short version of the story.

Security groups are like object groups on Cisco ASA:

  • You specify the VM-to-group membership in the cloud orchestration system;
  • The cloud orchestration system knows which IP address is assigned to which VM and can therefore translate group membership into the set of IP addresses belonging to that group;
  • When you specify group-to-group rules (for example: the Web group can communicate with the DB group on the MySQL TCP port), the cloud orchestration system (or the network controller) generates an equivalent ACL and installs it in the virtual switch (or iptables) – see the sketch below this list.
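
A minimal Python sketch of that expansion (hypothetical group names and addresses, not the data model of any particular orchestration system) shows how a single group-to-group rule turns into per-IP ACL entries:

    # Hypothetical group membership as tracked by the orchestration system
    groups = {
        "Web": ["10.0.1.10", "10.0.1.11", "10.0.1.12"],
        "DB":  ["10.0.2.20", "10.0.2.21"],
    }

    # One group-to-group rule: Web may talk to DB on MySQL (TCP 3306)
    rule = {"src": "Web", "dst": "DB", "proto": "tcp", "port": 3306}

    # Naive expansion: one permit entry per (source IP, destination IP) pair,
    # i.e. the Cartesian product of the two groups
    acl = [
        (src_ip, dst_ip, rule["proto"], rule["port"])
        for src_ip in groups[rule["src"]]
        for dst_ip in groups[rule["dst"]]
    ]

    print(len(acl))   # 3 * 2 = 6 entries for a single rule

Three web servers and two database servers already produce six ACL entries for one rule; a few hundred VMs in each group turn the same rule into tens of thousands of entries.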

If you’re considering scalability as part of your network design process, you probably immediately spotted the challenges of this approach:

  • ACL is a Cartesian product of two sets (similar to the OpenFlow 1.0 state explosion) – the length of the ACL is proportional to the product of the group sizes;
  • Most ACL implementations scan the entries sequentially (because networking engineers love to optimize irrelevant stuff and use overlapping ACL entries that make ACLs order-sensitive). ACL performance is thus inversely proportional to the product of group sizes (O(n^2) for those of you who love talking about computational complexity);
  • ACLs have to be updated on all participating virtual switches every time a VM is added to any of the groups used in the ACL.

Oh, and if you want to implement security groups on ToR switches, you’ll quickly realize just how little TCAM they have – you might be better off inserting x86 servers into the forwarding path and using something like Snabb Switch on them.

Can we make it any better?

Sure we can. Instead of blindly converting per-group security rules into IP-address ACLs, we need a better matching mechanism that works along these lines:

  • Identify the group membership of the sending VM (trivial on ingress ACL, requires IP lookup on egress ACL);
  • Identify the relevant ACL based on the group membership;
  • Identify the group membership of destination IP address (trivial on egress ACL, requires IP lookup on ingress ACL);
  • Perform ACL matches based on group membership information derived in the previous steps.

This algorithm replaces a single O(n^2) lookup with multiple simple lookups – group membership is a fixed-time lookup if your implementation uses MAC-to-group hash tables, and the time to match an ACL remains proportional to the ACL size, not to the product of group sizes.
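
Here's a minimal Python sketch of that lookup sequence (hypothetical tables and rule format, assuming ingress enforcement where the sender's MAC address identifies its group):

    # Hypothetical tables maintained by the virtual switch or the controller
    mac_to_group = {                  # fixed-time hash lookup for the sender
        "00:50:56:aa:01:01": "Web",
        "00:50:56:aa:02:01": "DB",
    }
    ip_to_group = {                   # IP lookup for the destination
        "10.0.1.10": "Web",
        "10.0.2.20": "DB",
    }
    # Rules keyed on (src group, dst group, protocol, port) instead of IP pairs
    allowed = {
        ("Web", "DB", "tcp", 3306),   # Web group -> DB group on MySQL
    }

    def permit(src_mac, dst_ip, proto, port):
        src_group = mac_to_group.get(src_mac)   # trivial on ingress ACL
        dst_group = ip_to_group.get(dst_ip)     # requires IP lookup on ingress ACL
        # Match against group-based rules -- the rule set size does not depend
        # on the number of VMs in each group
        return (src_group, dst_group, proto, port) in allowed

    print(permit("00:50:56:aa:01:01", "10.0.2.20", "tcp", 3306))   # True
    print(permit("00:50:56:aa:01:01", "10.0.2.20", "tcp", 22))     # False

Adding a VM to a group then changes a single entry in the membership table instead of triggering an ACL rewrite on every participating virtual switch.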

4 comments:

  1. We are using this on our vBlock infrastructure today. We provisioned a few large VLANs and then, within the VMware suite of tools, built a number of security groups. Some are for protected tiers (PII, PCI, Database), some are for self-contained apps with few outside dependencies. The network (within the distributed virtual switch network) is relatively simple now, and controls are done at the security group layer. It greatly simplifies provisioning.

    For tenants, we still provision a separate VLAN with an edge gateway, and let the tenant provision their own security groups.

    There have been some challenges. If the VLAN is flat and IP addresses are allocated first-come, first-served, you wind up with databases and middleware in non-contiguous IP space. That is fine, but firewalls "outside" of the PaaS stack have no way to create an "all databases" rule based on a subnet... they have to put the individual host IPs into the rule. A minor challenge. They are looking into IP pooling and other options to allow certain server categories to pull addresses from a range within the /20, but those methods are not quite there yet.

    Overall, it has worked well for us.
  2. Or you could use the VID/VNID as the group ID... :-)
  3. Apache CloudStack uses iptables in conjunction with ipsets to achieve this scalability. ipset is a fantastic package. We have real-life uses with tens of thousands of IPs in these sets, and the lookup speed is still good enough.
  4. Thank you! I learn something new every time you chime in.

    In the iptables case you were fortunate enough to be able to change the matching algorithm, keeping the complexity down to O(n·log(m)). Sometimes you're bound by hardware limitations (no way to implement ipset in TCAM) or by the information transfer protocol (no reasonably simple support for an ipset-like construct in OpenFlow).
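
    For readers unfamiliar with the idea, here's a rough Python approximation of set-based matching (illustrative only – ipset itself is a kernel construct configured through the ipset and iptables utilities):

        # A single rule references a named set of addresses; membership is
        # tested with a hash lookup instead of scanning one rule per source IP.
        db_clients = {                # stands in for an ipset of type hash:ip
            "10.0.1.10", "10.0.1.11", "10.0.1.12",
        }

        def permit_mysql(src_ip, dst_port):
            # One rule: allow TCP/3306 when the source is in the db_clients set
            return dst_port == 3306 and src_ip in db_clients

        print(permit_mysql("10.0.1.11", 3306))   # True
        print(permit_mysql("10.0.9.99", 3306))   # False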