OMG, Who Will Manage All Those Virtual Firewalls?

Every time I talk about small (per-application) virtual appliances, someone inevitably cries “And who will manage thousands of appliances?” Guess what – I’ve heard similar cries from the mainframe engineers when we started introducing Windows and Unix servers. In the meantime, some sysadmins manage more than 10,000 servers, and we’re still discussing the “benefits” of humongous monolithic firewalls.

It won’t be easy

You’ll obviously have to change your processes and tools as you go from configuring uncountable firewall rules on a pair of gigantic pets to deploying a large herd of small appliances – here are a few tips that might help you.

Standardize. There might be thousands of applications in a typical large enterprise, but I’m positive they aren’t as unique as their developers think. Identify typical patterns (example: web application using external SQL Server) and standardize the network services needed by the application classes.

Simplify. Traditional firewall and load balancer rules/configurations are complex because we force a single box to manage hundreds or thousands of “unique” endpoints, each one identified by its IP address. In an appliance-per-application world the rules become much simpler, for example:

  • Outside IP address is load balanced across all hosts in the web segment;
  • Everyone can access port 80 on any host in the web segment;
  • Every host in the web segment can access MySQL port on any host in the database segment and Memcached port on any host in the caching segment.

Simple, easy to understand, audit and manage.
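
As a rough illustration, the access rules above could be written down as data keyed by segment name instead of by IP address. This is only a sketch: the segment names, the Rule class, and the is_allowed helper are made up for this example, the port numbers are the usual MySQL and Memcached defaults, and the load-balancing rule is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src: str   # source segment name, or "any" for everyone
    dst: str   # destination segment name
    port: int  # destination TCP port


# Hypothetical rule set for one web application.
RULES = [
    Rule("any", "web", 80),          # everyone -> web segment, HTTP
    Rule("web", "database", 3306),   # web -> database segment, MySQL default port
    Rule("web", "cache", 11211),     # web -> caching segment, Memcached default port
]


def is_allowed(src_segment, dst_segment, port, rules=RULES):
    """Return True if any rule permits the flow; everything else is denied."""
    return any(
        r.dst == dst_segment and r.port == port and r.src in ("any", src_segment)
        for r in rules
    )
```

Three lines of policy per application are something an auditor can read in one sitting, which is the whole point of the exercise.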

Templatize. Once you have simple rules for standard application classes, generate configuration templates or golden images. Every time the requirements change, change the template, test it, and deploy hundreds of new VMs instead of manually changing firewall rules for every application/host.
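
A minimal sketch of the template idea, assuming iptables-style rule lines purely for illustration (your appliance will use its own configuration syntax). The WEB_APP_CLASS list, the render function, and the address prefixes are hypothetical.

```python
from string import Template

# One line of configuration per rule; the syntax is an illustrative assumption.
RULE_TEMPLATE = Template("-A FORWARD -s $src -d $dst -p tcp --dport $port -j ACCEPT")

# Template for the "web application" class: (source segment, destination segment, port).
WEB_APP_CLASS = [
    ("any", "web", 80),
    ("web", "database", 3306),
    ("web", "cache", 11211),
]


def render(class_rules, segments):
    """Expand a class template into concrete rules for one application instance."""
    prefixes = {"any": "0.0.0.0/0", **segments}
    lines = [
        RULE_TEMPLATE.substitute(src=prefixes[src], dst=prefixes[dst], port=port)
        for src, dst, port in class_rules
    ]
    lines.append("-A FORWARD -j DROP")  # deny whatever the template did not allow
    return lines


# Hypothetical address plan for a single application.
app1 = render(
    WEB_APP_CLASS,
    {"web": "10.1.1.0/24", "database": "10.1.2.0/24", "cache": "10.1.3.0/24"},
)
for line in app1:
    print(line)
```

When the requirements change, you edit WEB_APP_CLASS once and re-render the configuration for every application in that class; only the per-application address plan differs.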

Automate. Manual processes never scale, as craftsmen of all trades discovered throughout history. If you want to roll out thousands of appliances, you have to automate their deployment, change management and monitoring.

The good news: all recent virtual appliances have an API that you can use to automate them. The bad news: someone will have to learn how to use that API and write scripts.
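
The scripts don't have to be complicated. Here is a sketch of an idempotent "make the appliance match the desired rule set" loop. The ApplianceAPI interface and its three methods are hypothetical (every real appliance API looks different), and InMemoryAppliance exists only so the loop can be exercised without a real device.

```python
from typing import Iterable, Protocol, Set, Tuple


class ApplianceAPI(Protocol):
    """Hypothetical interface; substitute the calls your appliance really offers."""

    def get_rules(self) -> Set[str]: ...
    def add_rule(self, rule: str) -> None: ...
    def delete_rule(self, rule: str) -> None: ...


def reconcile(appliance: ApplianceAPI, desired: Iterable[str]) -> Tuple[int, int]:
    """Make the appliance match the desired rules; return (added, removed)."""
    desired_set = set(desired)
    current = appliance.get_rules()
    to_add = desired_set - current
    to_remove = current - desired_set
    for rule in sorted(to_add):
        appliance.add_rule(rule)
    for rule in sorted(to_remove):
        appliance.delete_rule(rule)
    return len(to_add), len(to_remove)


class InMemoryAppliance:
    """Stand-in for a real device, used only to try out the loop."""

    def __init__(self, rules=()):
        self._rules = set(rules)

    def get_rules(self):
        return set(self._rules)

    def add_rule(self, rule):
        self._rules.add(rule)

    def delete_rule(self, rule):
        self._rules.discard(rule)
```

Running reconcile twice with the same input changes nothing the second time, which is what makes it safe to run against hundreds of instances. The sketch treats rules as an unordered set; if rule order matters on your appliance, you would have to compare ordered lists instead.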

Delegate. Once you’ve done your homework, identified typical application patterns, created simple rules, and prepared virtual appliance templates that can be automatically deployed from a central catalog, you have to let go. Application teams should take ownership of individual virtual appliance instances – these instances become just another VM in the application stack.

Start now

It’s impossible to get from the rigid environment of oversized physical appliances to the virtual appliance nirvana in a single giant leap, but there’s nothing preventing you from taking the first steps.

Don’t try to fix the existing nightmare – some of it can be migrated to the virtual appliances world once the new concepts have been proven, and some applications will simply have to die before you can get rid of the old world. Focus on the new applications that are in the design stage.

Don’t preach the new ideas to everyone you bump into. Identify the most flexible application development team in your organization and start working with them – once everyone else sees the benefits of the new approach, they just might decide to join you.

Need help?

If you need help designing your next-generation private cloud, get in touch. You’ll also find plenty of details in my virtualization webinars.

All webinars are included with the yearly subscription.

13 comments:

  1. The keyword here is "simplify." We must define, clearly, what base firewall and load-balancer functionality means, then define the data (configuration and state) used to manage the firewalls. Not only will this simplify mgmt, but it lends itself to automatability. Or whatever the right word there is.
  2. Spot on, Ivan. To me every technology and idea in networking today is most likely a transitional technology. Yes, today in the firewall world this is tough to architect, but I see the light at the end of the tunnel. It is just a long tunnel.
  3. www.dome9.com solves that problem
  4. The "simplify" idea is really the key here, the only little problem is that it is in direct contradiction to the security/audit team requirements. It is them who want to have the fw rules as tight as possible. For example it is impossible for them to have one simple rule to "allow all web frontends to talk to all sql backends"
    You might say the exposure will be smaller with small per-application appliances, which is true, but still not enough for security/audit team. Anything looser is unacceptable for them, so convincing them otherwise will be hard.
    Replies
    1. I'm glad you mentioned this as I was about to say the same thing. I'm curious to know how Ivan reconciles the "any web server to any SQL server" approach with the "per service tenant" approach. Perhaps he is talking in this context, in which case 'all web servers' means 'all web servers for that particular service' and not 'all web servers in your domain'?
    2. Is my English really that bad ... or is the idea so outlandish? I wrote "in an appliance-per-application world the rules become simpler". Of course I meant "all web servers of a particular application"
    3. My apologies, Ivan, I obviously overlooked that critical statement. Having said that, most security auditors will question why web server X can access SQL server Y if they don't explicitly require it, as they are usually single-mindedly focused on reducing the attack surface area, regardless of other mitigations or other pragmatic considerations.
    4. Apologies for typos, editing on mobile devices isn't great. That should read "if they don't explicitly require it".
  5. 'Simplify' while deploying hundreds of firewalls.

    I can't imagine any security audit is going to allow an 'any any' rule through their gateways.

    And 'let your application team manage the firewall'. mmmmhmmm.
    Replies
    1. I don't think this will require an 'any any' rule, just a specific well defined group to be allowed access to another specific well defined group, albeit not everything in the first group would need access to everything in the second.

      Despite my comments above, I am very much in favour of an approach along these lines. What we are looking at is a step towards this goal which is not to have per-service firewalls (at least not yet) but to have per-environment firewalls and in some cases, several pairs of firewalls per environment. This is a half way house between monolithic firewalls and per-service firewalls. We are also using templated rules which get applied to servers when needed, rather than a blanket allow rule between web tier and app tier etc.

      One of the issues with per-service firewalls (and this is purely a management/housekeeping issue, rather than a technical one) stems from the difficulty of getting people to agree to decommission systems. Service owners like to keep systems running 'just in case' or because they have re-used them for some other unrelated task. This behaviour often contributes to VM sprawl and is likely to lead to firewall sprawl too, with its associated licensing and management overhead. But as I say, this is a management issue and if you can get to grips with that, then it is no longer a barrier.

      The bigger barrier is the security auditor wanting to reduce the surface area of attack and consequently putting the brakes on such a per-service firewall deployment with broad rules for comms between server types. Again, I have tried to address this by making the case that if a single server has been compromised, you have to assume the whole service in that environment has been compromised. With such a perspective, it doesn't matter that the compromised web server can access an app server it doesn't strictly need to, because the sysadmins' focus should be on detecting the incident and locking it down whilst relying on other multi-layer security measures to mitigate the impact in the meantime. Obviously, any critical service would have redundancy/resilience built in and so you would keep the service alive through your secondary system during the lockdown of your primary.

      In conclusion, I think Ivan's point is valid, it just requires all parts of your team (including auditors) to take a similar pragmatic approach.
  6. In IAAS systems such as AWS and Apache CloudStack, security groups offer scalable isolation like you mention. Instead of virtual firewalls, we can use the hypervisor's firewall capabilities (iptables in Xen/KVM). Apache CloudStack can comfortably manage tens of thousands of firewalls and hundreds of thousands of firewall rules (see http://www.slideshare.net/chiradeep_v/scalable-networking-in-apache-cloudstack slides 32-41)
  7. I feel that security thinking does need to evolve along with the technology. The business is pushing for self-service, automation, and instant provisioning - and in most cases there is a business driver behind it - so it's becoming a requirement, not a nice-to-have.

    I've been involved in some tire-kicking on both public and private cloud for an enterprise. Inevitably, security is running around behind the network architecture trying to plug holes. Yes, the developers are getting the instant provisioning and shorter delivery times to the market/customer. Then an auditor comes in and the security group has to clean up. When they are done, some people are frustrated because now they have to follow a process to allow new connectivity whereas it used to "just work".

    Again, I think Ivan's previous suggestion of making every app a tenant is good. Multiple virtual firewalls as mini point solutions can also be good. But here are some real challenges:

    1. Some developers don't know what all of the dependencies of their app are. I can't tell you how many times they build an app, request the firewall exceptions, and spend a week amending them because they forgot that this app calls some other middleware piece.
    2. Virtualized appliances are great, but they still are constrained to Layers 1-7, and in many cases, still constrained to Layers 1-4. If we are dealing with large, flat networks or even huge IPv6 networks, you will have databases, apps, web and other things landing all over the IP space. Yes, you can put things in security groups or whatever your orchestrator uses to define security policies, but a lot of appliances don't recognize security groups. They still look for this IP talking on this port, etc... If the security groups could be tagged and if the appliances understood these tags, you could land databases wherever, and have the correct policies applied automatically. For most things
    3. Legacy. Some of these new, whizbang environments will contain servers that talk to servers in old, tired legacy environments. We see them autoprovisioning in the private/hybrid cloud quickly and automatically... then they need it to pull from a data warehouse somewhere and now it's back to the security request.

    One more time, I think this is the right direction but I sincerely think there need to be some technological breakthroughs before we are there.

    CWB
  8. Guys, as everyone knows, money/profitability always comes first. And people only think about security when there has been a huge breach (and even then they think about it temporarily and then go back to how things were before).

    'let your app team (or sometimes end user) manage their own firewall' is the principle behind all cloud offerings, which is why they are the Largest 'Hack Me' networks of all time.