Automation isn't just the future for data centers and virtual environments, but for our society too. Competing with countries that can offer substantially lower labor costs means that automation is the primary business innovation at all levels and in all sectors. IT is no different, and the automation of our data centers has only just begun.
Software-defined networking (SDN), API-driven infrastructure and new protocols like Redfish are changing how we manage our data centers. The days of administrators babying individual applications on individual servers are numbered.
This is not a new topic within the IT industry. A look at VMware's acquisitions and innovations over the past few years should be enough to convince even the most hardened skeptic that the "software-defined anything" movement -- with the end goal of Infrastructure as Code -- is very, very real.
VMware has been building hybrid cloud offerings and self-service computing products, and has even created its own OpenStack variant. It bought Nicira in order to get hold of NSX, built VSAN and VVOLs, and implemented policy-based virtual storage backed by role-based administration.
"Software-defined anything" is a very real part of today's virtual landscape. So frequently is this discussed that standardized terminology has emerged to describe varying levels of automation in the data center.
A traditional data center where each server is cared for individually with very little automation is typically referred to as a "pets" data center. Each server is a pet, loved and cared for as you would a house pet.
A data center with high levels of virtualization, and of operating system and application automation, is typically referred to as a "cattle" data center. Servers no longer have names and aren't loved individually; they have numbers. They are created in an automated fashion and destroyed similarly.
The next step of data center automation is an "ants" data center. Here applications are most likely located in containers instead of VMs. Application data is cleanly separated from the application itself. Every aspect of the infrastructure, from networking to storage, is scripted, monitored, automated and orchestrated. There are very, very few companies in the world that have graduated from cattle to ants.
Beyond ants, however, is yet another level of data center automation. This doesn't have a name yet, but I am going to refer to it as "bacteria." Like bacteria, this level of automation takes the environment into account. Sensors detect environmental changes and these can lead to the migration of workloads from one side of a data center to the other in order to handle thermal excursions.
Like bacteria, which changed the face of the Earth by creating our oxygen-rich atmosphere, a bacteria-class data center can alter the environment it occupies. For example, HVAC systems are pre-emptively triggered to handle scheduled workloads. Components for failing systems are automatically ordered so that they are ready when algorithms predict node failure. Workloads are moved around the world to get data and processing closer to the users who consume it.
Bacteria-class data centers are also about more than simply automating infrastructure. They would include chaos-monkey-like systems to constantly cause problems and force the developers to adapt. They would include automated A/B testing and more. There are factories for not only the delivery of applications, but the production of good applications and the continual refinement of best practices.
Google may well be the only data center operator to have reached bacteria-class. Netflix and Facebook are close contenders, and Netflix doesn't even run all of its own physical infrastructure.
Getting there from here: SDN in the real world
When you consider just how thoroughly automated a bacteria-class data center like Google's is, it can be daunting to think that you might be asked to turn what you operate into something resembling it. But whereas Google had to invent much of this technology itself, the tools to reach that level of data center automation are increasingly available off the shelf.
Consider SDN for a moment. The point of SDN is to separate the control plane from the data plane. This is a fancy way of saying that switch configuration is controlled from a central location instead of having to log into each switch and tell it what to do.
In a full-bore SDN setup, the controller software does a lot of calculating to ensure that data can get where it needs to go as fast as it needs using the switches provided and the wiring available. There is a lot of intelligence in a full SDN network, but that doesn't mean that SDN is an all or nothing proposition.
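To make the control plane/data plane split concrete, here is a minimal sketch in Python. It is a toy model, not how any real controller works: the Switch and Controller classes and their methods are invented for illustration, and a real controller would talk to hardware over a protocol rather than call a method. The point is only that intent lives in one place and is pushed out to switches that have drifted from it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Switch:
    """Data plane only: forwards according to whatever table it was handed."""
    name: str
    vlan_by_port: dict[int, int] = field(default_factory=dict)

    def apply(self, vlan_by_port: dict[int, int]) -> None:
        # Store a copy so later edits to the controller's intent
        # do not silently change the switch.
        self.vlan_by_port = dict(vlan_by_port)


@dataclass
class Controller:
    """Control plane: holds the intended state for every switch in one place."""
    switches: dict[str, Switch] = field(default_factory=dict)
    intent: dict[str, dict[int, int]] = field(default_factory=dict)

    def register(self, switch: Switch) -> None:
        self.switches[switch.name] = switch
        self.intent.setdefault(switch.name, {})

    def set_port_vlan(self, switch_name: str, port: int, vlan: int) -> None:
        # Changing intent touches no hardware; nothing happens until push().
        self.intent[switch_name][port] = vlan

    def push(self) -> list[str]:
        """Send intent to every switch that differs from it; return their names."""
        changed = []
        for name, switch in self.switches.items():
            if switch.vlan_by_port != self.intent[name]:
                switch.apply(self.intent[name])
                changed.append(name)
        return changed
```

Nobody logs into a switch here. An administrator or an application edits the controller's intent, and a single push() reconciles every device that no longer matches it.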
Consider the two modern Supermicro switches in my lab. The first is an SSE-X3648S (48x SFP+ 10GbE/6x QSFP 40GbE) Cumulus Linux Ready Switch. The second is an SSE-X3348T (48x 10GBASE-T 10GbE/4x QSFP 40GbE).
The SSE-X3648S, being a Cumulus switch, is a "proper" SDN switch. There are many ways to interact with it, and it responds to change very quickly. As there is a proper API, integrating it with various applications is simple. Indeed, Cumulus offers pre-canned integration with a growing number of applications ranging from Dropbox to SAP.
Automating the SSE-X3348T, which is not a Cumulus switch, is a much more old-school affair. It can be addressed via a Web interface or through a very Cisco IOS-like command line interface (CLI). For data center automation one would use the CLI. This requires writing scripts to communicate with the switch and order it to do things.
Ultimately, the scripts to access these switches become a piece of middleware in their own right. You end up building an API for your applications on one end and a widget to connect to the switches and order them about on the other. The result is that the SSE-X3348T is clunkier than the SSE-X3648S, and response time for changes is slower.
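As a rough sketch of that middleware, the Python below puts an application-facing method on one end and a CLI command generator on the other. Everything named here is an assumption for illustration: the command syntax is modelled on Cisco IOS rather than taken from the SSE-X3348T manual, the port name used in the usage note is a placeholder, and the transport (SSH, serial or otherwise) is deliberately left as a callable you supply.

```python
from typing import Callable, Iterable, List


def vlan_commands(port: str, vlan: int) -> List[str]:
    """Render IOS-style lines that put one access port into one VLAN.

    The syntax is an assumption modelled on Cisco IOS; check the switch's
    own CLI reference before sending anything like this to real hardware.
    """
    if not 1 <= vlan <= 4094:
        raise ValueError(f"VLAN {vlan} is outside the 1-4094 range")
    return [
        "configure terminal",
        f"interface {port}",
        "switchport mode access",
        f"switchport access vlan {vlan}",
        "end",
    ]


class SwitchMiddleware:
    """API on one side, CLI on the other.

    `send` is a hypothetical transport: any callable that takes a switch
    address and an iterable of command lines and delivers them.
    """

    def __init__(self, send: Callable[[str, Iterable[str]], None]):
        self._send = send

    def assign_vlan(self, switch: str, port: str, vlan: int) -> None:
        # Applications call this; they never see CLI syntax.
        self._send(switch, vlan_commands(port, vlan))
```

Calling SwitchMiddleware(send).assign_vlan("10.0.0.2", "port-1", 100) would hand five command lines to whatever send function you wrote. Keeping the transport separate from the command rendering means the rendering can be tested without a switch on the bench.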
This doesn't make the SSE-X3348T a bad switch. It does its job and does it well. It is perfectly possible to automate this switch to see the changes you need made within a few seconds. For a traditional "pets-based" data center and even most "cattle-based" data centers, it is a perfectly acceptable switch.
But you won't get to "ants" using the SSE-X3348T. To get to ants you need something like the SSE-X3648S. You need to be able to make changes in under a second and to process multiple changes at once. You need a switch whose operating system was designed from the ground up for a fast-paced and dynamic environment.
Switching is a practical example, but not the only one. Storage is another area where the ability to make changes rapidly and in an automated fashion matters. Businesses don't make money resizing LUNs. This is why VMware developed VVOLs. This is why any number of storage technologies and standards have emerged in the past several years to challenge the traditional model of giving the storage administrator a change request and waiting for them to manually assign storage to an application.
Today an application needs to be able to grab storage and networking resources in an automated fashion, assign these to the virtual infrastructure, create VMs or containers and inject the relevant operating system, application, configuration and data.
Docker and its ecosystem are helping to automate containers. OpenStack is helping to automate VMs and is providing a platform where storage automation, network automation and more can also come together. VMware is working on its own version, as is Microsoft.
For managing the physical environment, the Distributed Management Task Force (DMTF) has developed the Redfish standard. Currently it is a faster, more responsive alternative to the Intelligent Platform Management Interface (IPMI) for managing physical servers. The eventual goal is for everything in the data center -- from power strips to networked thermal sensors -- to be addressable via Redfish. Even HVAC systems will ultimately be something that can be controlled through Redfish.
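Redfish is a REST interface that returns JSON, so reading a sensor comes down to fetching a URL and walking a document. The sketch below assumes the chassis exposes the Thermal resource under /redfish/v1/Chassis/, with a Temperatures array whose entries carry Name, ReadingCelsius and UpperThresholdCritical; newer Redfish versions may present the same data through a different resource. The function names and the five-degree margin are my own choices, and fetching the document (authentication, TLS) is left out on purpose.

```python
import json
from typing import List


def thermal_url(host: str, chassis_id: str) -> str:
    """Build the URL of a chassis Thermal resource on a Redfish service."""
    return f"https://{host}/redfish/v1/Chassis/{chassis_id}/Thermal"


def hot_sensors(thermal_json: str, margin_c: float = 5.0) -> List[str]:
    """Return names of sensors within `margin_c` degrees of their critical limit.

    `thermal_json` is the body of a Thermal resource. Sensors that report
    no reading or no critical threshold are skipped rather than guessed at.
    """
    doc = json.loads(thermal_json)
    hot = []
    for sensor in doc.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        if reading is None or limit is None:
            continue
        if reading >= limit - margin_c:
            hot.append(sensor.get("Name", "unknown"))
    return hot
```

A list like the one hot_sensors returns is the raw material for the environment-aware automation described earlier: it is what a scheduler would consult before deciding to move workloads away from a warm rack.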
There's an API for everything these days. "Infrastructure as Code" is the phrase of choice. It's pithy, but to the point.
An application isn't automated just because you can install it with a script. It's automated when it can summon all the resources it needs from its environment on creation, release those resources on destruction and is tied into monitoring during operation.
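That definition maps neatly onto a context manager. The Python sketch below is illustrative only: the resource kinds are placeholders, and the log list stands in for real calls to a network controller, storage array, hypervisor and monitoring system. It shows the shape of the idea, in that resources are claimed on creation, monitoring is attached while the application runs, and everything is released in reverse order on destruction, even if the application fails.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def resource(kind, log):
    """Stand-in for claiming one resource (network, storage, compute)."""
    log.append(f"acquire {kind}")
    try:
        yield kind
    finally:
        # Runs on normal exit and on error alike.
        log.append(f"release {kind}")


@contextmanager
def deployed_application(name, log):
    """Summon resources on creation, monitor while running, release on exit."""
    with ExitStack() as stack:
        for kind in ("network", "storage", "compute"):
            stack.enter_context(resource(kind, log))
        log.append(f"monitor {name}")
        try:
            yield name
        finally:
            log.append(f"unmonitor {name}")
        # Leaving the ExitStack releases the resources in reverse order.
```

Used as "with deployed_application("billing", log):", the body of the with block is the application's working life. ExitStack is what guarantees that a failure partway through acquisition still releases whatever had already been claimed.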
To get there from here, pick one problem and start automating. Storage, networking, virtual infrastructure: the field is wide open. Don't try to boil the ocean by doing it all at once. Get familiar with one aspect. Start small, but by all means get going.
You don't want to be the company stuck tending a data center of pets when all your competitors are operating with the efficiency of ants.