Friday, July 20, 2012

Workload Mobility Is More Real Than You Might Think

One of the many holy grails in data center architectures has been the notion of workload mobility: the ability to pick up an arbitrary set of applications (and their data!), move them over a distance, and do so with an absolute minimum of effort and disruption.

It's an incredibly useful capability, especially if you've got multiple data centers and a veritable zoo of applications in your menagerie.

Move apps to get to newer hardware.
Move apps to get more performance.
Move apps to save some money.
Move apps to rebalance.
Move apps because you need to take some infrastructure off-line.
Move apps to increase protection levels.
Move apps because you've got a new data center location.

No shortage of good, practical reasons why you'd occasionally want to move a set of workloads.

They put casters on heavy appliances for a reason!

But moving applications around has always been a complex and disruptive pain -- lots of planning, lots of coordination, lots of downtime.  Not the sort of thing that IT professionals warmly embrace with enthusiasm and passion.

But -- for some IT shops -- that's started to change.  And we'll see more in the near future -- I'm sure of it.

Why Is Moving Applications So Hard?

By "moving", what I'm really talking about is  "moving an application from one data center to another, separated by distance".   If you've never contemplated what's involved, you might be asking -- what could be so hard?

The data has to be moved.  The application has to be shut down in one location and restarted in the new one.  Network addresses have to be updated.  The "supporting cast" -- backup, security, management, monitoring and so on -- has to be notified and perhaps reconfigured.

On and on and on.

Think of everything you'd have to do to move your home between two states.  For me, it hurts my head just thinking about it.  Lots of interrelated and sequential activities, with significant disruption involved.  And, of course, no shortage of complaints from the family during the process.

The inherent friction means you won't do it very often -- unless there's a compelling set of circumstances.

Now, take away almost all of the friction.  Take away almost all the complexity.  Take away the dependencies and sequential processes.  Once configured, easily move the whole shebang from here to there anytime there's a good reason -- no drama, no fuss, no complaints.  Just pack and go.

To those who've been in the IT business for a while, this might sound like science fiction.  Well, so did quantum entanglement -- at one time.

But -- very quietly -- workload mobility has now started to become a core capability that's getting routinely baked into IT infrastructure.

Meet Katten Muchin Rosenman LLP

They're a good-sized law firm -- 600 attorneys.  What makes them exceptional in this discussion is that -- well -- they don't appear to be particularly unusual from an IT perspective.

The firm has to do the same sort of bread-and-butter IT stuff that law firms around the world have to do.  Like most of their peers, they have to provide a wide range of capabilities to demanding professionals and keep service levels high, all while watching the expense line.

Interesting, but not exactly bleeding-edge IT stuff.  And that's the point.

The press release tells the story.  Business has been good for them.  They needed more data center capacity.  Not a bad problem to have in the general scheme of things.

Rather than simply find a larger facility, the EMC team presented a scenario of active/active data centers where workloads could easily move back and forth with a minimum of hassle.  Keep what you have, just add another increment of data center capacity, and think of it all as one, dynamic pool.

Katten was fortunate in that most of their environment was already fully virtualized using VMware.  And, based on their EMC relationship, they were willing to give the EMC VPLEX approach a try.

Enter EMC's VPLEX

There's a lot to VPLEX, but -- at its core -- it uses very sophisticated caching technology to make data appear in two places at the same time when needed.  That's a very useful trick when you're moving applications around non-disruptively.

It's especially good at doing this with "hot" transactional data and traditional enterprise applications -- as you'd find with a busy email system, a billing database and so on.

VPLEX is currently packaged as an appliance that sits in the data path -- typically a redundant pair at each end of the network.  It works with most popular enterprise storage arrays, including non-EMC ones.

I wouldn't recommend trying to compare it with similar products, because -- today -- there's nothing else that does what it does.

By itself, it's pretty darn capable.  But couple VPLEX with VMware's vMotion, and you've got a very complete and very robust workload mobility solution.
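
If "data in two places at the same time" sounds abstract, here's a deliberately simplified sketch of the idea in Python: a write-through mirror across two sites, where a write isn't acknowledged until both sites hold it.  To be clear, this is a conceptual model only -- it is not how VPLEX is actually implemented, and every class, site and volume name in it is made up for illustration.

# Conceptual model only -- not VPLEX internals.  It illustrates the core idea:
# a write is acknowledged only once both sites hold it, so reads can be served
# locally at either site and a workload can be restarted (or live-migrated) on
# either side without a bulk data copy first.

class Site:
    """A toy storage site: block address -> data."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks.get(lba)


class DistributedVolume:
    """An illustrative active/active volume mirrored across two sites."""
    def __init__(self, site_a, site_b):
        self.sites = (site_a, site_b)

    def write(self, lba, data):
        # Write-through: land the block at both sites before acknowledging.
        for site in self.sites:
            site.write(lba, data)
        return "ack"

    def read(self, lba, local_site):
        # Reads are served from the nearest (local) copy.
        return local_site.read(lba)


if __name__ == "__main__":
    site_a, site_b = Site("site-a"), Site("site-b")
    volume = DistributedVolume(site_a, site_b)
    volume.write(lba=42, data=b"hot transactional block")
    # Both data centers see the same, current block -- which is what lets a
    # running VM be moved to the other site without copying its storage first.
    assert volume.read(42, site_a) == volume.read(42, site_b)
    print("both sites serve the same current data")

The point of the toy: because both sites always hold current data, the "move" is just a restart or a live migration at the other end -- there's no bulk copy sitting in the critical path.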

Since the VPLEX introduction a few years back, it's quietly turned into (yet another) one of those EMC innovative technology success stories.  For example, the VCE folks now routinely use VPLEX to do cool workload mobility demos and zero RPO / zero RTO failover demos on Vblocks.

Life Made Easier At Katten

At Katten, over 250 production applications were moved from one data center to another. With no drama, and no disruption -- and no one the wiser.

Alexander Diaz of Katten offered this observation in the press release:

"For a major data center migration, we were able to move a running virtual machine across our cloud to the new data center 25 miles away in about 15 to 20 seconds with VPLEX and vMotion. With VPLEX's capabilities, I have confidence that we could move the data across an even further distances should our data center needs evolve over time."

"VPLEX kept the storage in synch in both data centers. Four engineers moved 30 to 40 virtual machines the first weekend and then gradually moved over 250 systems during the next three weeks. The servers stayed up the whole time and no one in the firm knew that we had migrated our entire data center. The 'old' way would have meant days or up to a week of downtime for certain systems and a dozen engineers working around the clock."

"VPLEX has allowed us to raise the bar and provide our firm with enterprise-class business continuity—and a truly active/active data center model. Using the VPLEX for our virtual machines and stretch clusters, we can do maintenance and upgrades on hardware whenever it's needed without any downtime. We're also getting more utilization out of our infrastructure by balancing workloads across multiple sites."

"When someone in the firm needs a new application, we're expected to respond quickly. EMC technologies along with virtualization enable us to bring up a new system in about 40 minutes—start to finish. Before, it would take weeks. Sometimes our users think it is magic, so you could say VPLEX gave us a special wand to get the job done".

Here's what I think is cool about this story: we're not talking about a bleeding edge web company, or an intergalactic financial institution, or a military research lab or similar IT exotica.  Katten is a very successful large law firm, and -- as such -- they use IT to get their work done -- it is clearly not an end in itself.

Workload mobility was one of the tools they had access to, and they decided to use it to their advantage.  Not just for this project, but to create a capability they could come back to over and over again.  Good work, guys.

But -- step back a bit -- and perhaps you'll agree with me that this is just the beginning of something much, much bigger.

Scale-Out Comes To Aggregations Of Data Centers?

When you contemplate computing or storage architectures, you quickly get enamored with the notion of properly implemented scale-out architectures that aggregate smaller resources into much bigger pools.

Start small.  Add more performance and capacity in small increments, when needed.  Automatically and transparently readjust workloads and resources as usage patterns change.  Improve redundancy at lower costs.  No downtime.  Use older gear and newer gear together.  Get real efficient.  Get real fast.  Or anything in between at any time.

Why? All resources are one, big seamless pool with no significant walls or boundaries.  That's what scale-out can do for you -- if done right.

Then you start considering cloud, and big data -- and you inevitably come to the conclusion that -- yeah -- this is the way things are going to be done everywhere before too long.  If you're a technology vendor, game on!

But at one level, a data center is really nothing more than a physical container for computing resources -- a really big, complicated server if you will.  The same benefits that come from aggregating computing and storage resources using scale-out approaches within the data center could potentially apply across multiple data centers that have many of the same properties.

Start with a small data center.  Add more performance and capacity in small increments, when needed.  Automatically and transparently readjust workloads and resources as usage patterns change.  Improve redundancy at lower costs.  No downtime.  Use older sites and newer sites together. Get real efficient, get real fast, or anything in between.

But it's one thing to virtualize, pool and create scale-out technologies in the confines of a single data center with very short distances and trivial latencies.  It's another thing entirely to do the same sort of thing with meaningful distances and significant latencies involved.
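
To put rough numbers on "significant latencies": light in optical fiber covers roughly 200 kilometers per millisecond, so every synchronous write pays a round trip that grows with distance.  Here's a back-of-the-envelope sketch -- the figures are physics approximations, not measurements of any particular product:

# Back-of-the-envelope latency math -- approximations only.
# Light in optical fiber travels at roughly 200,000 km per second
# (about two-thirds of c), ignoring switching and protocol overhead.

FIBER_KM_PER_MS = 200.0

def round_trip_ms(distance_km):
    """Best-case round-trip propagation delay over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_MS

for label, km in [("across a data center", 0.1),
                  ("25 miles (~40 km), like Katten", 40),
                  ("1,000 km between regions", 1000)]:
    print(f"{label}: ~{round_trip_ms(km):.3f} ms added to every synchronous write")

At campus distances that delay is noise; at metro distances it's tolerable for many workloads; at continental distances it starts to dominate -- which is exactly why the kind of caching approach described above matters so much.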

Years ago, I jokingly described the idea as RAID -- a redundant array of inexpensive datacenters.  Eliminate the impact of moving things over a distance, and how you thought about data centers would change drastically.  VPLEX hadn't been publicly announced yet, but it was clear where the technology could eventually lead over time.

I've started to use the phrase "virtualizing distance" to help describe what's needed here.  Some people here at EMC use the term "federation" or "dissolving distance" to describe similar concepts.

To each their own.

Regardless of the terms used, the ultimate goal is to make data center distance disappear as much as possible -- just as we would want the appearance of "distance" to disappear between pooled scale-out servers and storage nodes in a local setting.

Removing those barriers between resources -- and enabling them to be easily and dynamically pooled -- is what scale-out is really all about.

True for servers.  True for storage.  Also true for multiple data centers.

How Do You Virtualize Distance?

We could wait for someone to crack the speed-of-light problem, but I'm not optimistic.  On a more pragmatic note, I would argue that there are three core technologies needed to effectively virtualize distance for these use cases.

One core technology is the need to virtualize network addressing and topologies over distance and separate domains.  Your world needs to look like one, big, flat network where IP addresses can move around as needed.  That's what Cisco's OTV technology does well, among other things.

Another is the need to virtualize and encapsulate server resources so they can be moved.  Obviously, that's something that VMware's vMotion does uniquely well.  But that still leaves the problem of moving the data -- especially if you want to minimize disruption.

Essentially, you're going to want updated data to be in two places at the same time during the move.  That's what VPLEX does uniquely well, among other things.

Use these three technologies separately, or use them integrated together in something like a VCE Vblock if you choose.  It's there today, and it works quite well.  But I don't think everyone sees the implications just yet.
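
To make the division of labor concrete, here's a hedged, back-of-the-napkin sketch of the three-part recipe in Python.  None of these functions are real Cisco, VMware or EMC APIs -- they're placeholders that show the ordering: stretch the network, get the storage active/active, then live-migrate the running workload.

# Hedged, conceptual sketch only -- none of these functions are real Cisco,
# VMware or EMC APIs.  They're placeholders that show the ordering and the
# division of labor described above.

def stretch_layer2_network(vlan_id, site_a, site_b):
    """Placeholder for OTV-style network extension: the workload keeps its
    IP address at either site, so nothing upstream needs re-pointing."""
    print(f"VLAN {vlan_id} now spans {site_a} and {site_b}")

def make_storage_active_active(volume, site_a, site_b):
    """Placeholder for VPLEX-style distributed storage: the same volume is
    readable and writable at both sites and kept continuously in sync."""
    print(f"{volume} is active/active across {site_a} and {site_b}")

def live_migrate(vm_name, destination_host):
    """Placeholder for a vMotion-style live migration: memory and execution
    state move while the VM keeps running; no storage copy is needed
    because of the step above."""
    print(f"{vm_name} is now running on {destination_host}")

if __name__ == "__main__":
    # Hypothetical names throughout.
    stretch_layer2_network(100, "site-a", "site-b")
    make_storage_active_active("datastore-01", "site-a", "site-b")
    live_migrate("billing-db-01", "esx-host-at-site-b")

Real deployments obviously involve much more -- latency budgets, witness sites, failure handling -- but the division of labor is the point.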

For me, it's pretty clear: I see people starting to think about data centers differently.  Architectural patterns tend to repeat themselves at different scales; just as we clearly see scale-out concepts infuse server and storage design, we're also seeing scale-out concepts slowly filtering their way into aggregations of data centers.

Perhaps it won't be too long before we think of "adding and rebalancing a data center" much the way we routinely think about adding and rebalancing a compute or storage node today.

And, given the ginormous amount of IT resource that goes into data center planning, construction, implementation, operations, etc. -- that particular shift in thinking is going to end up being a really big deal.

Many Of Our Architectural Assumptions Are Changing -- As They Should Be

Virtualization has changed the way we think about compute.  Tablets have changed the way we think about end-user compute.  Java has changed the way we think about writing code.  Flash has changed the way we think about storage performance.  Cloud has changed the way we think about producing and consuming IT services.

On and on and on -- no shortage of the overused "paradigm shifts" to choose from.

Perhaps that's why so many of us are attracted to this space -- there's so much changing all the time.

To this long list, I now want to add "virtualizing distance" -- as evidenced by technologies such as VPLEX -- where we start to think of distance as something to exploit to our advantage -- as opposed to something that has to be merely overcome.


By Chuck Hollis