Tuesday, July 19, 2011

The Importance Of Friction When Considering Cloud

As I watch the industry talk-track around cloud and IT-as-a-service slowly evolve, I'm starting to get a bit ticked off.


I think in many cases the various industry cloud pundits may be doing people a disservice.
They're a passionate bunch, for sure, but I think -- in some cases -- they're losing sight of a few important real-world considerations that have absolutely nothing to do with technology, and everything to do with how people consume shared resources.
If I think back, I've perhaps been as guilty as anyone, but I've seen things in a new light for quite a while now.

My Rant?
The industry talk track on cloud and the IT-as-a-service model has generally revolved around making IT easier to consume -- in essence, removing various forms of friction and inefficiency for those providing the services as well as those consuming them.
But everything has its limits.

Less friction?  Good.  No friction?  Not good.

Here's why ...

The Basics
I've now been fully engaged in this whole cloud thing for about three years here at EMC.  The talk track inside and outside of EMC continues to evolve and mature, but -- for me -- it can't happen fast enough.  There's a lot of progress that's been made collectively, but we still have work to do.
One good example of evolution is the discussion around "cost".
If you'll remember, the original cloud discussion was "you want cloud because it's so much darn cheaper than everything else".  Well, maybe yes, and maybe no -- depending on the specifics of the situation.
Certainly, there's usually a strong case that can be made, but it's not a uniform statement.  And there are always non-trivial costs and effort to get to that envisioned state.

Going further, taking various forms of cost out is just table stakes to so many of us business consumers of IT.  Sure, we like cheap, who doesn't?  But what most of us really lust after is speed, flexibility and agility.  Give us 80% of what we want in a very short time frame, and we'll debate the other 20% later, thanks.

But -- as business consumers of IT -- we can be a selfish bunch.  We tend to focus on what we individually want for our pet concerns.

Not that we're completely insensitive idiots, it's just that we expect other folks to be looking out for The Big Picture.

Anything that's easy to consume will be consumed more -- that's human nature.  Inevitably, ease of consumption leads to a well-understood "tragedy of the commons".  In one sense, this is not a new problem for humankind -- instead of grazing pastures, we're now talking about the modern equivalent: shared and pooled IT resources.

Hence a strong interest in newer forms of friction (or governance) that make IT production and consumption easier to do, but still preserve and maximize the value of the shared resource for all.

Thinking About Friction
We're all familiar with the concept of friction -- we see examples every day.  Those of us with an engineering bent tend to see friction as something to be minimized -- it's overhead, it's resistance, it's the quintessential inefficiency.

Even in our personal lives, though, a little friction is a good thing.  For those of you who routinely brave cold winters, friction becomes important when we're driving or walking outside.

A zero-friction zone is a bad experience waiting to happen.

Which brings up an interesting question -- as we progressively engineer the friction out of our IT environments, how should we think about the "right" places to leave a little friction in place -- at least, until we get more comfortable with the new operating model?

From a purely technological perspective, we can now engineer IT production and consumption environments that have near-zero-friction.  Our immediate IT resource whim can be instantly satisfied, sometimes automagically based on external criteria.

And, when discussing this with IT thinkers, I make the argument that retaining friction in a few key areas is probably a *good* thing.

The Consumption Example
An IT organization stands up its first self-service environment.  Because there is substantial unmet demand, and almost zero friction associated with consuming it, it gets immediately and completely consumed.

Did the "right" workloads end up on the new environment?  Would some of the workloads be better served by a different environment, or consumption model?  What happens when a new (and worthwhile) workload comes along, and the resource is fully allocated?

Or, perhaps a bit more relevant, what happens when some well-intentioned but somewhat clueless individual puts up a workload that really shouldn't be there for security or continuity reasons?

Having a combination of realistic policies and human oversight isn't necessarily a bad thing when considering *any* shared resource -- especially at the outset.

The Production Example
Even if you've done a good job of controlling incoming demand along the lines above, removing all friction at the back end creates similar problems.

It's not hard to imagine the erstwhile VMware administrator merrily provisioning virtual machine after virtual machine until they eventually exhaust some non-server resource such as network or storage or even licenses.  Not that the VMware admin (or whoever) is a bad person; they've just never had to think about *all* the resources they're consuming, rather than just the stuff they usually work with.
Again, having a combination of realistic policies and human oversight makes sense even for entirely-within-IT consumption models.
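To make the idea a bit more concrete, here is a minimal sketch of what "thinking about all the resources" could look like at provisioning time.  Everything in it is an assumption for illustration: the pool names, the capacities, and the 10% reserve that is held back for a human to release.  It isn't how any particular product does it, just one way of expressing the policy.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """One shared resource pool; names and numbers are hypothetical."""
    name: str
    capacity: float
    used: float = 0.0

    def headroom(self) -> float:
        return self.capacity - self.used


def blocking_resources(pools, request, reserve=0.10):
    """Return names of pools this request would push into the reserve.

    `request` maps pool name -> amount needed.  `reserve` is the fraction
    of each pool kept back for human review (an assumed policy knob).
    """
    blocked = []
    for name, amount in request.items():
        pool = pools[name]
        if amount > pool.headroom() - reserve * pool.capacity:
            blocked.append(name)
    return blocked


def provision(pools, request, reserve=0.10):
    """Provision automatically if every pool has room; otherwise escalate."""
    blocked = blocking_resources(pools, request, reserve)
    if blocked:
        # The friction point: a person looks at the request before it proceeds.
        return "needs review: " + ", ".join(blocked)
    for name, amount in request.items():
        pools[name].used += amount
    return "provisioned"


pools = {
    "cpu_cores": Pool("cpu_cores", capacity=512, used=128),
    "storage_tb": Pool("storage_tb", capacity=100, used=85),
    "os_licenses": Pool("os_licenses", capacity=200, used=120),
}
vm_request = {"cpu_cores": 4, "storage_tb": 6, "os_licenses": 1}
result = provision(pools, vm_request)
print(result)
```

In this made-up example there is plenty of compute and plenty of licenses, but the request would eat into the storage reserve, so it gets routed to a person instead of being silently fulfilled.  Most requests never touch the friction; only the ones that threaten a scarce pool do.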

But how do you get that "right balance" of optimization without making the whole process burdensome for everyone?

I am not claiming to have the perfect answer for any and all situations, but -- over the last handful of months -- I've picked up some tips and tricks that others are using around these issues.

It's A Journey, Not An Event
More than a few organizations have fallen into the trap of designing the "perfect" process to govern the consumption of resources.  Personally, I think this is a fool's errand.

For starters, any process or policy engenders a reaction as people figure out how to use it.  Sometimes those reactions are predictable, very often they're not.  I've seen that it's better to think in terms of an initial approach, and then frequent updates as experience is gained -- often settling out into an equilibrium before too long.

Context changes as well: new requirements, new constraints, etc.  Like the CFO mandating a complete freeze on IT expenditures for the next three months.  Processes and policies that can be quickly changed and communicated are far more useful than ones that can't be.

More to the point, the best policies and processes are built on experience.  The mindset ought to be to find a useful starting point, and continually enhance the approach.

People In The Process Can Add Value
So much focus seems to be put on achieving nirvana: complete and total automation of each and every IT process.  While that's a notable (and completely theoretical) goal, having reasonably smart people in the loop -- armed with efficient processes -- appears to be much more desirable.

A good example might be provisioning a secured application environment.  While it's fine to advertise the capabilities of the secure service, I for one would be interested in having a real, live conversation with anyone who intended to use it.  Having someone in the workflow who contacts the requestor and asks a few questions would be a good thing.

Once the decision was made to go ahead, automating the provisioning and monitoring of the supporting capabilities -- sure!

Chargeback Isn't A Complete Answer
Just because someone has money to spend doesn't mean that they're necessarily 100% informed as to the various tradeoffs between the services.  As service catalogs get progressively richer and easier to consume, there's an associated skill required to be a knowledgeable and proficient consumer.
I remember trying to figure out which was the right mobile phone plan for my family several years ago.  There was me, my wife, and two teenage kids who really liked to text a lot.  There was the fact that we travel occasionally.

To make matters interesting, the service provider had at least 20 different plans, options and sub-options for me to thoughtfully consider.

I had money to spend, but I didn't know what the heck I was doing.  A bit of friendly advice would have really helped at that particular moment :)

A Few Practical Examples
My good friends within EMC IT are wrestling with these very issues, and they've done a number of pragmatic things to create a bit of friction in the process while gaining experience on the new consumption dynamics.

For one thing, a fair portion of the self-service cloud ("Cloud 9") has a standard 90-day window.  That means that after 90 days, your stuff automagically goes away.  While not ideal for every use case, that sort of restriction goes a long way to positioning the internal service for transient needs vs. ongoing requirements.  It's highly unlikely that someone's going to put up a sensitive workload on a virtual machine that's only going to exist for three months ...
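A fixed window like this is about as simple as friction gets.  As a rough sketch (my own illustration, not a description of how EMC IT actually implements it), the check amounts to comparing each machine's creation date against the window.  The VM names and dates below are invented.

```python
from datetime import date, timedelta

LEASE_DAYS = 90  # the 90-day window described above


def expiry_date(created: date, lease_days: int = LEASE_DAYS) -> date:
    """Date on which a VM created on `created` is due to be reclaimed."""
    return created + timedelta(days=lease_days)


def expired(vms, today, lease_days=LEASE_DAYS):
    """Return names of VMs whose lease has run out as of `today`.

    `vms` maps a (hypothetical) VM name to its creation date.
    """
    return [name for name, created in vms.items()
            if today >= expiry_date(created, lease_days)]


vms = {
    "build-box": date(2011, 3, 1),
    "demo-env": date(2011, 6, 15),
}
today = date(2011, 7, 19)
to_reclaim = expired(vms, today)
print(to_reclaim)
```

Here "build-box" is past its window and would be reclaimed, while "demo-env" still has time left.  The point is less the code than the default: the burden is on the consumer to justify keeping something, not on IT to justify taking it back.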

Another practical example comes from EMC IT's "front desk" or solutions office.  People wanting IT infrastructure call in, and discuss their requirements with a "solutions consultant" who's familiar with the current internally-available service catalog.

Although all services are designed to be potentially consumed in an on-demand manner, they're only available on request from the solutions consultants.  Once the decision is made to go, everything is highly automated.

I spoke to one customer who'd done something interesting -- although a bit unusual.  The business people still thought they were buying physical servers and infrastructure -- the processes they'd been using for years were still largely intact.  Except, once within IT, those resources were carved from a shared virtual pool.  The user-visible management tools showed what looked to be physical resources -- except they weren't.

Rush jobs, changes in specifications, cancelled projects, etc. didn't result in any stress for the IT team.  The "friction" in this case belonged entirely to the business -- specifying their requirements, creating justification, getting funding, etc.  The IT guys just built a largely frictionless environment to satisfy physical requests.

Clever.

Organizing For Success
If you look inside of organizations that are seriously doing this stuff, you'll find a different set of organizational constructs.  My best example comes from EMC IT, but I've seen it elsewhere.
IT services (whether consumed externally to IT, or internally consumed by other parts of IT) are defined and delivered by service owners.

Storage as a service, network as a service, VM as a service, infrastructure as a service, etc. etc. are all owned by individuals that see themselves as capitalists selling to an internal audience.

More advanced services are built by composing underlying services.  Eventually, those services are exposed in such a way that a non-IT person sees them.  But the lines of accountability are clear.

In these models, each service manager is responsible for balancing between supply and demand.  More importantly, they are the "friction points" (really "market makers") layered over automated delivery mechanisms.  If a storage service manager is seeing too much demand for a certain kind of storage service, he/she can change policies and/or internal pricing to bring supply and demand more into balance.

Conversely, if no one wants the storage services being offered, you've got the wrong storage services and perhaps the wrong storage service manager.
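If you wanted to reduce the "market maker" idea to something mechanical, it might look like the sketch below.  To be clear, this is my own simplification: the 80% target utilization, the 10% price step, and the idea of a single internal price per service are all assumptions, and a real service manager would apply judgment that a function can't.

```python
def adjust_price(price, demand, supply, target=0.80, step=0.10):
    """Nudge an internal price toward a target utilization.

    price  -- current internal price per unit (hypothetical units)
    demand -- units currently requested or consumed
    supply -- units the service can deliver
    target -- utilization the service manager is aiming for (assumed)
    step   -- fractional price change per adjustment (assumed)
    """
    if supply <= 0:
        raise ValueError("supply must be positive")
    utilization = demand / supply
    if utilization > target:
        # Too much demand: raise the price to add a little friction.
        return round(price * (1 + step), 2)
    if utilization < target / 2:
        # Far too little demand: lower the price (or rethink the service).
        return round(price * (1 - step), 2)
    return price


print(adjust_price(1.00, demand=95, supply=100))
print(adjust_price(1.00, demand=60, supply=100))
print(adjust_price(1.00, demand=30, supply=100))
```

With these made-up numbers, heavy demand pushes the price up, moderate demand leaves it alone, and weak demand pulls it down.  Pricing is only one lever; the same shape works for quotas or approval thresholds.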

As a matter of fact, you can see this concept of a "service owner" at multiple locations in the IT stack -- including service owners who directly face business users.  Is it perfect?  Hardly.  Does it work pretty well where I see it?

Yes.

Back To My Rant?
Yes, cloud concepts are wonderful things.  The promise of delivering a wide range of IT services that are faster, cheaper, more flexible, etc. -- all real and tangible, and everyone wants them.  No question that the industry is moving in that direction, and fast.

That being said, in the process we're creating pools of shared resources that can be consumed on a moment's notice, i.e. a potentially frictionless environment.

Maybe the demand for IT resources is infinite, but supply certainly is not.

And I, for one, would like to see more discussion from the industry clouderati around how to engineer some well-considered friction into customer environments; otherwise, ugly things are certainly going to happen.


By: Chuck Hollis