Monday
Apr 06 2009

The real reason why enterprises aren't moving to the clouds

Visit any of the cloud-obsessed blogs, discussion forums, or conferences and you'll hear the same "reasons" why cloud computing isn't catching on within enterprise IT shops. It's always something about interoperability, service level agreements, security risks, data formats, APIs, or hypothetical legal implications.

Interesting issues, but they are all red herrings.

The inability of enterprises to take advantage of the clouds isn't due to the shortcomings of the cloud offerings available today. The shortcoming is with the state of today's enterprise IT shops. The real reason is that those millions of applications currently running within enterprises are hardwired into a particular environment.

I'm not talking about the application code itself. After all, Linux is Linux and Windows is Windows whether it's running on native hardware, a local VM, or somewhere in a cloud. The true problem is with the way today's applications are configured, deployed, and managed. Very few folks in enterprise IT are willing to admit to the hairballs that decades of shortsighted IT Management techniques have created.

The oft-offered excuse of there being "a lack of cloud skills within the enterprise" is really just code for a general abundance of poor to outrageously awful IT Management techniques.

Let's look at a basic and often quoted use case for cloud computing.

With server utilization rates in the 3-15% range, there is obviously room for significant reduction in capital expenditures by taking advantage of cloud-based elastic computing resources. Why isn't there a rush of enterprises running to take advantage of this? No, the answer isn't the often quoted fear of vendor lock-in. The answer is that these enterprises are locked into themselves.

Think of how difficult it is for an enterprise to switch datacenters. Months of effort go into planning and executing the move, but have you ever heard of one going smoothly? If it takes months to move between your own datacenters and you still can't get it right, what hope do you have for making it into the clouds?

It's no coincidence that the local virtualization vendors like VMware and Parallels are facing this same denial of such an obvious business case as they attempt to push their offerings out of development and testing environments and into production environments.

Rob England (The IT Skeptic) is one of the few pundits to point out the fundamental disconnect between enterprise IT and the clouds. He attributes the largest hurdle to migration costs. I would take it a couple of steps further. Migration is just a symptom of the problem. If migration were the only problem, using cloud infrastructure would be a slam dunk for new application projects. But, of course, the same old broken management techniques ingrained in an enterprise plague new and old projects alike.

The bottom line is that what passes for the status quo in IT Management is crippling enterprises. Enterprise IT can't take full advantage of such fundamental advances as virtualization and elastic computing until an "abstracted administration" paradigm becomes standard operating procedure.

Abstracted administration means the ability to work from a point of view that is independent of any particular server instances or specific software deployments. Within the abstracted administration paradigm, an administrator manages deployment and ongoing operations from a higher level and lets the underlying framework coordinate operations across the actual physical environment. Once you've achieved abstracted administration, moving datacenters or re-deploying an application to virtualized servers in the cloud is as simple as updating one part of the specification that drives the abstraction. Your tools will then handle the rest.
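To make the idea concrete, here is a minimal sketch in Python of what "updating one part of the specification" might look like. Everything in it is hypothetical and for illustration only: the `SPEC` layout, the hostnames, and the `resolve` and `plan_deploy` helpers are assumptions, not the interface of any particular tool. The administrator addresses a logical role ("web") and the specification decides which servers that means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str
    environment: str


# Hypothetical specification: logical roles mapped to hosts, per environment.
# All names here are invented for illustration.
SPEC = {
    "datacenter-a": {
        "web": ["web01.dc-a.example", "web02.dc-a.example"],
        "db": ["db01.dc-a.example"],
    },
    "cloud": {
        "web": ["web-1.cloud.example", "web-2.cloud.example", "web-3.cloud.example"],
        "db": ["db-1.cloud.example"],
    },
}


def resolve(spec, environment, role):
    """Turn a logical role into the concrete nodes the specification lists."""
    return [Node(hostname, environment) for hostname in spec[environment][role]]


def plan_deploy(spec, environment, role, package):
    """Build the list of per-node actions; the caller never names a server."""
    return [
        f"deploy {package} to {node.hostname}"
        for node in resolve(spec, environment, role)
    ]


# The same request against two environments. Only the environment name changes.
for action in plan_deploy(SPEC, "datacenter-a", "web", "shop-1.2"):
    print(action)
for action in plan_deploy(SPEC, "cloud", "web", "shop-1.2"):
    print(action)
```

The point of the sketch is the shape of the call: moving the "web" role from the datacenter to the cloud is a change to the specification (or to which part of it you select), not a rewrite of the deployment procedure.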

Of course, achieving abstracted administration means that the provisioning and management of your entire application stack -- from OS install to running integrated application services -- must be fully automated using tools that support this specification-driven, abstracted administration paradigm.

If you look at what passes for state of the art in many IT shops, it might seem like the ability to achieve abstracted administration and fully automated provisioning is a long way off.

That is simply not the case. The tools to get this done are already here, they work well, and they are all open source. Below is a diagram of an open source toolchain that can provide fully automated provisioning.

Still not convinced it can be done? Check out this whitepaper. The paper lays out how fully automated provisioning can be (and has been) achieved using these standard open source tools.

If you are interested in a more detailed explanation of the abstracted administration paradigm, check out this post by Alex on the ControlTier Blog.

Sunday
Mar 22 2009

Web Operations: the canary in the IT Management coal mine

Rob England (The IT Skeptic) recently wrote some very nice things about this blog.

After I got over the fact that one of my favorite bloggers is writing about this blog, I realized that his post does raise a good question: If good IT Management is good IT Management no matter what business you are in, why does this blog focus so much on the Web Operations perspective?

Part of the reason is that Web Operations is the world that Alex and I live in on a daily basis (via ControlTier... helping e-commerce and SaaS companies improve the efficiency and reliability of their operations).

The other part of the reason is that we see Web Operations as the canary in the coal mine for IT Operations. When a company's entire business is operating software as a revenue-producing service, the shortcomings and the successes of your IT Operations go right to your bottom line. The tolerance for the status quo dissipates a lot quicker and there is stronger political will to think outside of the box.

Put it this way: pretend you're the CEO of a Fortune 100-size company that makes aircraft engines or automobiles. Where is improving the efficiency and reliability of your IT Operations going to fall on the list of things you worry about every day? 32 on a top 50 list might be generous.

Now pretend you are the CEO of an online company whose sole source of revenue comes from what you can generate through your website. Suddenly the efficiency and reliability of your IT Operations jumps to near the top of the list.

Update: While people point out to me that I'm stretching the "canary in a coal mine" metaphor a bit far... I'm loading The Police's Zenyatta Mondatta album into my iTunes.

Monday
Mar 02 2009

Web Operations: Are you developing an asset or a liability?

"Buy vs. Build". It's a term you hear repeatedly when it comes to businesses weighing their options for application and systems management solutions. But as anyone who spends time in the web operations trenches knows, the reality is always something closer to "build vs. build". Buy something from a software vendor, use open source tools, develop something from scratch: in each situation there just isn't a one-size-fits-all option and there is always going to be custom integration involved. This reality was previously covered in Alex's "Stone Axes" post.

So, being resigned to the fact that there is a "build" aspect to any solution, the next critical choice becomes what guidelines you impose on your organization to steer its design choices. The most pervasive design criterion seems to be technical completeness or elegance. From a technical architect's purist point of view this makes sense, but it often fails to take into account the business impact of those technical decisions.

While many technical design options might seem to have identical business impact on day 1 (they cost roughly x to develop and provide feature y), what is the true cost of those decisions down the road? Have those decisions put the company in a position to continuously leverage those design choices into increasingly greater returns? Or have those decisions placed an anchor around the company's neck that it will be weighed down by, and paying for, well into the future? To put it into loose economic terms: have you developed an asset or a liability for your company?

What would be an example of building an asset? Using off-the-shelf open source tools and only developing thin layers of integration where they need to plug into your existing systems.

What would be an example of building a liability? Writing a custom system that mirrors the available functionality of existing off-the-shelf tools, thereby saddling your company with the sole responsibility for the forward progress of the design and maintenance of that tooling.

The asset vs. liability concept is one that obviously needs to be fleshed out quite a bit more. In any case, it's shocking how infrequently companies actually analyze the long-term business impact of the technical design decisions made about their tooling.

(Note: Thanks to Lee Thompson for framing this as an asset vs liability debate)

Friday
Dec 19 2008

Checklists: the most unsexy way to save millions

The New Yorker has a great article on the success of using checklists to tame extremely complex systems.

The primary example used in the article is intensive care units in hospitals. Anywhere you see the term "intensive care" substitute "data center" and anywhere you see a name of a medical procedure substitute the name of a technical procedure and the lessons are essentially the same.

What are the lessons?

1. Where checklists have been formalized and rigidly enforced (as a means of documenting and enforcing best practices), millions of dollars have been saved and many deaths (the ultimate "system outage") have been avoided.

2. The concept of checklists is so simple and unsexy that their awesome saving power is often overlooked. Admit it, your inner geek yawns just thinking about checklists.

How can checklists immediately improve IT operations?

First, agree on your best practices and document them. Second, strictly enforce the rule that all operations activities must follow those procedures. Third, record the completion of each step of the procedure for troubleshooting and analysis.

Sounds like such common sense, doesn't it? If it is, then why do most IT operations fail at implementing such a simple culture of orderly change management?
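As a rough illustration of those three steps, here is a small Python sketch of a checklist that refuses out-of-order steps and records who completed each one and when. The `ChecklistRun` class, the procedure name, and the step names are all made up for this example; a real shop would wire the same idea into whatever change management tooling it already has.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ChecklistRun:
    """One execution of a documented procedure (hypothetical helper class)."""

    procedure: str
    steps: list
    log: list = field(default_factory=list)

    def complete(self, step, operator):
        """Record a step, but only if it is the next one in the procedure."""
        next_index = len(self.log)
        if next_index >= len(self.steps):
            raise ValueError("all steps are already completed")
        expected = self.steps[next_index]
        if step != expected:
            raise ValueError(f"out of order: expected {expected!r}, got {step!r}")
        self.log.append(
            {
                "step": step,
                "operator": operator,
                "completed_at": datetime.now(timezone.utc).isoformat(),
            }
        )

    @property
    def finished(self):
        return len(self.log) == len(self.steps)


# Step 1: the agreed, documented procedure (step names are invented).
run = ChecklistRun(
    procedure="restart web tier",
    steps=["drain traffic", "restart service", "verify health"],
)

# Steps 2 and 3: work must follow the procedure, and each step is recorded.
run.complete("drain traffic", operator="alice")
run.complete("restart service", operator="alice")
run.complete("verify health", operator="bob")
print(run.finished)
```

The log is the part that pays off later: when something breaks, you can see exactly which steps were done, by whom, and in what order.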

Monday
Oct 27 2008

Book Review: The Visible Ops Handbook

Operational excellence in IT always seems to be an elusive goal. The attempts you'll see often range from "magic bullet technology" projects that rarely deliver on expectations to the addition of crushing bureaucracy that is quickly circumvented and rendered ineffectual.

With these thoughts in mind, I was leery when I picked up a copy of The Visible Ops Handbook. Wow, was I ever surprised. The Visible Ops Handbook is a compact and highly effective prescription for achieving operational excellence. It won't get you all the way to the promised land but it will send you down the path on solid footing.

The approach is not about implementing new technology. It's not about ivory tower bureaucracy. The Visible Ops Handbook is about bringing reliability, accountability, and predictability to your operations through a commonsense based process that doesn't require heroic discipline or unrealistic political capital to implement.

Who should buy this book? The short answer is "everyone". For a longer answer I'll borrow a passage from the book's introduction:

  • Organizations that have change management processes, but view these processes as overly bureaucratic and diminishing of productivity. There must be more to change management than bureaucracy, good intentions and scarcely attended meetings.

  • Organizations where, deep down, everyone knows that people circumvent proper processes because crippling outages, finger-pointing, and phantom changes run rampant.

  • A "cowboy culture" where seemingly "nimble" behavior has promoted destructive side effects. The sense of agility is all too often a delusion.

  • A "pager culture" where IT operations believes that true control simply is not possible, and that they are doomed to an endless cycle of break/fix triggered by a pager message at late hours of the night.

  • An environment where IT operations and security are constantly in reactive mode with little ability to figure out how to free themselves from fire fighting long enough to invest in any proactive work.

  • Organizations where both internal and external auditors are on a crusade to find out whether proper controls exist and to push madly for implementing new ones where they are not in place.

  • Organizations where IT understands the need for controls, but does not know which controls are needed first.

Yes, they are talking about you.

It's a short read (100 pages including several appendices), so buy one for everyone in your department. Available in paperback from Amazon or as a PDF from itpi.org.

*Note: the full title is "The Visible Ops Handbook: Implementing ITIL in 4 Practical and Auditable Steps". In my opinion the fact that ITIL is in the title is a bit misleading. There are some sidebar discussions that draw connections between the Visible Ops process and ITIL, but this is a book about how to succeed in operations first and foremost. I suspect the ITIL connection was made for marketing reasons. Don't let it taint your opinion before you read the book.