Monday
Jan072008

"What does Bob want?" - an amusing lesson about figuring out what actually matters to the business

This is a great episode of Redmonk's People Over Process podcast:
http://redmonk.com/cote/2007/11/02/open-source-in-it-management-with-john-willis-redmonk-radio-44/

For anyone interested in systems management or automating operations, this one is not to be missed. The interview is with John Willis (master independent Tivoli consultant) on the state of the enterprise systems management world.

The most impressive part is John's retelling of his conference favorite, the "What does Bob want?" story. This modern business fable (based on a true story) really should strike a nerve in anyone who has been involved in systems management implementations. We've all heard terms like "business and IT alignment", but how often does it really happen? What may seem like a success to the guys in the trenches will seem like a letdown (at best) or a failure (at worst) to the business leader who signed the check.

Or as the interviewer, Coté from Redmonk, puts it:
This is about "understanding what it is that the company wants to accomplish with the software, not just making the software do what it does"

Thursday
Jan032008

Teaming with the Open Management Consortium on a Software Operations Design Pattern Repository

After Alex's post yesterday on the need for design patterns, he contacted the Open Management Consortium (OMC) about setting up a Design Pattern Repository specifically for those who are creating operations solutions.

Whurley (fearless leader of the OMC) liked the idea:

"Well, as you all know this is exactly how we want the OMC to operate; community lead. So we have created a new workspace under the "Open Standards" section of the website called "OMC Design Patterns". Thanks to ahonor for the idea and for volunteering to kick things off and help manage the workspace. You can link directly to the workspace (from your blog or other sites) using the following URL:

http://beta.openmanagement.org/community/open_standards/omc_design_patterns

It will be very interesting to see how much adoption this idea picks up. I for one will be participating heavily in the workspace as ahonor has a great idea/perspective that I hope others join in support of."

Be sure to subscribe to that section of the OMC site and join in the discussion.

Wednesday
Jan022008

Where are the design patterns for software operations?

In the world of software development, application developers are accustomed to drawing from the wealth of design patterns that address common programming problems, codify best practices, and establish proven reusable solutions. There are several well-known design pattern repositories that catalog solutions into various categories, from the fundamental ones described by the GangOfFour to architecture-specific ones like J2EE Patterns, and even ones for social organization. An Anti-pattern is a pattern that tells how to go from a problem to a bad solution. Design patterns help avoid re-inventing solutions and, when combined, can form the basis of a problem-solving "playbook." When used effectively, design patterns become a common problem-solving language and can lead to better-written software.

But what happens after the code is written? For most organizations today, software operations - the acts of deploying, configuring, and operating software (and all of its related code and data artifacts) - is arguably as important as writing the software itself. If such an organization can't efficiently and reliably operate the software, the quality of the software will not matter. But if one looks for design patterns that codify best practices for automating software operations, nothing turns up. Where is the catalog of design patterns that address the problems encountered when managing environments of software deployments and the overall life cycle of the business service?

Anyone who has managed software operations for different organizations will recognize the same kinds of problems and will often re-invent solutions that were successful in the past. Others who work closer to the bleeding edge will encounter problems that other groups will face later. If these problems could be discussed in terms of design patterns (or failures as anti-patterns), solutions and best practices for managing software operations would be more consistent across organizations.

Here are two specific problem areas that everyone can identify with:

  • Packages: Depending on the application and infrastructure, one will find multiple package formats in use. Operating systems use their own (e.g., .rpm, .deb, .pkg, .msi) and so do software runtime environments (e.g., Java, .NET). Each format has its own way (to a greater or lesser extent) of being created, extracted, and described (including dependencies). These differences lead to multiple package silos and administrative gray areas (cumbersome handoffs between dev and admin groups). It would be preferable to have a common repository that can host any kind of package type, and a homogeneous interface for controlling their life cycle (creation, installation and removal).

  • Services: At a certain level, one can view applications as a set of interacting long-running processes. Again, depending on the application architecture, these processes might be standalone Unix-style daemons or Windows services. Each service has its own way of being started or stopped, as well as a procedure for checking its current runtime state. Often, shutting down a service is not a simple matter of invoking a single command. Things go wrong at shutdown, requiring other logic to figure out the next course of action. Besides coping with these differences, the deployment process is also difficult because changes of runtime state and software package installation are intertwined. Software operations would benefit from a body of design patterns that describe proven strategies for managing runtime state and a common model for describing these states.
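
To make the "common repository with a homogeneous interface" idea from the Packages item a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names PackageSpec, PackageHandler, RpmHandler, JarHandler and PackageRepository are hypothetical, as are the file naming scheme and the /opt/app/lib path. The handlers only build command lines as argument lists and never execute them.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageSpec:
    """Format-neutral description of a package (hypothetical model)."""
    name: str
    version: str
    fmt: str                 # e.g. "rpm" or "jar"
    depends_on: tuple = ()   # names of packages that must be installed first


class PackageHandler(ABC):
    """One handler per package format; every handler exposes the same life cycle."""

    @abstractmethod
    def install_command(self, spec):
        """Return the install command as a list of arguments."""

    @abstractmethod
    def remove_command(self, spec):
        """Return the removal command as a list of arguments."""


class RpmHandler(PackageHandler):
    def install_command(self, spec):
        return ["rpm", "-i", f"{spec.name}-{spec.version}.rpm"]

    def remove_command(self, spec):
        return ["rpm", "-e", spec.name]


class JarHandler(PackageHandler):
    LIB_DIR = "/opt/app/lib"  # assumed location, purely illustrative

    def install_command(self, spec):
        return ["cp", f"{spec.name}-{spec.version}.jar", self.LIB_DIR]

    def remove_command(self, spec):
        return ["rm", f"{self.LIB_DIR}/{spec.name}-{spec.version}.jar"]


class PackageRepository:
    """Hosts packages of any registered format behind a single interface."""

    def __init__(self):
        self._handlers = {}
        self._packages = {}

    def register_handler(self, fmt, handler):
        self._handlers[fmt] = handler

    def add(self, spec):
        if spec.fmt not in self._handlers:
            raise ValueError(f"no handler registered for format {spec.fmt!r}")
        self._packages[spec.name] = spec

    def plan_install(self, name):
        """Return install commands for `name`, with dependencies first."""
        plan, seen = [], set()

        def visit(pkg_name):
            if pkg_name in seen:
                return
            seen.add(pkg_name)
            spec = self._packages[pkg_name]
            for dep in spec.depends_on:
                visit(dep)
            plan.append(self._handlers[spec.fmt].install_command(spec))

        visit(name)
        return plan


if __name__ == "__main__":
    repo = PackageRepository()
    repo.register_handler("rpm", RpmHandler())
    repo.register_handler("jar", JarHandler())
    repo.add(PackageSpec("java-runtime", "1.6", "rpm"))
    repo.add(PackageSpec("webapp", "2.1", "jar", depends_on=("java-runtime",)))
    for command in repo.plan_install("webapp"):
        print(" ".join(command))
```

The point of the sketch is the shape, not the details: the caller asks for "webapp" and never needs to know that one dependency lives in the operating system's package silo and the other in the Java one.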

Here is a sampling of general recurring problems in the world of software operations:

  • Complex application deployments: Applications are based on technologies from different vendors, are spread out over numerous machines in multiple environments, and use different architectures.

  • Inconsistent management interfaces: Every application component and supporting piece of infrastructure has a different way of being managed. This includes both how components are controlled and how they are configured.

  • Hard to scale administrative management: As the layers of software components increase, so does the difficulty of coordinating actions across them. This is especially difficult when the same application can be set up to run in a minimal footprint in one place and designed to support massive load and redundancy in another.

  • Incoherent life cycles: Applications are typically multi-tiered, where each tier may be on its own development track and use its own release paradigm and requisite tools.

Generally, these problems are found in combination, which makes coping with them as a whole a difficult challenge.


What's needed: Domain-specific patterns for software operations

 

The body of existing design patterns can and should be used to analyze and solve some of the above problems. To make design patterns more readily useful to software operations, we need a set of domain-specific patterns. These patterns would be expressed in terms of concepts familiar to software operations groups (e.g., package, service, process, node) and would be geared to coping with the typical problems they face (e.g., various startup and shutdown strategies for services, among many others). Ideally, these patterns can be composed into a system of patterns that helps solve larger-scale problems.
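
As one illustration of what a domain-specific pattern might look like, here is a rough Python sketch of an "escalating shutdown" strategy written against a uniform service interface. This is only an assumption about how such a pattern could be expressed, not an established catalog entry: Service, State, escalating_shutdown and StubbornService are hypothetical names, and the stub service is an in-memory stand-in rather than a real daemon.

```python
import time
from enum import Enum


class State(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


class Service:
    """Hypothetical uniform interface over Unix daemons, Windows services, etc."""

    def status(self):
        raise NotImplementedError

    def stop(self):
        """Ask the service to shut down gracefully."""
        raise NotImplementedError

    def kill(self):
        """Forcibly terminate the service."""
        raise NotImplementedError


def escalating_shutdown(service, attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Request a graceful stop, poll the state, and force a kill if needed.

    Returns "already-stopped", "stopped" or "killed". Raises RuntimeError
    if the service is still running after the forced kill.
    """
    if service.status() is State.STOPPED:
        return "already-stopped"

    service.stop()
    for _ in range(attempts):
        if service.status() is State.STOPPED:
            return "stopped"
        sleep(wait_seconds)

    # Graceful shutdown did not finish in time, so escalate.
    service.kill()
    if service.status() is State.STOPPED:
        return "killed"
    raise RuntimeError("service still running after forced kill")


class StubbornService(Service):
    """In-memory stand-in that ignores graceful stop requests."""

    def __init__(self):
        self._state = State.RUNNING

    def status(self):
        return self._state

    def stop(self):
        pass  # simulates a shutdown that hangs

    def kill(self):
        self._state = State.STOPPED


if __name__ == "__main__":
    outcome = escalating_shutdown(StubbornService(), wait_seconds=0.1)
    print(outcome)
```

Because the strategy depends only on status, stop and kill, the same procedure could in principle be reused for any service type that implements those three operations, which is the kind of reuse a shared pattern vocabulary is meant to enable.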

Developing patterns is a bit of an organic process, but the most durable patterns are ones that have been proven over and over again in different contexts. The first step is to establish a repository to which various patterns can be contributed and a supporting forum where their merits can be discussed. Ultimately, the software operations community will find consensus about some of these patterns, thus establishing some common vocabulary and a basis for framework development.

External links:
PortlandPatternRepository
Hillside

Monday
Dec032007

Operations Above the Level of a Single Device

Tim O'Reilly has another excellent post about the concept of "software above the level of a single device". While most of his post is dedicated to famous consumer-oriented client/services offerings like iTunes, if you glance around our industry it's quickly apparent that the concept of "software above the level of a single device" is the predominant vision amongst leading business strategists and software architects. But once that thinking is translated into actual code, the care and feeding of those innovative services tends to look a lot like decades-old box-by-box thinking. Lots of hand-editing of configs, lots of manually maintained procedural scripts, lots of jumping from log file to log file looking to divine the root cause of a problem or outage, lots of pain and inefficiency everywhere.

As the elegance and distributed nature of business ideas and their resulting software have improved, the prevalent thinking about how you support it surprisingly has not. Sure, there are projects like ControlTier, SmartFrog, Puppet, CFEngine, etc. that deliver on the promise of a better way, but the collective thinking of the industry has a long way to go before these kinds of solutions become standard. Maybe we need a catchy tagline to close that gap... "Operations Above the Level of a Single Device", anyone?

Thursday
Nov152007

Sun and IBM virtualization announcements are more than just a move on VMware

Both Sun (launched with Oracle's PR help) and IBM have been making big waves in the virtualization space. On the surface these are clear moves on VMware/EMC, the current collector of the vast majority of revenue around virtualization. But under the surface there are some other interesting things going on.

First, does anyone else agree that this is a way to give the "on-premises" ecosystem (including hardware sales, services groups, and partners) a way to fight back against the scalability and manageability promises of SaaS alternatives? It's a safe bet that some form of a "do-it-yourself SaaS is good" message will be foisted upon the industry in the upcoming year.

Second, both initiatives talk a lot about "data center management", which is generally code for deployment automation and monitoring. I'm glad to see increasing recognition of this problem space. But I am surprised that very little has come down the pipe in terms of new development on true deployment or configuration management tooling. The default answer still seems to be to let Global Services (or one of Sun's services partners) come in and build a custom solution from a grab bag of tools.

Of course there are open source tools like ControlTier (service provisioning targeted at developers and deployment engineers) and Puppet (system provisioning targeted at systems administrators), but you hear little from the industry titans about the need for these types of tools. Follow the money for the answer why.