Wednesday, Mar 28, 2012

Kanban and DevOps Roundtable (Video)

OK, so it's more of a semicircle than a roundtable... I was at the first-ever Kanban for DevOps class this past week in Sunnyvale, CA, and after looking around the room I couldn't let these folks go without getting them on video:
- Luke Kanies (Puppet Labs)
- John Willis (Enstratus)
- Gene Kim (Author)
- Dominica DeGrandis (David J. Anderson & Associates)

Lucky for our readers, they didn't disappoint. We talk about why we think Kanban is an excellent tool for solving DevOps flow problems and our Kanban experiences thus far. 

Here is the video:

 

Update: If you are in the Atlanta area, John Willis has started the Atlanta Limited WIP Society!

Sunday, Mar 18, 2012

DevOps Lessons from Lean: Small Batches Improve Flow

DevOps problems are fundamentally flow problems. Work doesn't flow properly from one end of the lifecycle (Dev) to the other end of the lifecycle (Ops).

While spirited discussions on tools are a regular occurrence in DevOps circles, there are other simple, yet profound, techniques that have nothing to do with technology but have proven to have a huge impact on improving flow.

Top of that list? Work in small batches.

It seems so simple that it couldn't possibly make that big of a difference, but it does. And there is historical precedent for it as well. The principle of working in small batches has proved its merit in Agile software development and, on an even larger stage, in the manufacturing revolutions of the 1970s and 1980s.

The reasons why working in small batches has such a strong net positive impact on flow might seem a bit counterintuitive at first. Rather than rely on "because I told you so", below are the best explanations I could find as to why this works.

 

What is a "batch size"?
A batch is the unit of work that passes from one stage to the next stage in a process. The batch size is the scale of that work product.

 

What are the benefits of reducing batch sizes?

Reduces cycle time and gets you quicker feedback - With a small batch size, each batch makes it through the full lifecycle quicker. Since work on a feature isn't complete until it is successfully running in production and getting feedback from users, large batch sizes simply delay that feedback. This means the larger the batch, the longer you wait to find out if you did it right. It's easier to make business and technical decisions, and easier to recover from a mistake, if you are working on shorter time horizons.
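To put rough numbers on that, here is a minimal, hypothetical sketch (the delivery rate and batch sizes are assumptions of mine, not figures from any source cited here):

```python
# Hypothetical illustration: time until the first user feedback for a
# fixed delivery rate, under two batching policies. A batch only yields
# feedback once the whole batch ships.

RATE = 2.0  # work items completed per day (assumed)

def days_until_first_feedback(batch_size, rate=RATE):
    """Days until the first batch is live and producing user feedback."""
    return batch_size / rate

print(days_until_first_feedback(40))  # one 40-item batch: 20.0 days with no feedback
print(days_until_first_feedback(5))   # 5-item batches: first feedback after 2.5 days
```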

Reduces risk of an error or outage - With a small batch size, you reduce the amount of complexity that has to be dealt with at any one time by the people working on the batch. The reduction in complexity comes not only from the number and size of the moving parts touched while working on the batch, but also from the amount of person-to-person communication that needs to happen (due to smaller teams). This is just acknowledging the natural limitations of human beings: the more complexity people have to deal with, the more mistakes there will be. Smaller batch sizes also lead to quicker feedback, so if there is an error in the batch it will be caught sooner. A small batch size lends itself well to quicker problem detection and resolution (the field of focus in addressing the problem is contained to the footprint of that small batch, and the work is still fresh in everyone's mind).

Reduces product risk - This builds on the idea of faster feedback. The sooner you can put an individual feature in front of your target audience, the sooner you will know if you've achieved the right product and market fit. The larger the batch size, the greater the product risk when you finally release that batch. Statistics shows us that it's beneficial to decompose a large risk into a series of small risks. For example, bet all of your money on a single coin flip and you have a 50% chance of losing all of your money. Break that bet into 4 smaller sequential bets and it would take losing all 4 to result in financial ruin (1 in 16, or a 6.25% chance of losing all of your money; the sketch after this item checks the arithmetic).

Large batch sizes also often lead to compounding schedule delays and cost overruns. The larger the batch, the more likely it is that a mistake was made in estimating or during the work itself. The chance and potential impact of these mistakes compound as the batch size grows, increasing the delay in getting that all-important feedback from users and increasing your product risk.
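A minimal sketch of the coin-flip arithmetic above:

```python
# Verifying the coin-flip example: probability of total loss when one
# large fair bet is split into several smaller sequential fair bets.

def ruin_probability(num_sequential_bets):
    """Chance of losing every one of a series of fair coin-flip bets."""
    return 0.5 ** num_sequential_bets

print(ruin_probability(1))  # 0.5    -> 50% chance of ruin on a single big bet
print(ruin_probability(4))  # 0.0625 -> 1 in 16, i.e. 6.25%
```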

Improves efficiency and lowers overhead - Conventional wisdom holds that large batches allow greater productivity (i.e. you get more done with large uninterrupted periods of work) and lower overhead (fewer batches = lower transaction costs). As has been proven in the manufacturing world (Lean) and now in software development (Agile), this simply isn't the case. The larger the scope of the batch, the more complexity the individual has to deal with. The complexity of a debug task grows as 2ⁿ when n things are changed in one batch (a quick sketch after this item shows how fast that grows). In knowledge work, larger uninterrupted periods of work lead to greater change complexity, a greater volume of debug work, and more handoff complexity. That is all added overhead. But even assuming the individual was still being more efficient by working in a large batch, you would still be creating greater inefficiency for the end-to-end process.

For a large batch of changes, especially those made to an even larger system, the handoff to the next step in the process is going to be highly inefficient for the receiving party to deal with (think: the Development to Operations "toss it over the wall" handoff of a major release). And if something goes wrong, the time between when the error was introduced and when it is discovered is so long that it is no longer fresh in the mind of the person who introduced it. Small batches have also been shown to actually reduce transaction costs, thanks to a curious fact of human nature: people get better at, and find ways to improve, the things they are forced to do more often.
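That 2ⁿ figure is just counting: if n independent changes went into a batch, any subset of them could be implicated in a failure, and there are 2ⁿ such subsets. A quick sketch:

```python
# With n changes in a batch, any subset of those changes could be the
# culprit behind a failure, so the debug search space grows as 2^n.

def debug_search_space(n_changes):
    """Number of change subsets that might explain a failure."""
    return 2 ** n_changes

for n in (1, 2, 5, 10, 20):
    print(n, debug_search_space(n))
# 1 2
# 2 4
# 5 32
# 10 1024
# 20 1048576
```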

Improves management visibility and control - Reducing batch sizes gives you a greater number of instrumentation points by which you can visualize and measure the flow of work through your organization. It's notoriously difficult to accurately determine the progress of in-flight work. You are largely limited to the subjective analysis of project managers and the biased opinion of the person doing the work. The only points where you can have certainty are when the work has just started or when the work has just been completed (and accepted by the next step in the process). With large batch sizes you have to wait long periods of time between those start and completion points, making it difficult to see how things are flowing, providing little guarantee that you will have adequate warning if things are going wrong, and allowing few opportunities to make adjustments to optimize or triage. With small batch sizes you can see work move through the lifecycle with certainty, spot problems early, and make ongoing adjustments to optimize the flow of delivery.
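One hypothetical way to make that concrete (the project size and delivery rate below are invented purely for illustration): for a fixed amount of work, the batch size determines both how many certain data points you get and how long the gaps of uncertainty between them are.

```python
# The only certain measurement points are batch completions. Smaller
# batches mean more checkpoints and shorter gaps between them.
# All numbers here are hypothetical.

TOTAL_ITEMS = 60  # items in the project (assumed)
RATE = 3          # items completed per day (assumed)

def visibility(batch_size):
    completions = TOTAL_ITEMS // batch_size    # certain data points
    gap_in_days = round(batch_size / RATE, 2)  # days between those points
    return completions, gap_in_days

print(visibility(60))  # (1, 20.0)  -> one data point, 20 days in the dark
print(visibility(5))   # (12, 1.67) -> twelve data points, under two days apart
```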

Encourages decoupled architectures with fewer dependency issues - Smaller batch sizes can also have a positive impact on architecture. Most IT systems are built within the context of large projects. Large projects create them and then large projects are undertaken to change them. The result is a built-in tolerance for monolithic architectures with complex dependencies. As you move to small batch sizes you naturally limit the work in progress on a particular segment of your code/infrastructure. While initially this might seem like it will slow the organization down, the principles of flow show that it will actually give you greater throughput over time. But in order to speed things up even further, you will end up looking for ways to increasingly decouple and isolate (including making fault tolerant) parts of your architecture to allow for greater parallelization of work.

 

What are the economic benefits of reducing batch size?
In manufacturing and in software development, reducing batch sizes has been shown to have a significant impact on the economics of the production process. The diagram below (scanned from Donald G. Reinertsen's "The Principles of Product Development Flow", pg 121) lays out the direct links between smaller batch sizes and improved economics. I think the logic speaks for itself.

[Diagram: links between smaller batch sizes and improved economics, from Reinertsen's "The Principles of Product Development Flow", pg 121]

What are your control points for reducing batch sizes?
Reducing batch sizes is a policy decision that needs to be implemented at multiple levels: 

Project Initiation and Funding - How projects are formed and funded tends to have a strong correlation to batch size. The definition of requirements and success criteria, in addition to the allocation of budget, is usually done in a large batch that corresponds to a specific business goal, or set of goals, created at the quarterly or yearly scale. The inertia of this large batch is often carried through the rest of the lifecycle, becoming a pacemaker of sorts that encourages large batch sizes. Work done to break these large initial batches into smaller ones can turn that inertia into a net positive effect for the company. Reducing the time horizon for the expected results of a project is usually a good way to force the issue (e.g. try scoping and budgeting projects to single-month size rather than quarter or multi-quarter size).

Project management - When creating projects, consider the smallest amount of change that can be undertaken in the shortest amount of time while still achieving a measurable result. This will naturally lead to smaller teams working on smaller batches of work that can flow independently through the lifecycle, with faster feedback and lower risk to the overall system.

Testing - Demand that individual pieces of work are tested as soon as they are completed (rather than waiting for the entire project/release to be code complete). Continuous integration, with its built-in unit/smoke tests, is a crude example of this principle. Carry that further: ensure that full deployment and testing efforts are ongoing during any project. This will automatically force engineers to think about their work in small units that can be completed and handed off for testing at regular intervals (naturally creating the urge to reduce batch sizes). A simplified sketch of such a per-change pipeline follows this list.

Release management - Break down large releases into small units of deployment that employ standardized packaging and configuration management mechanisms. These units of deployment should be aligned with the things that are changed (i.e. application services) rather than with large project releases that change many things. In addition to reducing deployment and configuration woes, this also has the effect of standardizing batch sizes across the lifecycle by determining the appropriate unit of change for your infrastructure.
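To tie the testing and release management points together, here is a deliberately simplified sketch of a per-change pipeline. Every function name here is hypothetical; the real steps would be your own build, deploy, and test tooling.

```python
# Deliberately simplified, hypothetical per-change pipeline: every small
# unit of work is tested and handed off as soon as it is complete, rather
# than waiting for a project-sized release. All names are illustrative.

def run_unit_tests(change):
    print(f"unit tests passed for {change}")

def deploy_to_staging(change):
    print(f"{change} deployed to staging")

def run_smoke_tests(change):
    print(f"smoke tests passed for {change} in staging")

def per_change_pipeline(changes):
    for change in changes:       # one small batch at a time
        run_unit_tests(change)
        deploy_to_staging(change)
        run_smoke_tests(change)  # feedback arrives per change, not per release

per_change_pipeline(["login-fix", "search-index-update"])
```

The point isn't the specific steps; it's that each small unit of work completes the full test-and-deploy loop on its own, so feedback and risk both scale with the change, not with the release.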

 

I'm standing on the shoulders of people a lot smarter than me in this post. If you are interested in these ideas please check out:
http://www.amazon.com/Principles-Product-Development-Flow-Generation/dp/1935401009/ref=cm_cr_pr_product_top
http://www.startuplessonslearned.com/2009/02/work-in-small-batches.html
http://www.dbrmfg.co.nz/Production%20Batch%20Issues.htm
http://www.informit.com/articles/article.aspx?p=1833567&seqNum=3

Thursday, Mar 15, 2012

Kanban for DevOps Class and Meetup

I normally don't announce commercial classes... but this is a special case. On Thursday (3/22) and Friday (3/23) there is going to be a Kanban for DevOps class here in Silicon Valley (Sunnyvale, CA to be precise).

The class is being taught by Dominica DeGrandis, who has been following the DevOps movement for David J. Anderson & Associates (widely recognized as an authoritative source of Kanban knowledge and training... probably because they wrote the book).

Kanban has been around for a while in the Dev world (and of course for decades in the manufacturing world). Recently, the idea of using Kanban to bring Dev and Ops together under a common visualization of flow has been picking up steam in the DevOps community. This is going to be the first class of its kind. In addition to Dominica's curriculum, I'm just as excited about the interaction with the other attendees (Gene Kim, Alex Honor, John Willis, etc.). The discount rate is $800 (use the code DEVOPSDELIVER). I have no financial stake in the class. I just think it will be well worth it. If you have questions please direct them to Dominica at dominica@djaa.com.

But wait, there's more...

On the evening of Thursday (3/22), we are going to be having a special edition of the Silicon Valley DevOps Meetup. Most of the attendees from the Kanban for DevOps class are going to be there. Of course we'll talk a lot about DevOps and Kanban, but Gene Kim is also going to be there to discuss the research behind (and perhaps do some live reading from) his new book "When IT Fails: The Novel". Like all SV DevOps Meetups, this is a free event (and rumor has it that Enstratus is springing for the pizza).

 

Wednesday, Jan 25, 2012

Crowbar is quietly getting more interesting (video)

Crowbar is an interesting project that I've covered before. Born out of Dell's cloud group, it generated initial buzz as an installer for the cloud era... "kickstart on steroids", if you will.

Crowbar's close association with the OpenStack project has further cemented its reputation as an installer to watch. But it's Crowbar's quiet potential as a stack management tool that is most interesting. Through the use of barclamps (Crowbar's modules) you can tell Crowbar to build a full stack, from the BIOS config all the way up to your middleware and applications. John Willis, on an episode of DevOps Cafe, called it "Data Center as Code".

Crowbar barclamps are also an interesting way for independent projects or vendors to ensure that their projects/products can be easily integrated into a custom platform (today this type of focus is usually in the context of making things work on OpenStack). Want to add a new component to your platform? Grab the barclamp and Crowbar will know how to do the rest. Or at least that is the promise. The project is still young and the community is still forming.

Leading open source software projects is new territory for Dell as a company, but the Crowbar team does seem committed and community focused. I've heard some grumbles from developers that barclamp development and testing cycles can be a bit tedious due to the nature of what you are building, but there's no reason to believe that those kinds of issues won't get sorted out over time.

A couple of Crowbar related videos are below:

The first video was made by my DTO Solutions colleague, Keith Hudgins, after he wrote a barclamp for Zenoss. It's a short demo and tour that can give you a feel for Crowbar and barclamps.

 

The next video is Barton George (Dell) interviewing Rob Hirschfeld (Dell). They start off talking about the Hadoop barclamp but quickly get into a broader discussion about Crowbar.

 

Thursday, Dec 29, 2011

Value of DevOps Culture: It's not just hugs and kumbaya 

The importance of culture is a recurring theme in most DevOps discussions. It's often cited as the thing you should start with and the thing you should worry about the most.

But other than the rather obvious idea that it's beneficial for any company to have a culture of trust, communication, and collaboration... can using DevOps thinking to change your culture actually provide a distinct business advantage?

Let's take the example of Continuous Deployment (or its sibling, Continuous Delivery). This is an operating model that embodies a lot of the ideals you'll hear about in DevOps circles, and it is impossible to properly implement if your org suffers from DevOps problems.

Continuous Deployment is not just a model where companies can release services quicker and more reliably (if you don't understand why that is NOT a paradox, please go read more about Continuous Deployment). Whether or not you think it could work for your organization, Continuous Deployment is a model that has been proven to unleash the creative and inventive potential of the organizations that adopt it. Because of this, Continuous Deployment is a good proxy for examining the effects of solving DevOps problems.

Eric Ries sums it up better than I can when he describes the transformative effect that takes place the further you can reduce the cost, friction, and time between releases (i.e. tests to see if you can better please the customer).

"When you have only one test, you don’t have entrepreneurs, you have politicians, because you have to sell. Out of a hundred good ideas, you’ve got to sell your idea. So you build up a society of politicians and salespeople. When you have five hundred tests you’re running, then everybody’s ideas can run. And then you create entrepreneurs who run and learn and can retest and relearn as opposed to a society of politicians."

-Eric Ries
  The Lean Startup (pg. 33) 

 
That's a business advantage. That's value derived from a DevOps-style change in culture.