Wednesday
Jan 25, 2012

Crowbar is quietly getting more interesting (video)

Crowbar is an interesting project that I've covered before. Born out of Dell's cloud group, much of the initial buzz described it as an installer for the cloud era... "kickstart on steroids", if you will.

Crowbar's close association with the OpenStack project has further cemented its reputation as an installer to watch. But it's Crowbar's quiet potential as a stack management tool that is the most interesting. Through the use of barclamps (Crowbar's modules) you can tell Crowbar to build a full stack from the BIOS config all the way up to your middleware and applications. John Willis, on an episode of DevOps Cafe, called it "Data Center as Code".

Crowbar barclamps are also an interesting way for independent projects or vendors to ensure that their projects/products can be easily integrated into a custom platform (today this type of focus is usually in the context of making things work on OpenStack). Want to add a new component to your platform? Grab the barclamp and Crowbar will know how to do the rest. Or at least that is the promise. The project is still young and the community is still forming.

Leading open source software projects is new territory for Dell as a company, but the Crowbar team does seem committed and community focused. I've heard some grumbles from developers that barclamp development and testing cycles can be a bit tedious due to the nature of what you are building. But there is no reason to believe that those types of issues won't get sorted out over time.

A couple of Crowbar related videos are below:

The first video was made by my DTO Solutions colleague, Keith Hudgins, after he wrote a barclamp for Zenoss. It's a short demo and tour that can give you a feel for Crowbar and barclamps.

 

The next video is Barton George (Dell) interviewing Rob Hirschfeld (Dell). They start off talking about the Hadoop barclamp but quickly get into a broader discussion about Crowbar.

 

Thursday
Dec 29, 2011

Value of DevOps Culture: It's not just hugs and kumbaya 

The importance of culture is a recurring theme in most DevOps discussions. It's often cited as the thing you should start with and the thing you should worry about the most.

But other than the rather obvious idea that it's beneficial for any company to have a culture of trust, communication, and collaboration... can using DevOps thinking to change your culture actually provide a distinct business advantage?

Let's take the example of Continuous Deployment (or its sibling, Continuous Delivery). This is an operating model that embodies a lot of the ideals that you'll hear about in DevOps circles and is impossible to properly implement if your org suffers from DevOps problems.

Continuous Deployment is not just a model where companies can release services quicker and more reliably (if you don't understand why that is NOT a paradox, please go read more about Continuous Deployment). Whether or not you think it could work for your organization, Continuous Deployment is a model that has been proven to unleash the creative and inventive potential of other organizations. Because of this, Continuous Deployment is a good proxy for examining the effects of solving DevOps problems.

Eric Ries sums it up better than I can when he describes the transformative effect that takes place the further you can reduce the cost, friction, and time between releases (i.e. tests to see if you can better please the customer).

"When you have only one test, you don’t have entrepreneurs, you have politicians, because you have to sell. Out of a hundred good ideas, you’ve got to sell your idea. So you build up a society of politicians and salespeople. When you have five hundred tests you’re running, then everybody’s ideas can run. And then you create entrepreneurs who run and learn and can retest and relearn as opposed to a society of politicians."

-Eric Ries
  The Lean Startup (pg. 33) 

 
That's a business advantage. That's value derived from a DevOps-style change in culture.

 

 

Monday
Dec 05, 2011

Companies have plenty of monitoring, what they don’t have is control

I was honored to be asked to speak at DevOps Days in Manila and just got off stage. I was blown away when I found out over 400 people signed up to attend. Speaking gives me a chance to unload a bunch of baggage I’ve been carrying around for years.

We all bring a lot of baggage with us into a job. The older you are, the more you bring. The first part of my career I did 10 years of real-time industrial control software design, implementation, and integration way way back before the web 1.0 days. Yes, I wrote the software for the furniture Homer Simpson sat in front of at the nuclear plant that was all sticky with donut crumbs...

I took that manufacturing background baggage to E*TRADE in ’96 where I ran into fellow dev2ops contributor Alex Honor, who brought his Ames Research Laboratory baggage of (at the time) massive compute infrastructure and mobile agents. We used to drink a bunch of coffee and try to figure out how this whole internet e-commerce thing needed to be put together. We’d get up crazy early at 4:30AM, listen to Miles, and watch the booming online world wake up and trade stocks, and by 9:00AM have a game plan formulated to make it better.

My manufacturing background was always kicking in at those times looking for control points. Webserver hits per second, firewall MBits/sec, Auth success or fail per second, trades per second, quotes per second, service queue depths, and the dreaded position request response time. I was quite sure there was a correlation algorithm between these phenomena and I could figure it out if I had a few weeks that I didn’t have. I also knew that once I figured it out, the underlying hardware, software, network, and user demand would change radically, throwing my math off. Controlling physical phenomena like oil, paper, and pharmaceutical products followed the math of physics. We didn’t have the math to predict operating system thread/process starvation, and it took us years to figure out that OS context switches per second pose a huge kernel scalability issue that is not often measured or written about.

One particularly busy morning in late ’96 Alex was watching our webserver, pointed at a measurement on the screen and said, “I think we’re gonna need another webserver”. With that, we also needed to figure out how to loadbalance webservers. As usual for the era, two webservers was a massive understatement. Within a year, there was more compute infrastructure at E*TRADE supporting the HTTPS web pages than the rest of the trading system, and the trading system had been in place for 12 years by this time... Analytics of measurements (accompanied by jazz music) became an important part of our decision making.

 

Alex and I were also convinced in early ’97 that sound manufacturing principles used in the physical world made a ton of sense to apply to the virtual online world of the internet. I’m still surprised the big control systems vendors like Honeywell and Emerson haven’t gotten into data center control. No matter, the DevOps community can make progress on it as it’s so complementary to DevOps goals and it’s what the devops-toolchain project is all about.

Get a bunch of DevOps folks together and the topic of monitoring comes up every time. I always have to ask “Are you happy with it?” and the answer is always “no” (though I don’t think anyone at Etsy was there). When you drill into what’s wrong with their monitoring, you may find that most companies have plenty of monitoring, what they don’t have is control.

Say your app in production runs 100 logins/sec and you are getting nominally 3 username/password failures a second. While the load may go up and down, you learn that the 3% ratio is nominal and in control. If the ratio climbs higher, that may be emblematic of a script kiddie running a dictionary attack, the password hash database being offline, or an application change making it harder for users to properly input their credentials. If it drops down, that may indicate a professional cyber criminal is running an automated attack and getting through the wire. Truman may or may not have said “if you want a new idea, read an old book”. In this case, you should be reading about “Statistical Process Control” or SPC. It was heavily used during WWII. With our login example, the ratio of successful to failed login attempts would be “Control Charted” and the control chart would evaluate whether the control point was “in control” or “out of control” based on defined criteria like standard deviation thresholds.
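To make the login example concrete, here is a minimal sketch of one common kind of control chart, a p-chart for a failure proportion. It is only an illustration under assumptions of my own: the 60-second sampling window, the 3-sigma limits, and the function names are not from any particular tool, and a real deployment would estimate the nominal ratio from its own baseline data.

```python
import math


def control_limits(nominal_ratio, sample_size, sigmas=3.0):
    """Return (lower, upper) p-chart limits for a failure proportion.

    nominal_ratio: the "in control" failure ratio learned from a baseline
                   (0.03 in the login example).
    sample_size:   login attempts per sampling window (an assumption here).
    sigmas:        width of the band in standard deviations (3 is conventional).
    """
    sigma = math.sqrt(nominal_ratio * (1.0 - nominal_ratio) / sample_size)
    lower = max(0.0, nominal_ratio - sigmas * sigma)
    upper = min(1.0, nominal_ratio + sigmas * sigma)
    return lower, upper


def classify(failures, attempts, nominal_ratio, sigmas=3.0):
    """Label one sampling window as in control or out of control."""
    lower, upper = control_limits(nominal_ratio, attempts, sigmas)
    ratio = failures / attempts
    if ratio > upper:
        return "out of control (high)"
    if ratio < lower:
        return "out of control (low)"
    return "in control"


# One-minute windows at roughly 100 logins/sec, i.e. 6000 attempts each.
print(classify(180, 6000, 0.03))  # 3% failures: in control
print(classify(300, 6000, 0.03))  # 5% failures: out of control (high)
print(classify(60, 6000, 0.03))   # 1% failures: out of control (low)
```

Note that both directions matter: the high case corresponds to the dictionary attack or broken-login scenarios, and the low case to the suspiciously successful automated attack.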

Measurement itself is a very low level construct providing the raw material for the control goal. You have to go through several more toolchain layers before you get to the automation you are looking for. We hit upon this concept in our talk at Velocity in 2010...

Manufacturing has come a long long way since WWII. Toyota built significantly on SPC methodologies, and that work eventually developed into “Lean Manufacturing”, a big part of the reason Toyota became the world’s largest automobile manufacturer in 2008. A key part of lean is Value Stream Mapping which is “used to analyze and design the flow of materials and information required to bring a product or service to a consumer” (wikipedia).

Value Stream Mapping a typical online business through marketing, product, development, qa, and operations flows will, at a minimum, help effectively communicate roles, responsibilities, and work flows through your org. More typically it becomes a tool to get to a “future state” which has eliminated waste and increased the effectiveness of the org, even when nothing physical was “manufactured”. I find agile development, devops, and continuous deployment goals all support lean manufacturing thinking. My personal take is that ITIL has similar goals, but is more of a process-over-people approach than a people-over-process approach, and its utility will depend on the organization’s management structure and culture. I prefer people over process, but I do reference ITIL every time I find a rough or wasteful organizational process for ideas on recommending a future state.

I was lucky enough to catch up with Alex, Anthony, and Damon over dinner and we were talking big about DevOps and Lean. Anthony mentioned that “we use value stream mapping in all of our DevOps engagements to make sure we are solving the right problem”. That really floored me on a few levels. First off, it takes Alex’s DevOps Design Patterns and DevOps Anti-Patterns to the next level, much as Lean built on SPC, by adding a formalism to the DevOps implementation approach. It also adds a self-correcting aspect to a company’s investment in DevOps optimizations. I’ve spoken with many companies who made huge investments in converting to Agile development without any measurable uptick in product deployment rates. While these orgs haven’t reverted back to a waterfall approach, as they like the iterative and collaborative approach, they hit the DevOps gap head on.

“We use Value-Stream Mapping in all of our DevOps engagements to make sure we are solving the right problem”
                                                 -Anthony Shortland (DTO Solutions)

Practitioners of Lean Manufacturing see this all the time. Eliminating one bottleneck just moves the problem downstream to the next bottleneck. To expect greater production rates, you have to look at the value stream in its entirety. If developers were producing motors instead of software functions, a value stream manager would see a huge inventory build-up of motors, which produce no value for the customer, and identify the overproduction as waste. Development is a big part of the value stream and making that more efficient is a really good idea. But the growth of the release backlog is seldom measured or managed. If you treat your business as a Digital Information Manufacturing plant and manage it appropriately to that goal, you can avoid the frequent mistake Anthony and other Lean practitioners are talking about where you solve a huge problem without benefiting the business or the customer.

To sum up, DevOps inspired technology can learn quite a bit from Lean Manufacturing and Value Stream Mapping. This DevOps stuff is really hard and you’ll need to leverage as much as possible. Always remember that “Good programmers are lazy” and it’s good when you apply established tools and techniques. If you don’t think you’re working in a Digital Information Manufacturing plant, I bet your CEO does.

Tuesday
Sep 27, 2011

Video: Marten Mickos and Rich Wolski talk DevOps and Private Clouds

I ran into Marten Mickos and Rich Wolski from Eucalyptus Systems at PuppetConf and got them to sit down for a quick video alongside my fellow dev2ops.org and DevOps Cafe contributor, John Willis.

I had just come out of Marten's keynote where he spoke about DevOps far more than I would have expected. In this video we explore the deep connection between DevOps and Private Clouds as well as other industry changes for which they are planning.

Eucalyptus was one of the first private cloud technologies on the scene, and consequently got the benefit and burden of being the early mover. The community had some ups and downs along the way, but their product and industry vision seem encouraging and warrant a closer look (and never count out Marten Mickos in an open source software battle).

Monday
Sep 26, 2011

Puppet and Chef Rock. Doh. What about all these shell scripts ?! 

Incorporating a next generation CM tool like Puppet or Chef into your application or system operations is a great way to throw control around your key administrative processes.

Of course, to make the move to a new CM tool, you need to adapt your current processes into the paradigm defined by the new CM tool. There is an upfront cost to retool (and sometimes to rethink) but later on the rewards will come in the form of great time savings and consistency. 

Seems like an easy argument. Why can't everybody just start working that way? 

If you are in a startup or a greenfield environment, it is just as simple as deciding to work that way and then individually learning some new skills.

In an enterprise or legacy environment, it is not so simple. A lot of things can get in the way and the difficulty becomes apparent when you consider that you are asking an organization to make some pretty big changes:
  • It's something new: It's a new tool and a new process.
  • It changes the way people work: There's a new methodology on how one manages change through a CM process and how teams will work together.
  • Skill base not there yet: The CM model and implementation language need to be institutionalized across the organization.
  • It's a strategic technology choice: To pick a CM tool or not to pick a CM tool isn't just which one you choose (e.g., Puppet vs Chef). It's about committing to a new way of working and designing how infrastructure and operations are managed.
Moving to a next generation CM tool like Chef or Puppet is a big decision and in organizations already at scale it usually can't be done whole hog in one mammoth step. All too often I've seen organizations realize that the move to CM is a more complicated task than they thought and subsequently procrastinate.

So what are some blocking and tackling moves you can use to make progress?

Begin by asking the question, how are these activities being done right now?
I bet you'll find that most activities are handled by shell scripts of various sorts: old ones, well written ones, hokey rickety hairballs, true works of art. You'll see a huge continuum of quality and style. You'll also find lots of people very comfortable creating automation using shell scripts. Many of those people have built comfortable careers on those skills.

This brings me to the next question, how do you get these people involved in your movement to drive CM? Ultimately, it is these people that will own and manage a CM-based environment so you need their participation. It might be obvious by this point, but I think you should consider how to incorporate the work of the script writers. How long will it take to build up expertise for a new solution anyway? How can one bridge between the old and new paradigms?

The pragmatic answer is to start with what got you there. Start with the scripts but figure out a way to cleanly plug them in to a CM management paradigm. Plan for the two styles of automation (procedural scripting vs CM). Big enterprises can't throw out all the old and bring in the new in one shot. From political, project management, education, and technology points of view, it's got to be staged.

To facilitate this pragmatic move towards full CM, script writers need:
  • A clean consistent interface. Make integration easy.
  • Modularity so new stuff can be swapped/plugged in later.
  • Familiar environment. It must be nice for shell scripters.
  • Easy distribution. Make it easy for a shell scripter to hand off a tool for a CM user (or anybody else for that matter)
Having these capabilities drives the early collaboration that is critical to the success of later CM projects. From the shell scripter's point of view, these capabilities put some sanity, convention and a bit of a framework around how scripting is done. 
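As a rough illustration of what a "clean consistent interface" plus modularity can look like, here is a small sketch of a dispatcher that treats a directory of existing shell scripts as named commands. Everything in it is hypothetical: the modules/<module>/commands/<command>.sh layout, the "module:command" naming, and both function names are assumptions made for this example, not the actual layout or API of rerun or any other tool.

```python
from pathlib import Path


def list_commands(modules_dir):
    """Return sorted 'module:command' names for scripts under modules_dir.

    Assumed (hypothetical) layout: <modules_dir>/<module>/commands/<command>.sh
    """
    names = []
    for script in Path(modules_dir).glob("*/commands/*.sh"):
        module = script.parent.parent.name
        names.append(f"{module}:{script.stem}")
    return sorted(names)


def build_invocation(modules_dir, name, options):
    """Turn 'module:command' plus an options dict into an argv list for sh.

    Every command gets the same calling convention (--key value pairs), so a
    CM tool or another script can call any of them the same way.
    """
    module, _, command = name.partition(":")
    script = Path(modules_dir) / module / "commands" / f"{command}.sh"
    if not script.is_file():
        raise LookupError(f"unknown command: {name}")
    argv = ["sh", str(script)]
    for key, value in sorted(options.items()):
        argv += [f"--{key}", str(value)]
    return argv
```

The resulting argv list could be handed to subprocess.run by a wrapper, or called from a CM tool's exec-style resource. Dropping a new script into a module's commands directory makes it discoverable without touching the dispatcher, and handing a tool to someone else amounts to handing over a directory.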

I know this mismatch between the old shell script way and the new CM way all too well. I've had to tackle this problem in several large enterprises. After a while, a solution pattern emerged. 

Since I think this is an important problem that the DevOps community needs to address, I created a GitHub project to document the pattern and provide a reference implementation. The project is called rerun. It's extremely simple but I think it drives home the point. I'm looking forward to the feedback and hearing from others who have found themselves in similar situations.

For more explanation of the ideas behind this, see the "Why rerun?" page.