How long to retain build output?

Dit artikel is afkomstig van een externe website.

Martin Fowler has recently made a post on the topic of the importance of reproducible builds. This is a vital principle for any process of continuous integration. The ability to recreate any given version of your system is essential, but there are several routes to it if you follow a process of Continuous Delivery (CD).

Depending on the nature of your application reproducibility will generally involve significantly more than only source code. So in the achievement of the ability to step-back in time to the precise change-set that constituted a particular release version of your software, the source code, while significant, is just a fragment of what you need to consider.

Martin outlines some of the important benefits of the ability to accurately, even precisely, reproduce any given release. When it comes to CD there is another. The ability to reproduce a build pushes you in the direction of deployment flexibility. By the time a given release candidate arrives in production it will have been deployed many times in other environments and for CD to make sense, these preceding deployments will be as close as possible to the deployment into production.

In order to achieve these benefits we must then be able to recover more than just the build, we must be able to reconstitute the environment in which that version of the code that your development team created ran. If I want to run a version of my application from a few months ago, I will almost certainly have changed the data-schemas that underly the storage that I am using. The configuration of my application, application server or messaging system may well have changed too.

In that time I have probably upgraded my operating system version, the version of my web server or the version of Java that we are running too. If we genuinely need to recreate the system that we were running a few of months ago all of these attributes may be relevant.

Jez and I describe approaches and mechanisms to achieve this in our book. An essential attribute of the ability of having a reproducible build is to have a single identifier for a release that identifies all of things that represent the release, the code, the configuration, 3rd party dependency versions, even the underlying operating system.

There are many routes to this, but fundamentally they all depend on all of these pieces of the system being held in some form of versioned storage and all related together by a single key. In Continuous Delivery it makes an enormous amount of sense to to use a build number to relate all of these things together.

The important part of this, in the context of reproducible builds, is that talking about the binary vs the source is less the issue than the scope of the reproduction that you need. If you are building an application that runs in on an end-users system, perhaps within a variety of versions of supporting operating environments, then just recreating the output of your commit build maybe enough. However if you are building a large-scale system, composed of many moving parts, then it is likely that the versions of third-party components of your system maybe important to it’s operation. In this instance you must be able to reproduce the whole works if you want to validate a bug and so rebuilding from source is not enough. You may need to be able to rebuild from source, but you will also need to recover the versions of the web-server, java, database, schema, configuration and so on.

Unless your system is simple enough to be able to store everything in source code control, you will have to have some alternative versioned storage. In our book we describe this as the artifact repository. Depending on the complexity of your system this may be a simple single store or a distributed collection of stores linked together through by the relationships between the keys that represent each versioned artefact. Of course the release candidate’s id sits at the root of these relationships so that for any given release candidate we can be definitive about the version of any other dependency.

Whatever the mechanism, if you want genuinely reproducible builds it is vital that the relationships between the important components of your system is stored somewhere and this somewhere should be along with the source code. So your committed code should include some kind of map for ANY system components that your software depends upon. This map is then used by your automated deployment tools to completely reproduce the state of the operating environment for that particular build. Perhaps by retrieving virtual machine images from some versioned storage, or perhaps running some scripts to rebuild those systems to the appropriate starting state.

Because in CD we retain these, usually 3rd party, binary dependencies, and must do so if we want to reproduce a given version of the system, then in most cases we recreate versions from binaries of our code as well as those dependencies because it is quicker and more efficient. On my current project we have never, in more than 3 years, rebuilt a release candidate from source code. However, storing complete, deployable instances of the application can take a lot of storage and while storage is cheap it isn’t free.

So how long is it sensible to retain complete deployable instances of your system? In CD each instance is referred to as a “release candidate” each release candidate has status associated with it indicating that candidate’s progress through the deployment pipeline. The length of time that it makes sense to hold onto any given candidate depends on that status.

Candidates with a status of “committed” are only interesting for a relatively short period. At LMAX we purge committed release candidates that have not been acceptance tested, those that have been skipped-over because a newer candidate was available when the acceptance test stage ran or those that failed acceptance testing. Actually we dump any candidate that fails any stage in the deployment pipeline.

The decision of when to delete candidates that pass later stages is a bit more complex. We keep all release candidates that have made it into production. The combination of rules that I have described so far leaves us with candidates that were good enough to make it into production but weren’t selected (we release at the end of each two week iteration and so some good candidates may be skipped). We hold onto these good, but superseded, candidates for an arbitrary period of a month or two. This provides us with the ability to do things like binary-chop release candidates to see when a bug was introduced or demo an old version of some function for comparison with a new.

We have implemented these policies as a part of our artefact repository so largely it looks after itself.

Presentation on Continuous Delivery at LMAX

Dit artikel is afkomstig van een externe website.

I was recently asked to do a presentation on the topic of Continuous Delivery at the London Tester Gathering.

You can seen a video of the presentation here

In this presentation I describe the techniques and some of the tools that we have applied at LMAX in our approach to CD.

Don’t Feature Branch

Dit artikel is afkomstig van een externe website.

I recently attended the Devoxx conference. One of the speakers was talking on a topic close to my heart, Continuous Delivery. His presentation was essentially a tools demonstration, but one of the significant themes of his presentation was the use of feature-branching as a means of achieving CD. He said that the use of feature-branching was a debatable point within the sphere of CD and CI, we’ll I’d like to join the debate.

In this speaker’s presentation he demonstrated the use of an “integration branch” on which builds were continuously built and tested. First I’d like to say that I am not an opponent of distributed version control systems (DVCS), but there are some ways in which you can use them that compromise continuous integration.

So here is a diagram of what I understood the speaker to be describing, with one proviso, I am not certain at which point the speaker was recommending branching the “integration branch” from “head”.

In this digram there are four branches of the code. Head, the Integration branch and two feature branches. The speaker made the important point that the whole point of the the integration branch is to maintain continuous integration, so although feature branches 1 and 2 are maintained as separate branches, he recommended frequent merges back to the Integration branch. Without this any notion of CI is impossible.

So the Integration branch is a common, consistent representation of all changes. This is great, as long as each of these merges happens with a frequency of more than once per day this precisely matches my mental model of what CI is all about. In addition, providing that all of the subsequent deployment pipeline stages are also run against each change in the integration branch and releases are made from that branch this matches my definition of a Continuous Delivery style deployment pipeline too. The first problem is that if all of these criteria are met, then the head branch is redundant – the integration branch is the real head, so why bother with head at all? Actually I keep the integration branch and call it head!

There is another interpretation of this that depends on when the integration branch is merged to head, and this is what I think the speaker intended. Let’s assume that the idea here is to allow the decision of which features can be merged into the production release, from head, late in the process. In this case the integration branch, still running CI on the basis of fine-grained commits, is evaluating a common shared picture of all changes on all branches. The problem is that if a selection is made at the point at which integration is merged back to head then head is not what was evaluated, so either you would need to re-run every single test against the new ‘truth’ on head or take the risk that your changes will be safe (with no guarantees at all).

If you run the tests and they fail, what now? You have broken the feedback cycle of CI and may be seeing problems that were introduced at any point in the life of the branches and so may be very complex to diagnose or fix. This is the very problem that CI was designed to eliminate.

Through the virtues of CI on the integration branch, at every successful merge into that branch, you will know that features represented by feature branches 1 and 2 work successfully together. What you can’t know for certain is that either of them will work in isolation – you haven’t tested that case. So if you decide to merge only one of them back to head, you are about to release a previously untested scenario. Depending on your project, and your the nature of your specific changes, you may get away with this, but that is just luck. This is a risk that genuine CI and CD can eliminate, so why not do that instead and reduce the need to depend on luck?

Further, as I see it the whole and only point of branching is to isolate changes between branches, this is the polar opposite of the intent of CI, which depends upon evaluating every change, as frequently as practical, against the shared common picture of what ‘current’ means in the system as a whole. So if the feature branches are consistently merging with the integration branch, or any other shared picture of the current state of the system – like head, then it isn’t really a “feature branch” since it isn’t isolated and separate.

Let’s examine an alternative interpretation, that in this case I am certain that the speaker at the conference didn’t intend. The alternative is that the feature branches are real branches. This means that they are kept isolated, so that people working on them can concentrate on those changes and only those changes without worrying about what is going on elsewhere. This picture represents that case – just to be clear, this is a terrible idea if you mean to benefit from CI!

In this case feature branch 1 is not merged with the integration branch, or any other shared picture, until the feature is complete. The problem is that when feature branch 2 is merged it had no view of what was happening on feature branch 1 and so the merge problem it faces could be nothing at all or represent days or even weeks of effort. There is no way to tell. The people working independently on these branches cannot possibly predict the impact of the work elsewhere because they have no view of it. This is entirely unrelated to the quality of merge tools, the merge problems can be entirely functional, nothing to do with the syntactic content of the programming language constructs. No merge tool can predict that the features that I write and features that you write will work nicely together, and if we are working in isolation we won’t discover that they don’t until we come to the point of merge and discover that we have evolved fundamentally different, incompatible, interpretations. This horrible anti-pattern is what CI was invented to fix. For those of us that lived through projects that suffered, all to common, periods of merge-hell before we adopted CI never want to go back to it.

So I am left with two conclusions. One, for me the definition of CI is that you must have a single shared picture of the state of the system and every change is evaluated against that single shared picture. The corollary of this is that there is no point having a separate integration branch, rather release from head. My second conclusion is that either these things aren’t feature branches and so CI (and CD) can succeed, or they are feature branches and CI is impossible.

One more thought, feature-branching is a term that is, these days, closely associated with DVCS systems and their use, but I think it is the wrong term. For the reasons that I have outlined above these are not real branches, or they are incompatible with CI (one or the other). The only use I can see for a badly mis-named idea of “feature branching” is that if you maintain a separate branch in you DVCS, but compromise the isolation of that branch to facilitate CI, then you do have an association between all of the commits that represent the set of changes that are associated with particular feature. Not something that I can see an immense amount of value in to be honest, but I can imagine that it may be interesting occasionally. If that is the real value then I think it would benefit from a different name. This is much less like a branch and more like a change-set or more accurately in configuration management terms a collection of change-sets.

Organize software delivery around outcomes, not roles: continuous delivery and cross-functional teams

Dit artikel is afkomstig van een externe website.

Translations: 中文

When implementing continuous delivery, it’s easy to focus on automation and tooling because these are usually the easiest things to start with. However continuous delivery also relies for its success on optimizing your organizational structure for throughput. One of the biggest barriers we at ThoughtWorks have seen to continuous delivery is teams organized by role or by tier, rather than by business outcome. In this post I’ll address the root cause of this problem, and how to overcome it.


The Devops movement emerged from a frustration with the engineering, testing, and operations silos that have been created in IT organizations. Why do these silos exist? There is a quote from Coleen Young of Gartner that cuts to the core of this issue:

Virtually every IT organization must face a process transformation [which will] inevitably drive radical changes in organizational structure. Traditional IT service delivery and organizational models achieved efficiency at the expense of effectiveness. [At one time when computing was expensive and resources were scarce] it made sense to maximize the utilization and life cycle costs of assets… This approach to resource orchestration inevitably resulted in functional silos. The optimized process-based organization is horizontally focused on outcomes, not vertically oriented around skills1.

Creating silos is a rational response to the historical expense of computing resources and the high transaction cost of putting out a release. There is a direct relationship between batch size and this transaction cost, expressed in the Economic Lot Size equation which governs the trade-off between the transaction cost–the cost of sending a batch to the next process (say from development to testing)—-and the holding cost—the cost of not sending it. This is discussed in Donald Reinertsen’s book, The Principles of Product Development Flow (my favourite IT book of 2009)—which contains Figure 1, below (p35, discussion p121):

Figure 1

Continuous delivery: reducing transaction cost

In the mainframe era, the transaction cost of putting out a release was small. But as we moved first to client-server systems, and thence to web-based systems, the transaction costs of testing and releasing software became much higher, providing by far the biggest contribution to total cost. This drove up batch size and resulted in the silos we see today.

But the message of continuous delivery is that we now have the tools, patterns and practices to drive down the transaction cost of releasing a change enormously—to the extent that holding cost is actually a much bigger contribution to the total delivery cost.

Given this, one key rationale for creating organizational silos no longer holds. And since silos lead to lower software quality, lower production stability, and less frequent releases, there are many good reasons to get rid of them, at least for teams working on strategic projects2.

Cross-functional teams: optimizing for throughput

The alternative to silos is cross-functional teams, in which (as Young suggests) we optimize for throughput—or lead-time. By implementing the practices of continuous delivery, we reduce transaction costs to a negligible level. Next, we create very small batches of work by limiting work in process and focus on getting that functionality either released to users, or at least releaseable to users, in the shortest time possible.

Cross-functional teams are not a new idea, but their importance is often under-estimated. Having a cross-functional team is a vital ingredient for reducing lead-time because much of the delay getting functionality released is incurred through communication overhead. When everyone involved in the delivery of a product or service is co-located, developers can call over testers and show them what they’ve been working on so that they can get fast feedback, testers and developers can collaborate on creating functional acceptance tests, developers can discuss proposed schema changes with dbas before they make them, and architectural trade-offs can be discussed with an infrastructure expert before any code gets written.

This not only makes software delivery much faster, but also much more fun. And of course we end up with higher quality software, lower-risk releases, and much faster responsiveness to our customers. These considerations were key in Amazon’s decision to move to a service-oriented architecture with cross-functional teams.

There are a couple of common objections to cross-functional teams. One is that it is prohibited by a control known as segregation of duties when your organization is subject to regulations such as Sarbanes-Oxley and PCI-DSS. In fact, segregation of duties can be implemented more effectively within cross-functional teams, as described in an article I recently co-authored for Cutter IT Journal (registration required).

Annother objection is the additional cost of creating cross-functional teams and implementing continuous delivery. This is true, which means that you only want to incur this extra cost for the part of your service portfolio that is strategic. The additional cost in this case is paid back many times by the benefit of getting to market faster, learning from your users and iterating rapidly, and avoiding building functionality that is not useful (the biggest source of waste in software development).

How should organizations move to a cross-functional model?

If you have a small IT organization, say less than thirty people, you can move to a cross-functional model in one go. With larger organizations, as with all things in continuous delivery, you should proceed iteratively and incrementally.

Start with a pilot product. Make sure you’re measuring total cost and revenue delivered over the lifecycle of the product, in addition to metrics such as cycle time and the incremental value delivered by every new release. The team will need to own the SLA for the service, and crucially should be able to self-service their own infrastructure, including testing and production environments, without having to wait days or weeks for hardware to be provisioned. The team should be given plenty of time to implement continuous delivery, and focus on delivering a minimum viable product and then iterating rapidly in response to real data from users.

Large organizations with distributed teams are actually a great target for moving to a cross-functional model: the key here is simply to ensure that teams are not grouped by role. Nothing kills continuous delivery of high quality software faster than having a development team in one country, a testing team in another, and the operations team in a third.

Finally, teams should focus initially on verifying their business hypothesis by devising the smallest possible minimum viable product to test it, and pivoting in the case of failure.

Thanks to Don Reinertsen for permission to use figures from his book, The Principles of Product Development Flow. Thanks to Don Reinertsen, Pat Kua, Dutch Steutel, Dennise Openshaw, and Chris Hilton for feedback on an earlier draft of this post.

1Colleen M. Young, “Six steps to process-based IT organizational design”, Stamford, CT: Gartner, 2006, via Steve Bell, Enterprise Agility (forthcoming).

2Note that some of the problems of silos can be mitigated through better management, such as ensuring resources within silos are not excessively loaded, and rotating people through different groups to encourage collaboration and understanding.

Continuous Delivery is set text for Agile Engineering Practices course at Oxford University

Dit artikel is afkomstig van een externe website.

I am delighted to report that Continuous Delivery is being used as the set text for the Agile Engineering Practices course, which forms one of the modules for the Software Engineering MSc at Oxford University.

This is especially sweet for me since I did my BA at Oxford. It’s also where I first got into systems administration. I accidentally reformatted my hard drive, and I couldn’t get hold of a copy of Windows. Instead, I popped down to computing services and picked up some new free operating system distribution called RedHat, so I could write my philosophy essays (using emacs, of course).

Thanks to Dr Robert Chatley (@rchatley), who teaches the course, for letting me know. He says he chose Continuous Delivery since it “gave the best motivation for putting all the technical practices together … [it] gave the big picture of what we were trying to do – minimise the cycle time from idea to delivery, and allow that cycle to be repeated frequently and reliably”. He has a write-up of the course here.

He was kind enough to let me reproduce a picture of the students with their course text. I wish them all the best with their future endeavours.

Agile Engineering Practices class at Oxford

On DVCS, continuous integration, and feature branches

Dit artikel is afkomstig van een externe website.

Translations: 中文

I like to say that feature branches are evil in order to get people’s attention. However in reality I lack the determination and confidence to be a zealot. So here is the non-soundbite version.

First, let me say that Mercurial (and more recently Git) has been my workhorse since 2008, and I love distributed version control systems. There are many reasons why I think they represent a huge paradigm shift over existing tools, as discussed in Continuous Delivery (pp393-394). But like all powerful tools, there are many ways you can use them, and not all of them are good. None of my arguments should be construed as attacking DVCS: the practice of feature branching and the use of DVCS are completely orthogonal, and in my opinion, proponents of DVCS do themselves – and the tools – a disservice when they rely on feature branching to sell DVCS.


First a few definitions. Note that some people use these terms in different ways, so you’ll need to temporarily erase any other definitions from your brain or my discussion won’t make much sense.

Continuous Integration is a practice designed to ensure that your software is always working, and that you get comprehensive feedback in a few minutes as to whether any given change to your system has broken it.

Feature branching is a practice whereby people do not merge their code into mainline until the feature they are working on is “complete” (i.e. done, but not done done1).

Mainline is the line of development – on a conventionally designated version control repository – which is the reference from which the builds of your system or project are created that feed into your deployment pipeline. Note that this definition applies perfectly well to DVCS and to open source projects, even on GitHub.

First, let’s dismiss the straw man argument. Every time you use version control you are effectively working on a branch: your working copy. On a DVCS, there’s a further level of indirection, because your local repository is effectively a branch until you push your changes to mainline. I have no problem with creating branches. What I do have a problem with is letting code that you ultimately want to release accumulate on branches.

Here are my observations. When you let large amounts of code accumulate off mainline – code that you ultimately want to release – several bad things happen:

  • The longer you leave it, the harder it becomes to merge, because as other people check in to mainline, mainline diverges from your branch. Really awesome merging tools help with this to some extent, but anyone who has done much programming has experienced situations where the code merged successfully but the application broke. The probability of this happening increases substantially – more than linearly – as the amount of stuff you need to merge, and the time between initial branch and final merge, increases.
  • The more work you do on your branch, the more likely it is you will break the system when you merge into mainline. Everyone has had the experience of getting in the zone and running with what seemed like a genius solution to your problem, only to find hours – or days – later that you need to scrap the whole thing and start again from scratch, or (more subtly and more commonly) that your check-in resulted in unintended consequences or regressions
  • When you have more than a handful of developers working on a codebase and people work on feature branches, it becomes difficult to refactor. If I refactor and check in, and other people have significant amounts of stuff on branches, I make it much harder for them to merge. This is a strong force discouraging me from refactoring. Not enough refactoring equals crappy code.

These problems go away when people regularly merge their work into mainline. Conversely, they become exponentially more painful as the size of your team increases. Furthermore, there’s a vicious circle: the natural reaction to this pain is to merge less often. As I am fond of saying, when something hurts, the solution is to do it more often, and to bring the pain forward. In this case this is achieved by having everyone merge to mainline more frequently.

However, it’s hard to do this if you’re working on a feature that involves a lot of work, or if you’re working on a large-scale change to your system. Here are some solutions.

  1. Break down your stories into smaller chunks of work (sometimes referred to as tasks). I have never yet found a large piece of work that I couldn’t split into smaller chunks – usually less than an hour and almost always less than a day – that got me some way towards my goal but kept the system working and releasable. This involves careful analysis, discussion, thought, and discipline. When I can’t see a way to do something incremental in less than a couple of hours, I try spiking out some ideas2. Crucially though, it means I get essential feedback early on as to whether my proposed solution is going to work, or whether it will have unintended consequences for the rest of the system, interfere with what other people are working on, or introduce regressions (this is the motivation for continuous integration.)
  2. Implement stories in such a way that the user-facing bits are done last. Start with the business logic and the bits further down the stack first. Grow your code using TDD. Check in and merge with mainline regularly. Hook up the UI last3.
  3. Use branch-by-abstraction to make complex or larger scale changes to to your application incrementally while keeping the system working.

How do you know when you’ve got too much unmerged stuff? Here’s a thought experiment. Imagine you’re the maintainer of an open source project, and someone you don’t know just submitted what you have locally on your branch as a patch. Would you merge it? Is the unified diff with mainline more stuff than you can easily keep in your mental stack when you read it? Is your intent sufficiently clear that someone else on your team could understand it in a minute or so without having to ask you? If you can’t answer “yes” to all these questions, then you need to stop working, stash your work, and split it into smaller chunks.

It should be clear that I’m not really attacking feature branches, provided your “features” are sufficiently small. However generally people who use feature branches overwhelmingly fail the test in the last paragraph, which is why it makes for a nice soundbite. Really experienced developers understand the trade-offs that using feature branches involve and have the discipline to use them effectively, but they can still be dangerous – GitHub is littered with forks created by good developers that are unmergeable because they diverged too far from mainline.

The larger point I’m trying to make is this. One of the most important practices that enables early and continuous delivery of valuable software is making sure that your system is always working. The best way for developers to contribute to this goal is by ensuring they minimize the risk that any given change they make to the system will break it. This is achieved by keeping changes small, continuously integrating them into mainline, and making sure there is a comprehensive suite of automated tests to verify that changes behave as expected and don’t introduce any regressions.

What about feature toggles?

See how I haven’t even mentioned feature toggles yet? Feature toggles don’t even come into play unless you have a complete, user-visible feature that you don’t want to appear in your next release. In this situation, the feature-branch alternative is to keep your feature branch unmerged until after your release. Unless you’re doing continuous deployment, or working on a small and experienced team, this is a painful and risky proposition.

However another (perhaps more important) use of feature toggles is to reduce the risk of release, and to increase the resilience of your production systems. The most important part of release planning is working out what to do when things go wrong (this is known as “remediation” in ITIL circles). Re-deploying an old version is usually what people opt for, but having the ability to turn off problem features without rolling back the whole release is a less risky approach. In terms of resilience, an important technique is the ability to gracefully degrade your service under load (see John Allspaw’s 40m talk at USI for a masterful discussion of creating resilient systems). Feature toggles provide an excellent mechanism for doing this.

For people who are skeptical about feature toggles, or interested in finding out more, I highly recommend that you look at Facebook’s video on release management (look for the section on “Gatekeeper”). Sarah Taraporewalla also just wrote an experience report on using feature toggles.

What about cherry picking?

Some people recommend keeping features out of mainline until they’re ready to be released, perhaps keeping them on a development branch that developers check in to, and then cherry-picking them in. However assuming you’re following the guidelines I provide above and your stories are small, the need to take features out is very much the exceptional case, not the normal case.

Furthermore, you then face all the problems that I mention elsewhere of getting from done to done done1 – the pain of integrating, regression testing, performance testing and so forth. With continuous delivery, you completely get rid of any integration or testing phases. In my experience, unless you have a small, experienced team working on a well-factored codebase with plenty of automated tests, these benefits massively outweigh the pain of occasionally having to take a feature out – and feature toggles provide a cheaper alternative if your analysis is done right.

Of course a key assumption here is that your stories are small and don’t spatter stuff all across the UI. I’ll be discussing analysis in the context of continuous delivery in my next blog post.

1 A feature that is dev complete is “done”. A feature that is released is “done done”. One of the axioms of continuous delivery is that much of the pain and risk in releasing software occurs after software is “done”, particularly if your work isn’t sitting on mainline and needs to be merged. Thus “saving” your work on a feature until it is “done” doesn’t really make sense. Some people recommend not merging until after a feature is tested and showcased, but this seriously exacerbates the problems described below without provide much additional benefit, since tested, showcased features are still not “done done” (consider the need to integrate your code and run regression tests, for example).
2 Spiking is the practice of writing some code that you will throw away to test out an idea. The output of a spike is knowledge.
3 I am not trying to imply you shouldn’t prototype the UI early on.