There’s no getting away from it: quality is a whole-team responsibility. If you’re aiming for Continuous Delivery, you’ll recognise “Build quality in” as one of its core principles.

If you’ve heard of lean development, then you will no doubt have heard of the Toyota Production System principle of “building quality in”, here applied to software.

It’s inevitable then, that there will be a tension between filling that “QA” role and building a team of “T-Shaped people” who treat quality as a first class citizen.

Put quality first

“Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.” - Dr. W. Edwards Deming

Quality is not something we should just tack on the end; we should “shift left” and address it early. Unfortunately, the end is often exactly where we decide support is needed: making sure the developers have met the acceptance criteria and haven’t introduced any bugs. You have to ask, why?

To get new features to market quickly, we often trade off quality for higher velocity. This is a sensible and rational decision. But at some point, the complexity of our systems becomes a limiting factor on our ability to deliver new work, and we hit a brick wall.

When exploring, there is a tension between the need to experiment by building MVPs, and building at high levels of quality through practices such as test automation.

Test automation is still controversial in some organisations, but it is impossible to achieve short lead times and high-quality releases without it.

“The paradox is that when managers focus on productivity, long-term improvements are rarely made. On the other hand, when managers focus on quality, productivity improves continuously” - John Seddon, inventor of the Vanguard Method

Statistical analysis revealed that when engineering teams held themselves accountable for the quality of their code through peer review, lead times and release frequency improved considerably with negligible impact on system stability.

“The difficulty in defining quality is to translate future needs of the user into measurable characteristics, so that a product can be designed and turned out to give satisfaction at a price the user will pay”. - Walter Shewhart, known as the father of statistical quality control

Werner Vogels, CTO of Amazon, says, “You build it, you run it”. This, along with the rule that all service interfaces are designed to be externalizable, has some important consequences. As Vogels points out, this way of organizing teams “brings developers into contact with the day-to-day operation of their software.

It also brings them into day-to-day contact with the customer. This customer feedback loop is essential for improving the quality of the service.”

Influence the culture

NUMMI was a broken organisation that was reformed under a new leadership and management paradigm. Despite rehiring the same people, NUMMI achieved extraordinary levels of quality and productivity and reduced costs. In an article for MIT Sloan Management Review, John Shook, Toyota City’s first US employee, reflected on how that cultural change was achieved:

What my NUMMI experience taught me that was so powerful was that the way to change culture is not to first change how people think, but instead to start by changing how people behave—what they do.

Those of us trying to change our organizations’ culture need to define the things we want to do, the ways we want to behave and want each other to behave, to provide training and then to do what is necessary to reinforce those behaviors. The culture will change as a result… What changed the culture at NUMMI wasn’t an abstract notion of “employee involvement” or “a learning organization” or even “culture” at all.

What changed the culture was giving employees the means by which they could successfully do their jobs. It was communicating clearly to employees what their jobs were and providing the training and tools to enable them to perform those jobs successfully.

As we do with functional and performance quality, we build evidence of compliance into our daily work so we don’t have to resort to large batch inspections after most of the work has been done.

Make it OK to fail

There are people who are really good at acceptance testing and exploratory testing, who have a keen eye for issues and aesthetics that developers often simply lack, and who often come at a piece of work from a different angle. A person in a QA role is a good sounding board and a valuable member of the team, and can help improve automated testing.

However, it can mean that someone in a QA role is at risk of becoming a bottleneck, a gate. Adding more capacity to the bottleneck isn’t the solution here. There are many ways you can improve throughput without adding more resources to a bottleneck.

A QA should not be a blocker to getting work released, otherwise they are a gate and gates become bottlenecks. The question is, can the system survive without a QA present?

It has to be OK to fail. It’s easy to recognise that developers don’t like to fail or even get things wrong. But if we always rely on a QA to qualify work, someone who will catch the issues before they go out, then as developers we will never learn. Instead, developers need to take ownership of their work, end to end. If it goes wrong, it should be on them to fix it, nobody else.

Sure, I’m not suggesting you adopt Facebook’s “move fast and break things” mantra, but you can make it safe to fail so long as you learn. It’s OK to fail, so long as we continuously improve and quality goes up. It’s easy to get stuck on quality at the last mile, which often means we don’t stop to figure out how to improve our processes upstream.

Root cause analysis is a great example of the type of learning that can be done after a failure. What we find is that the people involved did the right thing the right way and evaluated the risks at the start, but they still failed. The problems were in quality control, that is, in the process. Therefore it’s the process that needs improvement, not the people.

Behaviours not roles

It’s no surprise that by hiring someone to fill a QA role, the process may not improve and lead time can actually get worse. That’s why it’s key to hire people who will continue to improve processes and lead time so that quality can improve.

My view is that the better the lead time, the faster we can learn, the faster we can deal with problems, the faster we can get better and fix the process.

Not hiring a QA is a contrarian view but it’s something that every team needs to consider. That doesn’t mean fire your QA, no. It means you need to have people who can do the work you need to do, not just tick the “QA” box. There need to be “quality advocates” on your team, remembering that quality is a whole team responsibility.

If you need to invest in test automation, then it’s a good idea to hire people who have that skill set or invest in your existing team and train your QA to do more than just manual testing.

“If you can’t measure it, you can’t improve it” – Peter Drucker.

If you’re trying to decide whether to hire a QA or not, it’s a good idea to agree on a definitive way to measure success, quality and lead time before hiring as that will tell you if it’s working or not.
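To make “a definitive way to measure” a little more concrete, here is a minimal sketch in Python. It assumes you can export one record per released change with a commit time, a release time and a flag for whether it caused a failure. The `Change` class, its field names, the sample data and the choice of median lead time and change failure rate as the two measures are all illustrative assumptions, not something prescribed by this article.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Change:
    # Hypothetical record shape; adapt to whatever your tooling exports.
    committed_at: datetime  # when the work was committed
    released_at: datetime   # when it reached production
    caused_failure: bool    # whether it needed a fix or rollback after release


def median_lead_time_hours(changes):
    """Median time from commit to release, in hours."""
    return median(
        (c.released_at - c.committed_at).total_seconds() / 3600
        for c in changes
    )


def change_failure_rate(changes):
    """Fraction of released changes that caused a failure."""
    return sum(c.caused_failure for c in changes) / len(changes)


# Made-up sample data, purely for illustration.
sample_changes = [
    Change(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 17), False),  # 8 hours
    Change(datetime(2024, 1, 2, 9), datetime(2024, 1, 3, 9), True),    # 24 hours
    Change(datetime(2024, 1, 4, 9), datetime(2024, 1, 4, 13), False),  # 4 hours
    Change(datetime(2024, 1, 5, 9), datetime(2024, 1, 5, 21), False),  # 12 hours
]

if __name__ == "__main__":
    print("Median lead time (hours):", median_lead_time_hours(sample_changes))
    print("Change failure rate:", change_failure_rate(sample_changes))
```

Capturing the same two numbers before and after a hire (or any other process change) gives you a baseline to compare against, which is the point of agreeing the measure up front.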


Here’s some further reading on the matters discussed in this article.