The invisible, intangible world of software production may seem far removed from the nuts and bolts and physical labour of building cars. But while out for a run, listening to the excellent ‘This American Life’ podcast episode on the NUMMI production plant, it occurred to me that there are in fact some striking similarities.
Cars: GM and Toyota
In the 1980s, Toyota and General Motors came together in a joint venture to produce a small car.
General Motors were required to produce a small car due to US emissions standards at the time. Producing these cars had always been an issue for GM; the resulting cars were unreliable and the model was never profitable.
Toyota, at the same time, were looking for a way into the US car market. The US heavily restricted foreign car imports, and a factory located within the US would allow Toyota to build and sell cars there.
GM were at the time the world's largest car manufacturer, with an ethos built around producing the highest possible quantity of cars and fixing production issues with individual cars after they had been fully assembled.
Toyota, in contrast, employed a Japanese production ethos that emphasised quality over quantity, fixing issues found on the production line there and then, and producing cars of the highest possible quality.
And so in 1983 the joint venture, NUMMI, was struck. A factory in Fremont, California, which had shut down the previous year, was selected.
Prior to NUMMI, the GM Fremont factory was rife with absenteeism and there was friction between management and line workers. Unions could shut the production line at a moment's notice. Absenteeism became so bad that people had to be pulled in from the streets and bars to fill in just to keep the production line going. Ultimately, the plant was shut down, with the entire workforce laid off.
The Fremont plant was to be run as a Japanese plant. The exact same union leadership was hired to run the NUMMI plant; in fact, 85% of the previous workforce were re-hired. Groups of 30 employees were flown to Japan to be trained for 2 weeks at a Toyota plant in the Japanese method of manufacturing. It was here that the differences between GM and Toyota manufacturing practices really came to light.
The key to the Toyota production method was teamwork. Teams of 4-5 people would work for 4-5 hours on a given part of the production line and then rotate onto another area to prevent monotony. At any time a supervisor was ready to help or relieve staff if any issues arose.
If an issue arose whilst fitting a part and a worker started to lag behind on the target time, help would be offered by a supervisor and, crucially, the worker would afterwards be asked if they had any ideas to help prevent the issue from happening again. The response to those ideas was fast: a modified tool could be produced within hours to help fit a part to a car. In fact, employees were given a bonus for suggesting improvements to the process.
The production line could be stopped at any time. A cord running along the production line, within reach of all the workers, could be pulled in the event of a problem: the andon cord. This would alert a supervisor to assist a worker. If the issue could not be fixed, the production line would be stopped until it was.
If an issue is not fixed on the production line, it has to be fixed afterwards, after the rest of the car is built. If that issue is deep down under 5 layers of bonnet, chassis and engine, it’s going to be difficult to rectify. It’s likely the person fixing the issue does not have the same expertise as the original fitter, which can introduce even further problems.
This is all in stark contrast to the environment in which the Fremont plant was run before NUMMI. The production line could never stop; stopping it was a sackable offence. If an issue was encountered, it would have to be fixed after production, in the car park.
Cars would roll off the production line without steering wheels, with engines put in backwards. In fact cars would need to be towed off the production line.
If a worker fell behind their target time, they would be chastised rather than assisted. This of course led to many production faults being added to cars along the line.
The emphasis was on volume. Issues would be fixed afterwards.
From the outset, the NUMMI plant had one of the lowest defect rates for vehicles produced in the US. After 3 months, cars rolled off the line with a near-perfect quality rating. Grievances and skiving plummeted, and workers enjoyed their work.
Over time, all the small improvements made at the Toyota plant accrue into a very efficient, successful production process. Trust between management and employees flourishes, and employees feel the benefit.
But what about software?
So that’s all good, but how does this relate to software? In fact, many production methodologies used in software today originate in the car industry, notably the kanban system and the notion of continuous improvement.
What do we do in software to emphasise quality over quantity? As with building cars, quality is an investment you make in a product up-front, and the investment that you make pays off over time.
We test our software. We test our software manually, and we test our software with software. What this means is we write automated tests.
Before we write the code to add a feature, we write a test. We start with a failing test, and then we write code to provide that feature, and the test passes.
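To make the test-first idea concrete, here is a minimal sketch in Python using the standard `unittest` module. The feature (`basket_total_pence`) and its discount rule are hypothetical, invented purely for illustration. The test class comes first in the file, as it would in time: it fails until the function beneath it exists and behaves as the tests expect.

```python
import unittest


class BasketTotalTest(unittest.TestCase):
    """Written before the feature; these fail until basket_total_pence works."""

    def test_total_without_discount(self):
        self.assertEqual(basket_total_pence([1000, 550]), 1550)

    def test_total_with_discount(self):
        self.assertEqual(basket_total_pence([10000], discount_percent=10), 9000)


def basket_total_pence(prices_pence, discount_percent=0):
    """Hypothetical feature: total a basket in pence, with an optional discount."""
    subtotal = sum(prices_pence)
    # Integer arithmetic keeps the result in whole pence.
    return subtotal * (100 - discount_percent) // 100


def run_tests():
    """Run the tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BasketTotalTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Once the feature is in place, `run_tests().wasSuccessful()` returns `True`, and the tests stay with the code for as long as the feature does.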
At the highest level, the test we write opens a web browser and actually starts to click the buttons and fill in the forms of the web application we are building, checking that what happens when we do this is what we want to happen. That test then lives as long as the feature that it tests does.
Every time a new feature is added, all of the tests are run. This is how we know that the new feature we have added works, and it has not broken any other features. We spend the time to write the test once, and the investment in that time is paid back during the life of the software.
What I’ve described above is known as an ‘end to end’ test. The entire system is built and run in a test mode. The automated tests we write perform the exact same actions as a user sitting in front of a computer accessing a feature, and they exercise the whole path from the end user through to wherever the information the user types in ends up.
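A browser-driven test needs a browser automation tool, so as a rough stand-in the sketch below uses only Python's standard library. It starts a tiny web application on a local port, submits a form over HTTP much as a browser would, and lets us check both the page that comes back and where the data ended up. The sign-up form, the `SignupHandler` class and the `SAVED_NAMES` list (standing in for a database) are all hypothetical.

```python
import http.client
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer

SAVED_NAMES = []  # stands in for the database where submitted data ends up


class SignupHandler(BaseHTTPRequestHandler):
    """A deliberately tiny web application with one form endpoint."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        name = form.get("name", [""])[0]
        SAVED_NAMES.append(name)
        body = f"Welcome, {name}!".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


def submit_signup_form(name):
    """Start the whole app, submit the form, and return the page text."""
    server = HTTPServer(("127.0.0.1", 0), SignupHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = http.client.HTTPConnection(
            "127.0.0.1", server.server_port, timeout=5
        )
        data = urllib.parse.urlencode({"name": name})
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        connection.request("POST", "/signup", body=data, headers=headers)
        page = connection.getresponse().read().decode("utf-8")
        connection.close()
        return page
    finally:
        server.shutdown()
        server.server_close()
```

An end-to-end check then calls `submit_signup_form("Ada")` and asserts on the returned page and on the last entry in `SAVED_NAMES`, covering both ends of the journey in one test.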
What if my new feature breaks a test? We have a zero defect policy. What does this mean? If a defect is introduced, at least one team member will stop what they are currently working on and fix it.
Typically this will be the worker that introduced the defect, as they will have the expertise in the area of the defect.
It’s fixed immediately because the area of the code where the defect lies will be fresh in the mind of the developer; they will know how it works, how it should work, and how to fix it. I struggle to remember what I had for my dinner yesterday! Remembering the complex interactions of a piece of software I wrote a month ago is near to impossible, so I’d have to re-discover how it works again.
This is a further example of investing early in the quality of software. If the defect introduced is not seen as critical, it can be tempting to leave it whilst new functionality is added. This of course would mean it costs more time and money to rectify later on.
Ok, that’s great, tests are run, but where and how?
Typically we will use a system to build our software and run the tests automatically, whenever a new feature is introduced - a continuous integration server. You can see what projects are being built, and the results of the tests by accessing a web page.
A small excerpt, but you can see the projects listed, and with no defects :) If any of the projects failed to build, or had test failures they would be listed with a red icon, and somebody would be working on a fix!
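What a continuous integration server does on each change can be sketched very roughly as below. This is an illustration under stated assumptions, not how any particular CI product works: the project names and the two checks are made up, and a real server would also fetch the code, build it and publish the results on a web page, all of which is left out here.

```python
import io
import unittest


def checkout_adds_up():
    # A hypothetical check that passes.
    assert 2 + 2 == 4


def search_finds_item():
    # A hypothetical check with a deliberate defect, so it fails.
    assert "apple" in []


def build_status(checks):
    """Run every check for one project and report 'green' or 'red'."""
    suite = unittest.TestSuite(unittest.FunctionTestCase(check) for check in checks)
    # Send the runner's report to an in-memory buffer; we only want the verdict.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return "green" if runner.run(suite).wasSuccessful() else "red"


def dashboard(projects):
    """Map each project name to its current build status."""
    return {name: build_status(checks) for name, checks in projects.items()}
```

Calling `dashboard` with a dictionary of project names and their checks gives the kind of green/red summary that the web page shows, with a single failing check enough to turn a project red.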
To ensure the state of the projects we are working on is visible, we also output this information on screens visible inside our offices, known as information radiators.
If a project turns from green to red, a sad trombone plays. If a project turns from red to green, a happy vuvuzela plays.
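The rule for when to play a sound is a simple comparison of each project's previous and current status. Here is a minimal sketch of that logic; the function names and the sound identifiers are hypothetical, and actually playing audio is left out.

```python
def sound_for_transition(previous, current):
    """Return the sound for a status change, or None if nothing changed."""
    if previous == "green" and current == "red":
        return "sad_trombone"
    if previous == "red" and current == "green":
        return "happy_vuvuzela"
    return None


def sounds_to_play(previous_statuses, current_statuses):
    """Compare two dashboard snapshots and list the sounds due per project."""
    sounds = {}
    for project, current in current_statuses.items():
        # A project not seen before has no previous status, so stays silent.
        sound = sound_for_transition(previous_statuses.get(project), current)
        if sound is not None:
            sounds[project] = sound
    return sounds
```

Only a change of colour makes a noise: a project that stays green, or stays red, is silent, so the sound always signals something new for the team to react to.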
Green screens mean a happy team and a content client.
Supporting your team
Just as the supervisor supports a team on the production line, a technical lead on each team supports the developers who make up the team, ready to step in and assist with issues both during the production of code and with running systems. Trust is established between the developers and the team lead, and problems are solved through collaboration and cooperation.
We use the same notion of continuous improvement as that of Toyota. If a team member spots something that could be improved, they are empowered to do so. This has led to many improvements to our software development process, and will continue to do so. An easy example of this is of course our information radiator screens.
Wrapping it up
So these are a few of the main ways we ensure our software practices have an emphasis on quality.
- Quality software will ultimately have fewer defects, leading to happy users.
- Quality software will be underpinned by tests.
- Software underpinned by tests will be easier to change and adapt over time.
- Software that can easily change and adapt will last longer, and the up-front investment will be well rewarded.