Photo by Sebastian Ahmed

How to “fail fast and fail often”

Sebastian Ahmed

--

Introduction

In this article I describe how this seemingly provocative concept is actually a template for up-leveling development to attain high velocity and efficient use of people-time.

Although this concept can be applied to any phase of development, I focus on the front-end phases which define architectures and the direction of development, because these phases have the most impact on downstream teams. I will show that although this is a powerful paradigm, the requisite capabilities are not trivial and need to be grown and developed.

Not really Failure

If Thomas Edison “failed” 10,000 times so that he could “succeed” once in taming incandescence, it would suggest that even a great inventor with copious amounts of grit is “failing” most of the time. The reality is that none of these were true failures. They were just steps in the process of his relatively expedient two-year journey.

If such failures are merely steps in a sequence to reach success, then what constitutes a true failure?

True failure is the determination that a result did not meet a desired outcome within a required timeframe.

The determination of failure is made by a beholder, which could be you, your manager, a customer, the market, or even society. Had Edison taken another year to invent a production-quality bulb, and had no competitor beaten him to it, Edison would still have succeeded. Success is thus time-bound.

The phrase “fail fast and fail often” is of course an idiom that suggests a way of innovating and developing which, ironically, intends to avert true failure altogether, because timely success is still the goal. Failure, in the idiomatic sense, simply constitutes a measurable outcome.

Failing Fast … and efficiently

Failing fast is really about the ability to explore a design space by iteration in an efficient and sufficiently expedient manner.

“Sufficiently expedient” is a function of how often one fails in relation to the time-bound constraint. Edison was an incredibly fast iterator: he had his own glass-blowing shed and a steady stream of suppliers around the world who systematically sent him materials to test as candidate filaments.

Efficiency is about doing something with little waste and few resources. This is where the Edison analogies end, because

a reliable measurable outcome should require a fraction of the total time and effort required to manifest the product.

The “product” here constitutes your deliverable, not necessarily the final end-product.

Teams which fail the fastest have the following capabilities:

  • Experienced optimists who open more doors than they close. This is the first line of defense and opportunity, where most of the failures should occur. Highly experienced innovators have effective “brain simulators” that are just as capable of seeing an opportunity as identifying paths laden with peril. An experienced optimist might say things like “yes, we tried something like that five years ago. It failed, but some assumptions have changed, so go ahead and see where it takes you”.
  • Abstract models and prototyping environments which provide useful insights, quickly. These are usually developed and simulated by a small team. This is the second line of defense, and many failures can and should occur in this loop. A good example is the use of virtual platforms and high-level models for embedded systems.
  • Front-loaded “canaries in the coal mine” during development which can predict undesired outcomes early in the process. This is the last line of defense, and may require resetting the development loop back into the prototyping loop. These “canaries” usually require a “shift-left” methodology where initial designs can be vetted earlier in the development phase. An example “canary” could involve test-driven development, as sketched after this list.
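
To make that last example concrete, here is a minimal sketch of such a “canary” in Python (the CRC-8 routine and all names are hypothetical, chosen only for illustration): a known-answer test that exists before the implementation is trusted, so a wrong design direction fails immediately, in front of a small team rather than a downstream one.

```python
# A minimal test-driven "canary": the known-answer test exists before the
# implementation is trusted, so a wrong design direction fails here, early
# and cheaply. The CRC-8 example is hypothetical.
import unittest

def crc8(data: bytes) -> int:
    """Candidate implementation under iteration (CRC-8, polynomial 0x07)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

class Crc8Canary(unittest.TestCase):
    def test_known_answer_vector(self):
        # Published check value for CRC-8 (poly 0x07, init 0x00) is 0xF4.
        # A failure here is an early, cheap signal that the design is wrong.
        self.assertEqual(crc8(b"123456789"), 0xF4)

if __name__ == "__main__":
    unittest.main()
```

The value of the canary is not the test framework; it is that the measurable outcome arrives at the cheapest possible point in the flow.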

It should be noted that

nothing here requires rushing, reducing quality, cutting corners, removing features, thrashing around, or working your team to burnout.

It is about working “smart”: up-leveling the capture and simulation of design-intent while dealing with well-understood randomness up-front. This is no simple feat, and requires building a team that can attain this level of capability. It is thus an investment.

Failing Often … but with direction

Minimizing mean-time-to-failure alone is not sufficient, however:

  • The cumulative time of failures must be within the desired timeframe to avoid a true failure.
  • The path of failures towards the successful outcome must be self-guiding with an overarching directive which can manifest itself in the early iterations.

The first point is rather obvious; the second is less so, and worth illustrating.

To illustrate this point, consider machine learning, and more specifically neural networks, perhaps the most ubiquitous machine-learning structures today. A common type of neural network is a classifier, which is trained with examples. Consider a classifier for images of categories such as “dog”, “cat”, “plane” and so on. How does such a network get trained to perform classifications reliably, even on new data it has never seen before?

The training of a neural network involves systematically steering a “direction” towards the objective of correctness. Throughout the training sequence, the optimizer is “failing” towards the solution: at each step it measures the gradient of the error with respect to the network parameters and moves the parameters in the direction which reduces that error. This is important because, in theory, training a neural network by blind search would be an incredibly exhaustive exercise even for the simplest network. The optimizer essentially has a guiding compass that tells it whether it is moving towards the objective or not. This compass is an abstraction (and a noisy one at that), and that is ok.
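
As a toy sketch of that compass (plain gradient descent on a linear model, standing in for full neural-network training; the data and numbers are made up), note how every iteration produces both a measurable “failure” (the loss) and a direction for the next attempt (the gradient):

```python
import numpy as np

# Toy illustration of the "guiding compass" (assumed setup, not from the
# article): each iteration yields a measured "failure" (nonzero loss)
# *and* a direction (the gradient) for the next iteration.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # inputs
true_w = np.array([1.5, -2.0, 0.5])            # parameters we hope to recover
y = X @ true_w + 0.1 * rng.normal(size=100)    # noisy observations

w = np.zeros(3)        # initial (wrong) guess
lr = 0.1               # step size
for step in range(50):
    err = X @ w - y                    # how wrong we are: the "failure"
    loss = float((err ** 2).mean())    # a measurable outcome...
    grad = 2.0 * (X.T @ err) / len(y)  # ...plus a direction: the "compass"
    w -= lr * grad                     # step towards the objective

print(w)  # close to true_w: fifty "failures" that converged to success
```

Without the gradient, the same loop would be reduced to guessing parameters at random, and fifty iterations would get it essentially nowhere.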

Models, abstractions and early “canaries in the coal mine” are all somewhat noisy, random processes. They do not provide a definitive measure of success, and so for these early iterations to be indicative of true failure (or success) further downstream,

these models must provide not only whether the outcome is a pass or a fail, but also a direction to take for the next iteration.

Without such a compass, you will run out of tries, because you can only fail so often on any given project before reaching “true failure”. What constitutes such a compass depends on the problem at hand, but telemetry, and intelligent processing thereof, is a generally useful pattern to draw inspiration from.
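
The same pattern applies outside of machine learning. As a hedged sketch (the FIFO-depth parameter and the latency telemetry below are hypothetical), comparing two neighboring measurements turns raw telemetry into a direction, so a small, fixed budget of trials is sufficient:

```python
# Hypothetical example: tuning a single design parameter against measured
# telemetry. Each trial returns not just a number but a *direction* for the
# next trial, keeping the search within a small iteration budget.

def measure_latency_us(depth: int) -> float:
    """Stand-in for real telemetry: latency vs. FIFO depth (hypothetical)."""
    return (depth - 48) ** 2 / 10 + 5.0  # unknown to the tuner: best at 48

lo, hi = 1, 256
for trial in range(10):                  # small, fixed iteration budget
    mid = (lo + hi) // 2
    left = measure_latency_us(mid - 1)
    right = measure_latency_us(mid + 1)
    if left < right:                     # telemetry gradient points left...
        hi = mid
    else:                                # ...or right
        lo = mid

print(f"chosen depth: {(lo + hi) // 2}")  # ~48 after ten guided trials
```

The design choice that matters is the comparison: each trial is allowed to “fail”, but it must leave behind a signed hint about where to try next.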

Succeeding fast and succeeding often

Failing fast and failing often requires capability and purpose, in both people and the development process, with a bias towards early prototyping which is fast, provides a sufficiently reliable measurable outcome, and exposes the churn to fewer people. Fewer people and shorter periods of time mean efficiency.

Each iteration should provide new information which removes randomness from the process and yields a direction, or at least a hint, for the next iteration, thus minimizing the number of iterations required. Once established, such machinery can enable a team to experience a high velocity and frequency of successful outcomes, and a reduction in exposing larger downstream teams to paths of peril.

--

Sebastian Ahmed

Technology Leader | Silicon Architect | Programmer | Cyclist | Photographer