# The essential content of the Threshold Theorem for Fault-Tolerant Quantum Computation

The belief that quantum computation is possible in practice is founded strongly on a set of theorems that are generally called ‘the’ Threshold Theorem for Fault-Tolerant Quantum Computation. Though there is non-trivial variation between threshold theorems based on differing assumptions about the errors that will affect the quantum computer, the essential content of ‘the’ theorem is that errors can be efficiently corrected if they are small enough.

‘Smallness’ is a vague term. Different performance metrics may disagree about whether one type of error is truly ‘larger’ than another, so discussions of the smallness of errors can easily descend into meaningless apples-versus-oranges comparisons. The distinction between performance metrics is nonetheless tangential to the essential content of the Threshold Theorem, because the theorem guarantees only existence. Indeed, the actual threshold value depends not only on the choice of performance metric but also on the nature of the errors that can affect the quantum computer and on the choice of universal gate set. To the best of my knowledge, no reliable technique for estimating a threshold has ever been established.

My purpose is to explain the ‘essential content’ of ‘the’ Threshold Theorem. By ‘essential content’, I mean those aspects of the theorem which justify a belief that quantum computing is indeed possible in practice. I divide this essential content into two categories: promises
about the noise affecting a device and the guarantee that a given quantum circuit can be altered so as to be robust against the promised noise model. This essay discusses only the second point; the first is rather more complicated. I may discuss noise in a follow-up essay.

For the sake of definiteness, I focus on a specific (though somewhat rough) formulation of the theorem given by Aharonov and Ben-Or in 2008:

> There exists a threshold η₀ > 0 such that the following holds. Let ε > 0. If Q is a quantum circuit operating on n input qubits for t time steps using s two- and one-qubit gates, there exists a quantum circuit Q’ with depth, size, and width overheads which are polylogarithmic in n, s, t, and 1/ε such that, in the presence of local noise of error rate η < η₀, Q’ computes a function which is within total variation distance ε from the function computed by Q.

Thus a given function can be computed to within a specified total variation distance with polylogarithmic overhead in the presence of “local noise” with a small enough error rate. The guarantee of the Threshold Theorem is that “small enough” is non-zero. More formally, the theorem guarantees that a quantum circuit Q’ can be found for every quantum circuit Q such
that Q and Q’ are ε-equivalent and the overhead of Q’ with respect to Q is polylogarithmic.

To explain the meaning of ε-equivalence, I must first point out that a quantum circuit is a set of instructions for operating on a quantum register. Although Q and Q’ are distinct sets of instructions, they are considered approximately equal if they execute almost the same function on the input. The nearness of the two functions can be measured in a variety of ways, but ε refers to a particular metric on the space of functions that Aharonov and Ben-Or call the ‘total variation distance’. That usage appears to be a minor imprecision in the paper.

Aharonov and Ben-Or define total variation distance as a metric on the space of probability distributions or, more generally¹, on the space of density matrices. This definition does not apply directly to functions. My understanding is that Q and Q’ compute the functions f and f’, respectively, on a register whose state is represented by x. Thus f(x) and f’(x) are states of the output register that can be represented as density matrices. The total variation distance between f(x) and f’(x) is well defined, but the total variation distance between f and f’ was never clearly stated as far as I could tell (though I may have missed something). When Aharonov and Ben-Or refer to the total variation distance between f and f’, they seem to mean the maximum over x of the total variation distance between f(x) and f’(x). This is a perfectly reasonable metric on the space of functions over a finite domain, but it can easily be conflated with the total variation of a function, which is an inequivalent concept.
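As a toy illustration of this max-over-inputs metric, here is a minimal Python sketch. The two-outcome output distributions standing in for f(x) and f’(x) are made up for the example; nothing here is taken from the paper.

```python
def tv_distance(p, q):
    # Total variation distance between two probability distributions
    # given as equal-length sequences: half the L1 distance.
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def function_distance(f, g):
    # Distance between 'functions': the maximum over inputs x of the
    # total variation distance between the output distributions.
    return max(tv_distance(f[x], g[x]) for x in f)

# Hypothetical output distributions over two measurement outcomes.
f_ideal = {0: [1.0, 0.0], 1: [0.0, 1.0]}
f_noisy = {0: [0.9, 0.1], 1: [0.05, 0.95]}

print(function_distance(f_ideal, f_noisy))  # worst-case deviation, here ~0.1
```

If the noisy circuit were ε-equivalent to the ideal one, `function_distance` would be bounded by ε for every input, which is exactly the guarantee the theorem provides.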

Thus ε expresses the maximum possible deviation between the output of Q and the output of Q’. The Threshold Theorem is then guaranteeing that a given circuit Q can be cost-efficiently approximated by a circuit Q’ to within ε total variation distance.

What does it mean for Q’ to be cost-efficient? I first need to explain the meaning of cost for a quantum circuit, which is analogous to the cost of a classical circuit. A circuit is a labelled directed acyclic graph, each of whose nodes represents a logic gate. The size of a circuit is the number of nodes of the graph, the depth of the circuit is the length of the longest path between an input and an output node, and the width of a circuit is (roughly) the maximum number of gates to be executed simultaneously. The theorem then states that the cost of Q’ is polylogarithmic (i.e., a polynomial of the logarithms of the variables) in the cost of Q as well as in 1/ε.
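These cost measures can be computed directly from the graph. Below is a small sketch with an invented two-gate circuit (the gate names and wiring are purely illustrative); width is omitted, since it requires scheduling information beyond the bare DAG.

```python
# A toy circuit as a directed acyclic graph: each node is a gate (or an
# input/output port) and each edge carries a qubit between nodes.
circuit = {                     # node -> list of successor nodes
    "in0": ["H"],
    "in1": ["CNOT"],
    "H": ["CNOT"],
    "CNOT": ["out"],
    "out": [],
}

def size(graph):
    # Size: the number of nodes in the graph.
    return len(graph)

def depth(graph):
    # Depth: the number of nodes on the longest path, found by memoised
    # recursion (which terminates because the graph is acyclic).
    memo = {}
    def longest(v):
        if v not in memo:
            memo[v] = 1 + max((longest(w) for w in graph[v]), default=0)
        return memo[v]
    return max(longest(v) for v in graph)

print(size(circuit), depth(circuit))  # 5 4
```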

Based on the above discussion, I can summarize much of the content of the Threshold Theorem as follows: for every Q and every ε > 0, a circuit Q’ can be found that is cost-efficient and approximates Q to within total variation distance ε in the presence of small local noise.

Though I am avoiding a full discussion of “small local noise”, I at least point out that local noise is noise that affects each gate independently. This locality assumption, together with a guarantee that the effect of the noise is small enough, suffices to ensure that Q’ operates correctly even in the presence of such noise. Not only are Q’ and Q ε-equivalent, but Q’ subjected to noise of the promised form is ε-equivalent to Q. Furthermore, the amount of tolerable noise is independent of the target total variation distance ε.
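Although the theorem itself only asserts existence, the standard intuition for why an ε-independent threshold with polylogarithmic overhead is plausible comes from code concatenation: if one level of encoding maps an error rate p to roughly p²/η₀, then below η₀ the logical error rate falls doubly exponentially in the number of levels. A rough numeric sketch, using an illustrative threshold value that is not taken from Aharonov and Ben-Or:

```python
eta0 = 1e-4   # illustrative threshold value, not from the paper
eta = 1e-5    # physical error rate, safely below the threshold

def logical_rate(levels):
    # If each level of concatenation maps p -> p**2 / eta0, then after
    # k levels the rate is eta0 * (eta / eta0) ** (2 ** k).
    return eta0 * (eta / eta0) ** (2 ** levels)

def levels_needed(target):
    # Concatenation levels required to push the logical error rate
    # below the target (finite because eta < eta0).
    k = 0
    while logical_rate(k) > target:
        k += 1
    return k

# Each additional level squares the suppression factor, so the number
# of levels grows only doubly-logarithmically in 1/target.
for target in (1e-7, 1e-15, 1e-30):
    print(target, levels_needed(target))
```

Since each concatenation level multiplies the circuit size by a constant, polylogarithmically few levels translate into the polylogarithmic overhead the theorem promises; and the condition η < η₀ makes no reference to ε, matching the ε-independence noted above.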

¹ The relationship between the definition of total variation distance used by Aharonov and Ben-Or and other measures of distinguishability was established by Fuchs and van de Graaf, though they call it the “Kolmogorov distance”.

## Comments

Nice posting. As an engineer, I feel like the theory is pretty well developed, but we need more work on the implementation, including optimization of syndrome (stabilizer) extraction circuits, and better ways to optimize for differing error rates. How do we match to a specific architecture and workload, to achieve a particular, desirable logical gate error rate?

Thanks for the question! The question is as difficult as it is important, so I'll try to do it justice.

I absolutely agree that there is a great deal of work left to be done on the implementation side of the matter. For this reason, there cannot yet be a single answer to the question of matching architecture and workload to observed logical error rate: we simply do not know enough about the technology that will underlie any working quantum computer. From the perspective of fault-tolerance thresholds, I would say that the biggest challenge is to produce strong but correct assumptions about the form of logical errors that affect real devices. Though I did not make this point in my blog post, such assumptions bear a strong relationship with the relevant threshold value. Another point I did not make is that the threshold value also depends on the choice of code, which restricts the set of possible fault-tolerant circuits Q' that can be produced given a target circuit Q.

From what I have seen of your research (e.g. this PDF), I guess that you agree that the problem of establishing performance targets for actual devices is architecture-dependent. I would add to this story that computer architectures are based on models of computing and that logical error rates are derived ultimately from the underlying model. In my opinion, the job of the scientist (as opposed to the job of an engineer like yourself) is to establish the properties of this underlying model. The real value of the Threshold Theorem is, to me, that it places important demands on the kind of model on top of which a fault-tolerant architecture can be built. In other words, I think the real importance of the Threshold Theorem is that it establishes the possibility of fault-tolerance if noise is well-behaved enough. Whereas this is an important implication, it is of limited (but non-zero) value for engineers.

Although I focussed in this blog post on the mathematical definition of thresholds, I think it is misguided to treat the Threshold Theorem as an indicator of a performance target of any kind. It is that thorny first category of noise promises that is the important message of the Threshold Theorem -- in my opinion, of course! I think it is work like yours that will establish the proper performance targets for real devices, not the abstract considerations of 'the' Threshold Theorem.
