On the Maximum Tolerable Noise of k-Input Gates for Reliable Computation by Formulas

William S. Evans, Member, IEEE, and Leonard J. Schulman

Abstract—We determine the precise threshold of component noise below which formulas composed of odd-degree components can reliably compute all Boolean functions.

Index Terms—Computation by unreliable components, reliable computing.

Manuscript received October 15, 2002; revised May 22, 2003. The work of W. S. Evans was supported in part by the Natural Sciences and Engineering Research Council of Canada. The work of L. J. Schulman was supported in part by the National Science Foundation and the Charles Lee Powell Foundation. W. S. Evans is with the Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada. L. J. Schulman is with the California Institute of Technology, Pasadena, CA 91125 USA. Communicated by G. Battail, Associate Editor At Large. Digital Object Identifier 10.1109/TIT.2003.818405
I. INTRODUCTION
We consider a model of computation that was first proposed by von Neumann in 1952 [1]: the noisy circuit. A noisy circuit is composed of $\varepsilon$-noisy gates. An $\varepsilon$-noisy, $k$-input gate is designed to compute a Boolean function of its $k$ Boolean inputs; however, it has the property that for any assignment to the inputs, there is probability $\varepsilon$ that the output of the gate is the complement of the designed value. In a noisy circuit this event occurs independently at every gate of the circuit.

A circuit takes $n$ Boolean values as input and produces one Boolean output. The inputs to a gate in the circuit may be the outputs of other gates in the circuit, inputs to the circuit, or the constants 0 or 1. The output of the circuit is the output of one of the gates, called the top gate. The interconnection pattern does not allow "feedback." That is, the interconnection structure of the circuit is a directed, acyclic graph: vertices of the graph correspond to gates, and a directed edge from $u$ to $v$ corresponds to gate $v$ taking as input the output of gate $u$. If, in addition, the graph forms a tree (each vertex having one outgoing edge), then we call the circuit a formula.
Clearly, a noisy circuit with $\varepsilon > 0$ cannot deterministically compute a Boolean function $f$; on any input there is probability at least $\varepsilon$ that the top gate will output the complement of $f$. (We assume without loss of generality that $\varepsilon \le 1/2$.) The error probability of a noisy circuit for a Boolean function $f$ is the maximum over all inputs of the probability that the circuit's output differs from the value of the function. If this maximum is at most $\delta$ then the circuit $(1-\delta)$-reliably computes the function.
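To make the model concrete, here is a minimal sketch (ours, not from the correspondence) that computes the exact output distribution of a small noisy formula by recursion; it relies on the fact that disjoint subformulas are independent. The nested-tuple representation and the names `p_one` and `maj3` are purely illustrative.

```python
from itertools import product

def p_one(node, eps):
    """Pr[node outputs 1]. A leaf is a float (its probability of
    being 1); an internal node is (boolean_function, children)."""
    if not isinstance(node, tuple):
        return node
    fn, kids = node
    ps = [p_one(c, eps) for c in kids]
    pv1 = 0.0                                  # Pr[pre-noise value = 1]
    for bits in product((0, 1), repeat=len(ps)):
        w = 1.0
        for b, p in zip(bits, ps):             # subformulas independent
            w *= p if b else 1.0 - p
        pv1 += w * fn(bits)
    return eps + (1.0 - 2.0 * eps) * pv1       # output flips w.p. eps

maj3 = lambda bits: int(sum(bits) >= 2)
eps, x = 0.05, 1.0                             # input bit x = 1
F = (maj3, ((maj3, (x, x, x)),) * 3)           # depth-2 majority formula
print("Pr[F != x] =", 1.0 - p_one(F, eps))     # error on this input
```

Taking the maximum of this quantity over all inputs gives the formula's error probability in the sense just defined.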
Fixing $k$, we are interested in the maximum value of $\varepsilon$ for which it is possible to have reliable computation, which we define as: there is a $\delta < 1/2$ so that for every Boolean function there exists a noisy circuit, using arbitrary $\varepsilon$-noisy, $k$-input gates, that $(1-\delta)$-reliably computes the function. The word reliable in this context does not mean perfectly accurate, but rather that the output of the noisy circuit is biased, by a fixed amount, toward the correct output on every input.
The need for a limit on gate noise in order to achieve reliable computation was first noticed by von Neumann, who showed that for $\varepsilon < 0.0073$, reliable computation is possible using $\varepsilon$-noisy, three-input majority gates. His method was to interleave "computation levels" of the circuit, i.e., levels that correspond to levels of the original (noiseless) circuit, with "error-correction levels," in which three-input majority gates combine the output of three separate copies of each computation, in order to obtain an output that is more likely to be correct than any single copy.
As von Neumann noted, this idea cannot lead to reliable computation if $\varepsilon \ge 1/6$. Consider a particular three-input majority gate. If each of its inputs is incorrect independently with probability $a$, then it in turn will be incorrect with probability

$$(1-\varepsilon)\left(a^3 + 3a^2(1-a)\right) + \varepsilon\left((1-a)^3 + 3a(1-a)^2\right). \tag{1}$$

If $\varepsilon < 1/6$, then this value can be smaller than $a$; this amplification is necessary for von Neumann's argument to work. However, if $\varepsilon \ge 1/6$, then for every $a < 1/2$, the error probability of the output is greater than $a$, i.e., the output of the majority gate is less reliable than its inputs, and von Neumann's method fails.
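The following sketch (ours) evaluates von Neumann's expression (1) on a grid and illustrates the role of $1/6$: below it, the gate can amplify reliability on part of $(0, 1/2)$; at or above it, the output is always less reliable than the inputs.

```python
def maj3_err(eps, a):
    """Expression (1): output error of an eps-noisy 3-input majority
    gate whose inputs are independently incorrect with probability a."""
    wrong = a**3 + 3 * a**2 * (1 - a)          # majority of inputs wrong
    right = (1 - a)**3 + 3 * a * (1 - a)**2    # majority of inputs right
    return (1 - eps) * wrong + eps * right

for eps in (0.10, 1 / 6, 0.20):
    helped = [a for a in (i / 100 for i in range(1, 50))
              if maj3_err(eps, a) < a]
    print(f"eps = {eps:.4f}: error reduced at {len(helped)} of 49 grid points")
# eps = 0.10 helps on part of (0, 1/2); eps >= 1/6 helps nowhere.
```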
This suggested to von Neumann that perhaps reliable computation is not possible by $\varepsilon$-noisy, three-input gates if $\varepsilon \ge 1/6$. The first proof that there is some $\varepsilon < 1/2$ for which reliable computation by noisy components is impossible came in 1988 from Pippenger's work on formula depth bounds [2] (the depth of a circuit is the number of gates on the longest path from an input of the circuit to its output). He proved that if $\varepsilon \ge \frac{1}{2} - \frac{1}{2k}$ then reliable computation by formulas is impossible using $\varepsilon$-noisy, $k$-input gates. Soon after, Feder [3] extended this result to general circuits, proving reliable computation by circuits is impossible if $\varepsilon \ge \frac{1}{2} - \frac{1}{2k}$. Evans and Schulman [4] improved this bound to $\varepsilon \ge \frac{1}{2} - \frac{1}{2\sqrt{k}}$.
The above-mentioned papers developed a certain information-theoretic technique, which yielded both the bounds cited and lower bounds on noisy circuit depth. However, in 1991, Hajek and Weller used a completely different technique to prove a tight threshold for reliable computation by formulas with noisy three-input gates [5], showing that $\varepsilon < 1/6$ allows reliable computation but $\varepsilon \ge 1/6$ forbids it. In this correspondence, we extend the work of Hajek and Weller to prove a tight threshold for reliable computation by formulas using noisy $k$-input gates ($k$ odd). The main result of this correspondence is summarized as follows.
Theorem 1: For $k$ odd and

$$\varepsilon_k = \frac{1}{2} - \frac{2^{k-2}}{k\binom{k-1}{\frac{k-1}{2}}} \tag{2}$$

there exists $\delta < 1/2$ such that all Boolean functions can be $(1-\delta)$-reliably computed by noisy formulas if and only if $\varepsilon < \varepsilon_k$. (Using Stirling's approximation, $\varepsilon_k \approx \frac{1}{2} - \frac{\sqrt{\pi}}{2\sqrt{2k}}$ for large values of $k$.)
Fig. 1 shows how this exact threshold compares to previous bounds
for reliable computation.
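For orientation, the following sketch (ours, in the spirit of Fig. 1) tabulates the exact threshold (2), its Stirling approximation, and the earlier impossibility bounds of [2]–[4] for a few odd $k$.

```python
from math import comb, pi, sqrt

def eps_k(k):                                   # exact threshold (2), k odd
    return 0.5 - 2**(k - 2) / (k * comb(k - 1, (k - 1) // 2))

for k in (3, 5, 7, 9, 21, 101):
    stirling = 0.5 - sqrt(pi) / (2 * sqrt(2 * k))
    print(f"k={k:3d}  eps_k={eps_k(k):.4f}  ~{stirling:.4f}  "
          f"[2],[3]: {0.5 - 1/(2*k):.4f}  [4]: {0.5 - 1/(2*sqrt(k)):.4f}")
# k = 3 gives eps_3 = 1/6, the Hajek-Weller threshold [5].
```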
Fig. 1. Bounds for reliable computation.

II. THRESHOLD VALUE
To calculate the threshold for reliable computation using $k$-input gates, we start by generalizing von Neumann's expression for the error probability of a three-input majority gate (1). Let

$$m_{\varepsilon,k}(a) = (1-\varepsilon)\,\psi_k(\lfloor k/2\rfloor, a) + \varepsilon\left(1 - \psi_k(\lfloor k/2\rfloor, a)\right) \tag{3}$$

where

$$\psi_k(l, a) = \sum_{i=0}^{l} \binom{k}{i} (1-a)^i a^{k-i}.$$

The value $m_{\varepsilon,k}(a)$ is the probability that an $\varepsilon$-noisy, $k$-input majority gate is incorrect given that its inputs are incorrect independently with probability $a$.
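A direct transcription of (3) into code (our sketch; the names `psi` and `m` are ours):

```python
from math import comb

def psi(k, l, a):
    """psi_k(l, a): probability that at most l of the k inputs are
    correct, when each is incorrect independently with probability a."""
    return sum(comb(k, i) * (1 - a)**i * a**(k - i) for i in range(l + 1))

def m(eps, k, a):
    """Error probability (3) of an eps-noisy k-input majority gate."""
    p = psi(k, k // 2, a)                 # pre-noise majority incorrect
    return (1 - eps) * p + eps * (1 - p)

# for k = 3 this agrees with von Neumann's expression (1):
a, eps = 0.25, 0.1
assert abs(m(eps, 3, a) - ((1 - eps) * (a**3 + 3 * a**2 * (1 - a))
                           + eps * ((1 - a)**3 + 3 * a * (1 - a)**2))) < 1e-12
```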
For $k = 3$, if $\varepsilon \ge 1/6$ then $m_{\varepsilon,3}(a) > a$ for all $a \in [0, 1/2)$. This is von Neumann's observation that, if $\varepsilon$ is large, the output of a noisy, three-input majority gate is less reliable than its inputs. In this section, we generalize von Neumann's observation to noisy, $k$-input gates.
Lemma 1: For $k$ odd:

1) if $\varepsilon \ge \varepsilon_k$ then $m_{\varepsilon,k}(a) > a$ for all $a \in [0, 1/2)$;

2) if $\varepsilon < \varepsilon_k$ then there exists $\eta_{\varepsilon,k} \in [0, 1/2)$ such that $m_{\varepsilon,k}(\eta_{\varepsilon,k}) = \eta_{\varepsilon,k}$; and
• if $a < \eta_{\varepsilon,k}$ then $m_{\varepsilon,k}(a) > a$,
• if $a > \eta_{\varepsilon,k}$ then $m_{\varepsilon,k}(a) < a$,

where $\varepsilon_k$ is defined in (2).
See Fig. 2 for an example of $m_{\varepsilon,k}$ when $\varepsilon = \varepsilon_k$ and when $\varepsilon < \varepsilon_k$ (for $k = 3$).

Fig. 2. The error probability of an $\varepsilon$-noisy, three-input majority gate as a function of input error probability, for $\varepsilon = \varepsilon_3$ and for $\varepsilon < \varepsilon_3$.
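A quick numerical illustration (ours) of the second statement: for $\varepsilon < \varepsilon_3 = 1/6$, iterating $m_{\varepsilon,3}$ from $a = 0$ climbs to the fixed point $\eta_{\varepsilon,3}$ and stops there. The helpers repeat the sketch above so the fragment runs on its own.

```python
from math import comb

def psi(k, l, a):
    return sum(comb(k, i) * (1 - a)**i * a**(k - i) for i in range(l + 1))

def m(eps, k, a):
    p = psi(k, k // 2, a)
    return (1 - eps) * p + eps * (1 - p)

k, eps, a = 3, 0.10, 0.0          # eps < eps_3 = 1/6; start error-free
for _ in range(200):              # m(a) > a below the fixed point
    a = m(eps, k, a)
print(f"eta ~= {a:.6f}; residual |m(a) - a| = {abs(m(eps, k, a) - a):.2e}")
```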
Proof: We first prove that for $k$ odd, $m_{0,k}(a) \le a$ and $m_{\varepsilon_k,k}(a) > a$ for $a \in [0, 1/2)$. This and the linearity of $m_{\varepsilon,k}(a)$ in $\varepsilon$ prove the first statement in the lemma.

$m_{0,k}(a)$ is the probability that the noiseless majority of $k$ inputs is incorrect given that each input is incorrect with probability $a$. To show $m_{0,k}(a) \le a$, we prove the inequality

$$m_{0,k}(a) = \sum_{i=0}^{\lfloor k/2\rfloor} \binom{k}{i} (1-a)^i a^{k-i} \le a \sum_{i=0}^{k} \binom{k}{i} (1-a)^i a^{k-i} = a$$

which for $k$ odd is equivalent to

$$\sum_{i=0}^{\lfloor k/2\rfloor} \binom{k}{i} (1-a)^i a^{k-i} \le a \sum_{i=0}^{\lfloor k/2\rfloor} \binom{k}{i} \left((1-a)^i a^{k-i} + (1-a)^{k-i} a^i\right).$$

This inequality holds term by term since if $a = 0$, all terms are zero, and otherwise, dividing the $i$th term of the right-hand side by the $i$th term on the left gives

$$a\left(1 + \left(\frac{1-a}{a}\right)^{k-2i}\right) \ge a\left(1 + \frac{1-a}{a}\right) = 1.$$
To prove $m_{\varepsilon_k,k}(a) > a$ for $a \in [0, 1/2)$, write $a = (1-\gamma)/2$ and let

$$f_k(\varepsilon, \gamma) = m_{\varepsilon,k}\!\left(\frac{1-\gamma}{2}\right) - \frac{1-\gamma}{2}$$

(i.e., $f_k(\varepsilon, \gamma)$ is the difference between the error probability of the output and the error probability of the inputs). Since for $k$ odd $f_k(\varepsilon, 0) = 0$ (in particular, $f_k(\varepsilon_k, 0) = 0$), it suffices for the first statement in the lemma to prove that $f_k(\varepsilon_k, \gamma)$ is an increasing function of $\gamma \in (0, 1]$ (i.e., $df_k(\varepsilon_k, \gamma)/d\gamma > 0$):

$$\frac{df_k(\varepsilon, \gamma)}{d\gamma} = \frac{1}{2} + (1 - 2\varepsilon)\,\frac{d}{d\gamma}\,\psi_k\!\left(\lfloor k/2\rfloor, \frac{1-\gamma}{2}\right). \tag{4}$$
Since

$$\begin{aligned}
\frac{d}{d\gamma}\,\psi_k\!\left(\lfloor k/2\rfloor, \frac{1-\gamma}{2}\right)
&= \frac{d}{d\gamma} \sum_{i=0}^{\lfloor k/2\rfloor} \binom{k}{i} \left(\frac{1+\gamma}{2}\right)^{i} \left(\frac{1-\gamma}{2}\right)^{k-i} \\
&= -\sum_{i=0}^{\lfloor k/2\rfloor} \binom{k}{i} \frac{k-i}{2} \left(\frac{1+\gamma}{2}\right)^{i} \left(\frac{1-\gamma}{2}\right)^{k-i-1}
 + \sum_{i=0}^{\lfloor k/2\rfloor} \binom{k}{i} \frac{i}{2} \left(\frac{1+\gamma}{2}\right)^{i-1} \left(\frac{1-\gamma}{2}\right)^{k-i} \\
&= -\sum_{i=0}^{\lfloor k/2\rfloor} \frac{k}{2} \binom{k-1}{i} \left(\frac{1+\gamma}{2}\right)^{i} \left(\frac{1-\gamma}{2}\right)^{k-i-1}
 + \sum_{i=0}^{\lfloor k/2\rfloor - 1} \frac{k}{2} \binom{k-1}{i} \left(\frac{1+\gamma}{2}\right)^{i} \left(\frac{1-\gamma}{2}\right)^{k-i-1} \\
&= -\frac{k}{2^k} \binom{k-1}{\lfloor k/2\rfloor} (1-\gamma^2)^{\lfloor k/2\rfloor}
\end{aligned}$$

substituting into (4) yields

$$\frac{df_k(\varepsilon, \gamma)}{d\gamma} = \frac{1}{2} - (1 - 2\varepsilon)\,\frac{k}{2^k} \binom{k-1}{\lfloor k/2\rfloor} (1-\gamma^2)^{\lfloor k/2\rfloor}. \tag{5}$$

Setting $\varepsilon = \varepsilon_k$ gives

$$\frac{df_k(\varepsilon_k, \gamma)}{d\gamma} = \frac{1}{2}\left(1 - (1-\gamma^2)^{\lfloor k/2\rfloor}\right)$$

which is positive for $\gamma \in (0, 1]$.
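Assuming sympy is available, the following check (ours) confirms identity (5) at $\varepsilon = \varepsilon_k$ symbolically for small odd $k$: the derivative of $f_k(\varepsilon_k, \gamma)$ should equal $\frac{1}{2}(1 - (1-\gamma^2)^{(k-1)/2})$.

```python
import sympy as sp

gam = sp.symbols('gamma')
for k in (3, 5, 7):
    a = (1 - gam) / 2
    psi = sum(sp.binomial(k, i) * (1 - a)**i * a**(k - i)
              for i in range(k // 2 + 1))
    eps = sp.Rational(1, 2) - sp.Integer(2)**(k - 2) / (k * sp.binomial(k - 1, (k - 1) // 2))
    f = (1 - eps) * psi + eps * (1 - psi) - a          # f_k(eps_k, gamma)
    rhs = (1 - (1 - gam**2)**((k - 1) // 2)) / 2       # claimed derivative
    assert sp.expand(sp.diff(f, gam) - rhs) == 0, k
print("identity (5) at eps = eps_k checked symbolically for k = 3, 5, 7")
```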
We now prove the second statement of the lemma. The second derivative of $f_k(\varepsilon, \gamma)$ is

$$\frac{d^2 f_k(\varepsilon, \gamma)}{d\gamma^2} = (1 - 2\varepsilon)\,\frac{k(k-1)}{2^k} \binom{k-1}{\lfloor k/2\rfloor}\,\gamma\,(1-\gamma^2)^{\lfloor k/2\rfloor - 1}$$

which is nonnegative for all $\gamma \in [0, 1]$ and $\varepsilon \in [0, 1/2]$. Thus, $f_k(\varepsilon, \gamma)$ is convex in $\gamma \in [0, 1]$. Since $f_k(\varepsilon, 0) = 0$ for $k$ odd and $f_k(\varepsilon, 1) = \varepsilon$, the convexity of $f_k(\varepsilon, \gamma)$ will imply the lemma if we can prove $df_k(\varepsilon, \gamma)/d\gamma < 0$ at $\gamma = 0$. By (5), for $\varepsilon < \varepsilon_k$

$$\frac{df_k(\varepsilon, \gamma)}{d\gamma} \le \frac{1}{2}\left(1 - (1-\gamma^2)^{\lfloor k/2\rfloor}\right)$$

with equality if and only if $\gamma = 1$, and thus, at $\gamma = 0$, $df_k(\varepsilon, \gamma)/d\gamma < 0$. (By convexity, $f_k(\varepsilon, \cdot)$ then has a unique zero $\gamma^* \in (0, 1)$, is negative on $(0, \gamma^*)$, and is positive on $(\gamma^*, 1]$; taking $\eta_{\varepsilon,k} = (1-\gamma^*)/2$ gives the fixed point and both bullets of the second statement.)
III. NEGATIVE RESULT
Suppose we simply wish to "remember" an input bit for $L$ computation steps; that is, to design a noisy circuit of depth $L$ with one input $x$ whose output is $x$ with high probability. This is a prerequisite for computing nontrivial functions of many variables. If the computation components are $\varepsilon$-noisy, $k$-input gates, the obvious method is to take the majority of $k$ independent copies of the best circuit for remembering the input bit for $L-1$ steps. This construction results in a depth-$L$ formula of majority gates that has error probability $m^{(L)}_{\varepsilon,k}(0)$, where $m^{(L)}_{\varepsilon,k}$ is the $L$-fold composition of $m_{\varepsilon,k}$. By Lemma 1, if $\varepsilon \ge \varepsilon_k$, this technique will not work for arbitrarily large $L$. In fact, for $k$ odd,

$$\text{if } \varepsilon \ge \varepsilon_k \text{ then } \lim_{L \to \infty} m^{(L)}_{\varepsilon,k}(0) = 1/2.$$

This is the intuitive reason why $\varepsilon_k$ (which is derived from the behavior of noisy, $k$-input majority gates) is the noise threshold for computation using arbitrary noisy $k$-input gates.
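The sketch below (ours) iterates the composition $m^{(L)}_{\varepsilon,k}(0)$ for $k = 5$, where $\varepsilon_5 = 7/30$. Below the threshold the iterates stall at $\eta_{\varepsilon,5} < 1/2$; at or above it they climb toward $1/2$ (slowly at the threshold itself), so the stored bit washes out.

```python
from math import comb

def psi(k, l, a):
    return sum(comb(k, i) * (1 - a)**i * a**(k - i) for i in range(l + 1))

def m(eps, k, a):
    p = psi(k, k // 2, a)
    return (1 - eps) * p + eps * (1 - p)

k = 5
eps5 = 0.5 - 2**(k - 2) / (k * comb(k - 1, (k - 1) // 2))   # = 7/30
for eps in (0.9 * eps5, eps5, 1.1 * eps5):
    a = 0.0
    for _ in range(10_000):                                  # L = 10000
        a = m(eps, k, a)
    print(f"eps = {eps:.4f}: m^(10000)(0) = {a:.6f}")
```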
To derive a precise statement from this intuition, we prove that if $\varepsilon \ge \varepsilon_k$ then, for any fixed $\delta < 1/2$, there are Boolean functions that cannot be computed by formulas with error probability $\delta$. In particular, Theorem 2 implies that for sufficiently large $n$, no function that depends on $n$ variables (a function depends on an argument if there exists a setting of the other arguments such that the function restricted to that setting is not a constant) can be computed with error probability $\delta$.

Theorem 2: For $k$ odd, if $\varepsilon \ge \varepsilon_k$ then any formula using $\varepsilon$-noisy, $k$-input gates for computing a Boolean function that depends on at least $k^{L-1} + 1$ variables errs with probability at least $m^{(L)}_{\varepsilon,k}(0)$ on some input.

Note that this implies reliable computation is impossible if $\varepsilon \ge \varepsilon_k$ (since $\lim_{L \to \infty} m^{(L)}_{\varepsilon,k}(0) = 1/2$).
Proof: Let $f$ be a Boolean function that depends on at least $k^{L-1} + 1$ variables. Let $F$ be a formula for $f$ composed of $\varepsilon$-noisy, $k$-input gates. Since $f$ depends on $k^{L-1} + 1$ variables, there exists some variable $x$ that is an input only to gates at layers $\ge L$ in $F$ (the layer of a gate is the number of gates on the path from its input to the output of the formula). Thus, any path from the input $x$ to the output of the formula must pass through at least $L$ gates. Fix the inputs other than $x$ so that either $f = x$ or $f = 1 - x$; without loss of generality say $f = x$. Let $F_x$ be the formula $F$ after the inputs other than $x$ have been fixed as above. Consider the two conditional probabilities

$$\Pr[F_x = 1 \mid x = 0] \qquad \text{and} \qquad \Pr[F_x = 0 \mid x = 1].$$
The maximum of these two quantities is a lower bound on the error probability of $F$.

Following Hajek and Weller, one may view these conditional probabilities geometrically as the point $(\Pr[F_x = 1 \mid x = 0],\ \Pr[F_x = 0 \mid x = 1])$ in the unit square. In general, if $Y$ is a Boolean random variable jointly distributed with $x$, let

$$\pi_Y = (\pi_Y^0, \pi_Y^1) = (\Pr[Y = 1 \mid x = 0],\ \Pr[Y = 0 \mid x = 1]).$$

For example, the $\varepsilon$-noisy, $k$-input majority gate with all inputs equal to $x$ produces an output $Y$ described by the point $\pi_Y = (m_{\varepsilon,k}(0), m_{\varepsilon,k}(0)) = (\varepsilon, \varepsilon)$. In this case, the probability that $Y$ differs from $x$ is $\varepsilon$.
The gate whose output is $F_x$ (the top gate in the formula) does not receive $x$ directly as input. The value of $x$ must pass through at least $L-1$ noisy gates to reach this top gate. Each gate adds noise to the value of $x$, but the computation performed by the gate may compensate for this noise.

We show that if $\varepsilon \ge \varepsilon_k$ then each gate cannot compensate for the added noise. In fact, the space of points $\pi_Y$, describing possible distributions at the gate's output, contracts as we pass $x$ through more and more noisy gates. In particular, let $S(a)$ be the convex hull of the points $\{(0,1), (1,0), (a,a), (1-a,1-a)\}$. We prove (Lemma 2) that if the inputs to an $\varepsilon$-noisy, $k$-input gate are described by points in $S(a)$, then the output must lie in $S(m_{\varepsilon,k}(a))$; see Fig. 3.

Fig. 3. Contraction of $S(a)$ (light gray) to $S(m_{\varepsilon,k}(a))$ (dark gray) caused by one noisy gate.
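Before the formal proof, here is a brute-force check (ours) for $k = 3$ and $\varepsilon = \varepsilon_3 = 1/6$: every one of the 256 three-input gates, fed inputs at the vertices of $S(a)$ (which suffices, by the convexity argument of Lemma 3 below), produces an output point inside $S(m_{\varepsilon,3}(a))$. The membership test uses the observation that the affine map $(x, y) \mapsto ((1-b)x + by,\ bx + (1-b)y)$ carries $S(b)$ onto the square $[b, 1-b]^2$.

```python
from itertools import product

eps, a = 1 / 6, 0.2                            # eps = eps_3, a in [0, 1/2]
m_a = (1 - eps) * (a**3 + 3 * a**2 * (1 - a)) \
    + eps * ((1 - a)**3 + 3 * a * (1 - a)**2)  # m_{eps,3}(a)

def in_S(pt, b, tol=1e-9):
    """(x, y) in S(b) iff ((1-b)x + by, bx + (1-b)y) in [b, 1-b]^2."""
    x, y = pt
    u, v = (1 - b) * x + b * y, b * x + (1 - b) * y
    return b - tol <= min(u, v) and max(u, v) <= 1 - b + tol

def pr_vec(vec, pis, x):
    """Pr[(Y_1, Y_2, Y_3) = vec | x]; inputs independent given x."""
    p = 1.0
    for bit, (p0, p1) in zip(vec, pis):
        q1 = p0 if x == 0 else 1.0 - p1        # Pr[Y_i = 1 | x]
        p *= q1 if bit else 1.0 - q1
    return p

verts = [(0.0, 1.0), (1.0, 0.0), (a, a), (1 - a, 1 - a)]
vecs = list(product((0, 1), repeat=3))
for table in product((0, 1), repeat=8):        # all 256 gates g
    g = dict(zip(vecs, table))
    for pis in product(verts, repeat=3):       # vertex input points
        pv0 = sum(pr_vec(v, pis, 0) for v in vecs if g[v] == 1)
        pv1 = sum(pr_vec(v, pis, 1) for v in vecs if g[v] == 0)
        piY = (eps + (1 - 2 * eps) * pv0, eps + (1 - 2 * eps) * pv1)
        assert in_S(piY, m_a), (table, pis)
print("all 256 gates keep the output inside S(m_{eps,3}(a))")
```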
Using this lemma, we prove by induction on $L$ that the point describing the output of $F_x$ lies within $S(m^{(L)}_{\varepsilon,k}(0))$. This establishes the theorem since any random variable $Y$ whose point lies in $S(a)$ differs from $x$ with probability at least $a$. Thus, the error probability of $F_x$ is at least $m^{(L)}_{\varepsilon,k}(0)$.

For $L = 1$, the formula consists of at least one gate. The points describing inputs to the top gate of the formula $F_x$ lie within $S(0)$ (trivially) and thus, by Lemma 2, the point describing the output lies within $S(m_{\varepsilon,k}(0))$.

For $L > 1$, the formula consists of a top gate with at most $k$ inputs. Each of these inputs is either constant with respect to $x$ or the output of a formula in which $x$ is an input to gates at layers $\ge L - 1$. In the first case, the point describing the input lies within $S(a)$ for all $a$. In the second, the point describing the input lies in $S(m^{(L-1)}_{\varepsilon,k}(0))$ by induction. Thus, by Lemma 2, the point describing the output lies within $S(m^{(L)}_{\varepsilon,k}(0))$.
IV. CONTRACTION OF S(a)
Lemma 2: If $\varepsilon \ge \varepsilon_k$ and $\pi_{Y_1}, \pi_{Y_2}, \ldots, \pi_{Y_k} \in S(a)$ with $a \in [0, 1/2]$, then for all $\varepsilon$-noisy, $k$-input gates $g$ with inputs $Y_1, Y_2, \ldots, Y_k$ and output $Y$, we have $\pi_Y \in S(m_{\varepsilon,k}(a))$.
Proof: We will prove in Lemmas 3 and 4 that we may assume that $\pi_{Y_i} = (a,a)$ for all $i$ and that $g$ is an $\varepsilon$-noisy, $k$-input threshold gate. A $k$-input threshold gate outputs 1 if and only if the number of inputs equal to 1 is at least $t$. The threshold $t$ is an integer between 0 and $k$ inclusive.

We prove that the output $Y$ of the gate $g$ has $\pi_Y \in S(m_{\varepsilon,k}(a))$. By symmetry, we need only consider those threshold gates $g$ with threshold $t \ge \lceil k/2\rceil$. We will prove that $\pi_Y$ lies within the convex hull of the vertices $\{(0,1), (1/2,1/2), (m_{\varepsilon,k}(a), m_{\varepsilon,k}(a))\}$. Since $t \ge \lceil k/2\rceil$, $\pi_Y^0 \le \pi_Y^1$. Also, $a \le 1-a$ implies $\pi_Y^0 + \pi_Y^1 \le 1$. Thus, we need only prove that

$$\pi_Y^0 + m_{\varepsilon,k}(a)\,(\pi_Y^1 - \pi_Y^0) \ge m_{\varepsilon,k}(a) \tag{6}$$

when $\varepsilon \ge \varepsilon_k$.
Let $V$ be the pre-noise output of gate $g$. (An $\varepsilon$-noisy gate computes a Boolean function of its inputs that is then complemented with probability $\varepsilon$ to become the gate's output; the value computed by the gate prior to the probabilistic change is the gate's pre-noise output.) That is, $\pi_Y^b = \varepsilon + (1-2\varepsilon)\pi_V^b$ for $b \in \{0,1\}$. Then (6) becomes

$$\pi_V^0 + m_{\varepsilon,k}(a)\,(\pi_V^1 - \pi_V^0) \ge \psi_k(\lfloor k/2\rfloor, a).$$

Since $g$ is a $k$-input, threshold-$t$ gate

$$\pi_V^0 = \psi_k(k-t, a) \qquad \text{and} \qquad \pi_V^1 = \psi_k(t-1, a).$$

Since $\varepsilon \ge \varepsilon_k$ implies $m_{\varepsilon,k}(a) > a$ (and $\pi_V^1 \ge \pi_V^0$, as $t-1 \ge k-t$), it suffices to prove

$$\pi_V^0 + a\,(\pi_V^1 - \pi_V^0) \ge \psi_k(\lfloor k/2\rfloor, a)$$

or

$$a\,(\pi_V^1 - \psi_k(\lfloor k/2\rfloor, a)) \ge (1-a)\,(\psi_k(\lfloor k/2\rfloor, a) - \pi_V^0).$$

Substituting the values of $\psi_k(\lfloor k/2\rfloor, a)$, $\pi_V^0$, and $\pi_V^1$ yields (if $t = \lceil k/2\rceil$, both sides of the inequality are zero)

$$a\left(\psi_k(t-1, a) - \psi_k(\lfloor k/2\rfloor, a)\right) \ge (1-a)\left(\psi_k(\lfloor k/2\rfloor, a) - \psi_k(k-t, a)\right).$$

After expanding each side as a summation, the inequality holds term by term since $a \in [0, 1/2]$ and $i \ge \lceil k/2\rceil$ imply

$$a\binom{k}{i}(1-a)^i a^{k-i} \ge (1-a)\binom{k}{k-i}(1-a)^{k-i} a^i.$$
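A numerical spot check (ours) of this final inequality, over a grid of $a \in [0, 1/2]$, small odd $k$, and every threshold $t \ge \lceil k/2\rceil$ (for $t = \lceil k/2\rceil$ both sides vanish, matching the remark above):

```python
from math import comb

def psi(k, l, a):
    return sum(comb(k, i) * (1 - a)**i * a**(k - i) for i in range(l + 1))

for k in (3, 5, 7, 9):
    for t in range((k + 1) // 2, k + 1):
        for j in range(51):
            a = j / 100
            lhs = a * (psi(k, t - 1, a) - psi(k, k // 2, a))
            rhs = (1 - a) * (psi(k, k // 2, a) - psi(k, k - t, a))
            assert lhs >= rhs - 1e-12, (k, t, a)
print("final inequality of Lemma 2 holds on the grid")
```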
V. REDUCTION LEMMAS
The preceding proof relies on two lemmas that are rather straightforward extensions of similar lemmas for $k = 3$ given by Hajek and Weller [5].

An $\varepsilon$-noisy, $k$-input gate $g$ takes as input $Y_1, \ldots, Y_k$, described by points $\pi_{Y_1}, \ldots, \pi_{Y_k}$, and outputs $Y$, described by $\pi_Y$. Thus, the gate $g$ defines a mapping $g : [0,1]^{2k} \to [0,1]^2$. Lemma 2 states that if $\varepsilon \ge \varepsilon_k$, then the union over all $g$ of $g(S(a)^k)$ is contained in $S(m_{\varepsilon,k}(a))$. The purpose of the following two lemmas is to show that it suffices to prove that the union over all threshold gates $g$ of $g((a,a)^k)$ is contained in $S(m_{\varepsilon,k}(a))$. (Note: $(a,a)^k$ is the point $(a,a), (a,a), \ldots, (a,a)$ in $[0,1]^{2k}$.) The method is to show that the set of image points has the same convex hull in both cases. Thus, since $S(m_{\varepsilon,k}(a))$ is convex, showing containment of either set implies containment of the other.
Lemma 3: If $C$ is the convex hull of the union over all $g$ of $g(S(a)^k)$, and $C_a$ is the convex hull of the union over all $g$ of $g((a,a)^k)$, then $C = C_a$.
Proof: The mapping from $S(a)^k$ to $[0,1]^2$ defined by $g$ is affine, $[0,1]^2 \to [0,1]^2$, in each $\pi_{Y_i}$ when the others are fixed. Thus, the image of $S(a)^k$ is contained in the convex hull of the image of the set of vertices of $S(a)^k$. Each vertex is of the form $(\pi_{Y_1}, \pi_{Y_2}, \ldots, \pi_{Y_k})$ with

$$\pi_{Y_i} \in \{(1,0), (0,1), (a,a), (1-a, 1-a)\}.$$

If $\pi_{Y_i} \in \{(1,0), (0,1)\}$, then the same value of $\pi_Y$ can be obtained with $\pi_{Y_i} = (a,a)$ by modifying the gate $g$ to ignore the value of $Y_i$. Similarly, if $\pi_{Y_i} = (1-a, 1-a)$, then the same value of $\pi_Y$ can be obtained with $\pi_{Y_i} = (a,a)$ by modifying the gate $g$ to negate input $Y_i$. The lemma follows.
Lemma 4: If $C_a$ is the convex hull of the union over all $g$ of $g((a,a)^k)$, and $C_{a,t}$ is the convex hull of the union over threshold gates $g$ of $g((a,a)^k)$, then $C_a = C_{a,t}$.
Proof: Note that $\pi_{Y_i} = (a,a)$ for all $i$. To establish the lemma, it suffices to prove that for any constants $r$ and $s$, $r\pi_Y^0 + s\pi_Y^1$ is minimized when $g$ is some threshold function.

Again, let $V$ be the pre-noise output of gate $g$, so $\pi_Y^b = \varepsilon + (1-2\varepsilon)\pi_V^b$ for $b \in \{0,1\}$. Thus, to minimize $r\pi_Y^0 + s\pi_Y^1$, we minimize $r\pi_V^0 + s\pi_V^1$, where

$$\pi_V^0 = \sum_{(Y_1, Y_2, \ldots, Y_k) \in S_1} \Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 0]$$

$$\pi_V^1 = \sum_{(Y_1, Y_2, \ldots, Y_k) \in S_0} \Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 1]$$

and $S_b$ is the set of $k$-bit vectors representing inputs for which $V = b$. A gate $g$ that minimizes $r\pi_V^0 + s\pi_V^1$ has $(Y_1, Y_2, \ldots, Y_k) \in S_1$ if and only if

$$r\Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 0] < s\Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 1].$$

From the fact that $\pi_{Y_i} = (a,a)$

$$\Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 0] = a^t (1-a)^{k-t}$$

$$\Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 1] = a^{k-t} (1-a)^t$$

where $t$ is the number of ones in the vector $(Y_1, Y_2, \ldots, Y_k)$. Thus, the relation

$$r\Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 0] < s\Pr[(Y_1, Y_2, \ldots, Y_k) \mid x = 1]$$

holds monotonically in $t$, and the lemma follows.
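To see the monotonicity concretely, the sketch below (ours) draws random constants $r, s$ and marks which weights $t$ the minimizing gate would place in $S_1$, using the two displayed probabilities. The indicator is always monotone in $t$: an up-set (a threshold gate) or a down-set (which acts as a threshold gate on the complemented inputs).

```python
import random

k, a = 5, 0.3
random.seed(1)
for _ in range(1000):
    r, s = random.uniform(-1, 1), random.uniform(-1, 1)
    # S_1 membership of a vector with t ones, per the displayed relation:
    dec = [r * a**t * (1 - a)**(k - t) < s * a**(k - t) * (1 - a)**t
           for t in range(k + 1)]
    assert dec == sorted(dec) or dec == sorted(dec, reverse=True)
print("minimizing gate is always a threshold in the number of ones")
```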
VI. POSITIVE RESULT
The preceding section shows that the ability of an $\varepsilon$-noisy, $k$-input majority gate to decrease error probability is necessary for reliable computation using $k$-input gates. (Put another way, reliable computation is not possible unless a bit can be maintained indefinitely in a formula by repeatedly taking majorities.) In this section, we show that this is also a sufficient condition.

For $\varepsilon < \varepsilon_k$, we prove that there exists $\delta < 1/2$ such that, given any Boolean function, we can construct a formula using $\varepsilon$-noisy, $k$-input gates that $(1-\delta)$-reliably computes the function. One obvious idea is to use von Neumann's technique of taking the noisy majority of $k$ independent copies of a computation in order to decrease the error probability. This process can be repeated to decrease the error probability still further, but there is a limit. It will decrease the error probability if