arXiv:1210.1810v2 [quant-ph] 25 Nov 2012
Fully device independent quantum key distribution
Umesh Vazirani
∗
Thomas Vidick
†
Abstract
The laws of quantum mechanics allow unconditionally secure
key distribution protocols. Neverthe-
less, security proofs of traditional quantum key distribut
ion (QKD) protocols rely on a crucial assump-
tion, the trustworthiness of the quantum devices used in the
protocol. In device-independent QKD, even
this last assumption is relaxed: the devices used in the prot
ocol may have been adversarially prepared,
and there is no a priori guarantee that they perform accordin
g to specification. Proving security in this
setting had been a central open problem in quantum cryptogra
phy.
We give the first device-independent proof of security of a pr
otocol for quantum key distribution that
guarantees the extraction of a linear amount of key even when
the devices are subject to a constant rate of
noise. Our only assumptions are that the laboratories in whi
ch each party holds his or her own device are
spatially isolated, and that both devices, as well as the eav
esdropper, are bound by the laws of quantum
mechanics. All previous proofs of security relied either on
the use of many independent pairs of devices,
or on the absence of noise.
1 Introduction
Quantum key distribution [BB84, Eke91] together with its pr
oof of security [May01, SP00] appeared to
have achieved the holy grail of cryptography — unconditiona
l security, or a scheme whose security was
based solely on the laws of physics. However, practical impl
ementations of QKD protocols necessarily
involve imperfect devices [BBB
+
92, MHH
+
97], and it was soon realized that these imperfections could
be
exploited by a malicious eavesdropper to break the “uncondi
tional” security of QKD (see e.g. [SK09] for a
review).
Mayers and Yao [MY98] put forth a vision for restoring uncond
itional security in the presence of im-
perfect or even maliciously designed devices, by subjectin
g them to tests that they fail unless they behave
consistently with “honest” devices. The fundamental chall
enge they introduced was of
device-independent
quantum key distribution
(DIQKD): establishing the security of a QKD protocol based o
nly on the validity
of quantum mechanics, the physical isolation of the devices
and the passing of certain statistical tests. The
germ of the idea for device-independence may already be seen
in Ekert’s original entanglement-based pro-
tocol for QKD [Eke91], and was made more explicit by Barrett,
Hardy, and Kent [BHK05], who showed
how to generate a single random bit secure against any non-si
gnalling eavesdropper. A long line of re-
search on DIQKD seeks to make the qualitative argument from [
BHK05] quantitative, devising protocols
that extract an amount of key that is linear in the number of us
es of the devices, and is secure against in-
creasingly general eavesdropping strategies. Initial wor
ks [AGM06, AMP06, SGB
+
06] give efficient and
∗
Department of Computer Science, UC Berkeley, California. S
upported by ARO Grant W911NF-09-1-0440 and NSF Grant
CCF-0905626. Email
vazirani@eecs.berkeley.edu
†
Computer Science and Artificial Intelligence Laboratory, M
assachusetts Institute of Technology. Supported by NSF Gra
nt
0844626. Email
vidick@csail.mit.edu
.
1
noise-tolerant protocols that are secure against individu
al attacks by non-signalling eavesdroppers. Sub-
sequent work [MRC
+
09, Mas09] and [HRW10] also proved security against collect
ive attacks. Other
works [ABG
+
07, PAB
+
09, MRC
+
09, HR10, MPA11] obtain better key rates under the stronger a
ssumption
that the eavesdropper is bound by the laws of quantum mechani
cs. All these results, however, could only be
established under restrictive
independence
assumptions on the devices, e.g. in recent work [HR10, MPA11
]
a proof of security based on collected statistics requires t
hat the
n
uses of each device are causally indepen-
dent: measurements performed at successive steps of the pro
tocol commute with each other.
Very recently two papers [BCK12b, RUV12] announced proofs o
f security of DIQKD without requir-
ing any independence assumption between the different uses
of the devices. Unfortunately, although the
approaches in [BCK12b, RUV12] are very different both impli
ed protocols are polynomially inefficient and
unable to tolerate noisy devices. The protocol used in [BCK1
2b] is very similar to the one originally intro-
duced in [BHK05], and requires a large number of uses of a pair
of noise-free devices in order to generate
a single bit of key. In the case of [RUV12], DIQKD is obtained a
s a corollary of very strong testing that
allows the shared quantum state and operators of the two untr
usted devices to be completely characterized.
It is an open question whether such strong testing can be achi
eved in a manner that is robust to noise.
A major issue in QKD is dealing with the noise inherent in even
the best devices. Indeed, a good
DIQKD protocol should differentiate devices that are “hone
st but noisy” from devices that may attempt to
take advantage of the protocol’s necessary noise tolerance
in order to leak information to an eavesdropper
by introducing correlations in their “errors” [BCK12a]. Th
e protocols in [BCK12b, RUV12] do not achieve
this, since they cannot tolerate any constant noise rate. Th
is raises the question: is device-independent QKD
even
possible
without independence assumptions in a realistic, noise-to
lerant scenario?
1.1 Results
We answer this question in the affirmative by giving the first c
omplete device-independent proof of security
of quantum key distribution that tolerates a constant noise
rate and guarantees the generation of a linear
amount of key. Our only assumption on the devices is that they
can be modeled by the laws of quantum
mechanics, and that they are spatially isolated from each ot
her and from any adversary’s laboratory. In
particular, we emphasize that the devices may have quantum m
emory. While the proof of security is quite
non-trivial (it builds upon ideas from the work on certifiabl
e randomness generation mentioned below), the
actual protocol whose device independence properties we es
tablish is quite simple. It is a small variant of
Ekert’s entanglement-based protocol [Eke91].
In the protocol, the users Alice and Bob make
m
successive uses of their respective devices. At each
step, Alice (resp. Bob) privately chooses a random input
x
i
∈ {
0, 1, 2
}
(resp.
y
i
∈ {
0, 1
}
) for her device,
collecting an output bit
a
i
(resp.
b
i
). If the devices were honestly implemented they would share
Bell states
|
ψ
i
=
1/
√
2
|
00
i
+
1/
√
2
|
11
i
, and measure their qubits according to the following strate
gy: if
x
i
=
0
measure in the computational basis, if
x
i
=
1
measure in the Hadamard basis and if
x
i
=
2
measure in the
3
π
/8
-rotated basis. If
y
i
=
0
measure in the
π
/8
-rotated basis and if
y
i
=
1
measure in the
3
π
/8
-rotated
basis.
To test the devices, after the
m
steps have been completed, the users select a random subset
B
⊆
{
1, . . . ,
m
}
of size
|
B
|
=
γ
m
, where
γ
>
0
is a small constant, and publicly announce their inputs and
outputs in
B
. Rounds in
B
will be called “Bell rounds”. Let
z
i
=
1
if and only if
a
i
6
=
2
and
a
i
⊕
b
i
6
=
x
i
∧
y
i
,
or
(
a
i
,
b
i
) = (
2, 1
)
and
a
i
6
=
b
i
. The users jointly compute the noise rate
η
:
= (
1/
|
B
|
)
∑
i
∈
B
z
i
−
(
1
−
opt
)
,
where
opt
= (
2 cos
2
π
/8
+
1
)
/3
.
1
If
η
≥
0.5%
, say, they abort. If not, they announce their remaining
1
This corresponds to estimating the average amount by which t
he devices’ outputs in
B
differ from a maximal violation of a
2
input choices. Let
C
⊆ {
1, . . . ,
m
}
be the steps in which
(
a
i
,
b
i
) = (
2, 1
)
. We will call the rounds in
C
the
“check rounds”; outputs from the rounds
C
−
B
constitute the raw key. The users conclude by performing
standard information reconciliation and privacy amplifica
tion steps, extracting a key of length
κ
m
for some
κ
=
κ
(
η
,
ε
)
, where
ε
is the desired security parameter. (We refer to Figures 1 and
2 for a more detailed
description of the protocol.)
Theorem 1
(Informal)
.
Let
m
be a large enough integer and
ε
=
2
−
c
0
m
, where
c
0
>
0
is a small constant.
Given any pair of spatially isolated quantum devices
A
and
B
, the protocol described above generates a
shared key
K
of length
κ
m
, where
κ
≈
1.4%
, that is
ε
-secure: the probability that the users Alice and Bob
do not abort and that the adversary can obtain information ab
out the key is at most
ε
.
This informal statement hides a tradeoff between the parame
ters
ε
,
η
, and
κ
: the larger the security
parameter
ε
and the smaller the noise rate
η
, the higher the key rate
κ
. As
η
→
0
(provided
ε
is chosen large
enough) our proof guarantees a secure key rate
κ
≈
2.5%
, which with our setting of parameters corresponds
to about
15%
of the raw key. Conversely, the maximum noise rate for which w
e may extract a key of positive
length is
η
max
≈
1.2%
. This is worse than the optimal key rates obtained under the c
ausal independence
assumption [MPA11], but still quite reasonable.
1.2 Proof overview and techniques
We start with the observation that the randomness in the shar
ed secret key must necessarily be generated by
the two devices. Indeed, even though the users have the abili
ty to generate perfect random bits privately,
such bits cannot be used directly for the shared key, since an
y information transmitted about them is also
available to the adversary. It follows that a necessary cond
ition for DIQKD is that the users should be
able to use their untrusted devices to generate
certified
randomness — randomness they can guarantee was
not pre-encoded in the devices by the adversary, nor obtaine
d as some function of the users’ inputs to the
devices.
Luckily, the possibility of generating certified randomnes
s has already been investigated. Building on an
observation made in [Col06], Pironio et al. [PAM
+
10] devised a protocol in which the generation of random-
ness could be certified solely by testing for a sufficiently la
rge Bell inequality violation. In [FGS11, PM11]
it was further shown that the randomness generated was secur
e against an arbitrary classical adversary. Con-
currently, in [VV11] we gave a protocol that was secure even a
gainst a quantum adversary. This last protocol
provides us with a solid starting point for DIQKD, since our g
oal is to prove that the quantum adversary,
who may have fabricated the two devices, has no information a
bout the shared random key. Nevertheless,
extending this to DIQKD presents us with some serious new cha
llenges.
1. First, QKD is a task that involves two distant parties Alic
e and Bob. Any classical communication
between Alice and Bob must take place in the clear and is there
fore accessible to the adversary, thus
giving her additional power.
2. Second, in order to achieve QKD it is not sufficient just to g
enerate randomness — the point of QKD is
that Alice and Bob share the same random key. In our protocol t
his is accomplished by distinguishing
two different types of rounds: Bell rounds, in which the viol
ation of the CHSH inequality by the
devices is estimated, and check rounds, in which the devices
are supposed to produce identical outputs
from which the key will be generated. Unfortunately Alice an
d Bob must exchange information about
which rounds are which, and since the adversary has access to
all communicated classical information,
Bell inequality based on the CHSH inequality [CHSH69, BC90]
: see Section 2 for details.
3
this appears to render the Bell rounds pointless, since the a
dversary can ignore the Bell rounds and
attack only those rounds which are used to generate the key (t
he check rounds).
3. Finally, to be practical the protocol should tolerate noi
sy devices. As a result, the users can only
expect a non-maximal amount of correlation, both in the Bell
and check rounds. The randomness-
certification protocol from [VV11] did not tolerate any nois
e — in fact, the absence of noise played
a crucial role in the proof. As we already explained in the int
roduction, dealing with the presence of
noise is one of the major conceptual and technical hurdles of
the proof.
We now explain how our proof technique addresses these chall
enges. The proof proceeds in two steps.
As a first step, we argue that the following three conditions c
annot hold simultaneously in any single round
of the protocol: (i) the devices violate the CHSH inequality
, whenever the round was selected as a Bell
round (ii) the adversary can predict Bob’s output, whenever
the round was selected as a check round, and
(iii) the no-signalling condition is satisfied between all t
hree parties (Alice, Bob and the adversary). To
derive a contradiction from (i)–(iii) we use a simple concep
tual tool called the “guessing game”, which was
introduced in [VV11]. The main idea is that conditions (i) an
d (ii) imply that the adversary and Alice will
be able to team up to predict Bob’s output from their sole resp
ective input/output behavior, violating the
no-signalling condition (iii).
The second step is more challenging. All previous works on th
e subject reduced the general setting to a
single-round scenario similar to the one outlined above by r
equiring some form of independence assumption
on the devices or on the adversary’s attack. We do not use any s
uch assumption, and the main challenge is
to deal with correlations between all rounds and the adversa
ry in order to perform the reduction.
Our starting point is the existence of a pair of devices that p
ass the protocol with non-negligible prob-
ability, but such that the adversary may gain non-negligibl
e information about the secret key generated at
the end of the protocol. Our goal is to show the existence of a r
ound
i
0
of the protocol in which conditions
(i)–(iii) above are satisfied, thus deriving a contradictio
n.
Our argument has two main ingredients. The first ingredient i
s the so-called “quantum reconstruction
paradigm”, a technique that was introduced in [DV10] and fur
ther developed in [DPVR12, VV11]. What
this achieves is the following: any adversary able to obtain
non-negligible information about the generated
key can be transformed into a seemingly much stronger advers
ary: she can
predict
the entire string of
outputs of Bob’s device on the check rounds (the rounds used t
o generate the key). Furthermore, the success
probability of this “guessing measurement” is of the same or
der as the original distinguishing probability
but does not depend on the length of the key — a fact that will be
crucial to obtaining good parameters. In
order to achieve this, the new adversary requires access to t
he same public information as the original one,
together with a small number of additional “advice bits” tak
en from Bob’s string of outputs.
This stronger form of the adversary guarantees that conditi
on (ii) above holds in all rounds with small
but non-negligible probability. Furthermore, the checkin
g performed as part of the protocol ensures that
(i) also holds on average over all rounds, with probability o
f the same order. The natural idea in order to
identify a round
i
0
in which conditions (i) and (ii) hold simultaneously with hi
gh probability is to perform
conditioning: there must exist many rounds
i
such that, provided both conditions hold in rounds
1
to
i
−
1
,
they must hold in round
i
with high probability.
Such conditioning, however, presents a new difficulty: it ma
y introduce such correlations that condition
(iii) is no longer satisfied. Indeed, recall that one of the ma
in difficulties in analyzing the QKD protocol is
that the adversary has considerable power, due to the large a
mount of public information that is leaked by the
protocol — including the users’ complete choice of inputs. H
ence conditioning on a low probability event
involving the outcome of a measurement performed by the adve
rsary on her system introduces correlations
4
between inputs in all rounds. For instance, this conditioni
ng could very well force the inputs in round
i
0
to
be a particular pair, say
(
0, 0
)
, making the guarantees (i) and (ii) all but useless.
The difficulty is reminiscent of one encountered in the analy
sis of parallel repetition, where conditioning
on success in a subset of the parallel repeated games may intr
oduce correlations among the players in the
remaining games. Here, the situation is further complicate
d by the fact that it involves three parties involved
in a relatively complex interaction. In particular, the con
ditioning is performed jointly on an event involving
Alice and Bob (the CHSH violation observed in previous round
s being sufficiently large) on the one hand,
and Bob and Eve (Eve’s guess being correct) on the other.
The final step in our proof consists in bounding the amount of c
orrelation introduced by the conditioning.
For this we use tools from information theory, including the
chain rule for mutual information and the
quantum Pinsker’s inequality, which had not previously bee
n applied to this setting. (Similar tools were
already used by Holenstein in his derivation of a parallel re
petition theorem for the case of two-player
games with no-signalling players [Hol09].)
1.3 Perspective
We have not attempted to optimize the relationship between t
he parameters
κ
,
η
and
ε
describing the key
rate, the noise rate and the security parameter respectivel
y, and it is likely that the explicit dependency
stated in Theorem 8 can be improved by tightening our argumen
ts. It is an interesting question to find out
whether our approach can lead to a trade-off as good as the one
that has been shown to be achievable under
additional assumptions on the devices [MPA11]. One possibi
lity for improvement would be to bias the
users’ input distribution towards the pair of inputs
(
2, 1
)
from which the raw key is extracted, as was done
in e.g. [AMP06]: indeed, only a very small fraction of the rou
nds are eventually required to estimate the
violation of the CHSH condition.
Our proof crucially makes use of quantum mechanics to model t
he devices and the adversary. Can one
obtain a fully device-independent proof of security of QKD a
gainst adversaries that are only restricted by
the no-signalling principle? Barrett et al. [BCK12b] recen
tly showed that such security is achievable in
principle; however their protocol is highly inefficient and
does not tolerate noisy devices.
Organization of the paper.
We start with some preliminaries in Section 2, introducing o
ur notation, the
information-theoretic quantities that will be used. We als
o summarize the main parameters of our protocol,
which is described in Figures 1 and 2. In Section 3 we formally
state our result and outline the security proof.
The two main ingredients are the analysis of Protocol B, whic
h is given in Section 4, and the “quantum
reconstruction paradigm” introduced in Section 5. Finally
, Section 6 contains probabilistic and information-
theoretic lemmas used in some of the proofs.
Acknowledgments.
We thank Anthony Leverrier for many useful comments on a prel
iminary version of
this manuscript.
2 Preliminaries
We assume familiarity with basic concepts and standard nota
tion in quantum information, including den-
sity matrices and distance measures such as the trace distan
ce and the fidelity. We refer the reader to the
books [NC00, Wil11] for detailed introductions.
5
Notation.
We use roman capitals
A
,
B
, . . . ,
X
both to refer to random variables and the registers, classic
al
or quantum, that contain them. Calligraphic letters
A
,
B
, . . . ,
X
are used to refer to the underlying Hilbert
space.
D
(
X
)
denotes the set of density operators (non-negative matrice
s with trace
1
) on
X
. For an arbitrary
matrix
A
on
X
we let
k
A
k
1
=
Tr
√
AA
†
denote its Schatten
1
-norm.
ln
denotes the natural logarithm and
log
the logarithm in base
2
. For
x
∈
[
0, 1
]
,
H
(
x
) =
−
x
log
x
−
(
1
−
x
)
log
(
1
−
x
)
is the binary entropy
function.
Information theoretic quantities.
Given a density matrix
ρ
∈
D
(
A
)
, its von Neuman entropy is
H
(
ρ
)
:
=
−
Tr
(
ρ
ln
ρ
)
. For a classical-quantum state
ρ
X A
=
∑
x
p
x
|
x
ih
x
| ⊗
ρ
x
∈
D
(
X ⊗ A
)
, where for every
x
,
ρ
x
∈
D
(
A
)
, the conditional entropy is defined as
H
(
A
|
X
)
ρ
:
=
∑
x
p
x
H
(
ρ
x
)
. Given a state
ρ
ABX
, where
X
is classical, the conditional mutual information is
I
(
A
:
B
|
X
)
ρ
:
=
H
(
A
|
X
)
ρ
+
H
(
B
|
X
)
ρ
−
H
(
AB
|
X
)
ρ
.
We will use the following quantum analogue of the classical P
insker’s inequality (see e.g. Theorem 11.9.1
in [Wil11] for a proof): for any
ρ
AB
∈
D
(
AB
)
,
∥
∥
ρ
AB
−
ρ
A
⊗
ρ
B
∥
∥
2
1
≤
(
2 ln 2
)
I
(
A
:
B
)
ρ
.
(1)
The most important information measure in our context is the
quantum conditional min-entropy, first intro-
duced in [Ren05], and defined as follows.
Definition 2.
Let
ρ
AB
be a bipartite density matrix. The
min-entropy
of
A
conditioned on
B
is defined as
H
min
(
A
|
B
)
ρ
:
=
max
{
λ
∈
R
:
∃
σ
B
∈
D
(
B
)
s.t. 2
−
λ
Id
A
⊗
σ
B
≥
ρ
AB
}
.
We will often drop the subscript
ρ
when there is no doubt about the underlying state. The smooth
min-entropy is defined as follows.
Definition 3.
Let
ε
≥
0
and
ρ
AB
a bipartite density matrix. The
ε
-smooth min-entropy
of
A
conditioned on
B
is defined as
H
ε
min
(
A
|
B
)
ρ
:
=
max
̃
ρ
AB
∈
B
(
ρ
AB
,
ε
)
H
min
(
A
|
B
)
̃
ρ
,
where
B
(
ρ
AB
,
ε
)
is a ball of radius
ε
around
ρ
AB
.
2
The CHSH condition.
The security of our DIQKD protocol is based on the statistica
l verification that
the pair of devices used have an input/output behavior consi
stent with certain pre-determined correlations,
which are those expected of a “honest” quantum-mechanical p
air of devices performing the measurements
described below.
Let
A
and
B
designate two spatially isolated devices. In the protocol,
there are three possible choices of
inputs
x
∈ {
0, 1, 2
}
to
A
, and two possible inputs
y
∈ {
0, 1
}
to
B
. Each of the
6
possible pairs of inputs is
chosen with uniform probability
1/6
. The devices are required to produce outputs
a
,
b
∈ {
0, 1
}
respectively.
The users select a random subset of the rounds of the protocol
in which to evaluate the frequency with which
the following constraints are satisfied. In case both inputs
were in
{
0, 1
}
, the constraint on the outputs is the
CHSH parity constraint
a
⊕
b
=
x
∧
y
[CHSH69]. If the inputs are
(
2, 1
)
the constraint is that the outputs
2
Theoretically any distance measure could be used to define an
ε
-ball. As has become customary, we use the
purified distance
,
P
(
ρ
,
σ
)
:
=
√
1
−
F
(
ρ
,
σ
)
2
, where
F
(
·
,
·
)
is the fidelity.
6
(
a
,
b
)
should satisfy
a
⊕
b
=
0
. Finally, for the remaining pair of inputs
(
2, 0
)
all pairs of outputs are valid.
We will refer to this set of constraints collectively as “the
CHSH condition”. We note that the underlying
Bell inequality is similar to the so-called “chained inequa
lity” for two inputs [BC90].
Let
opt
be the maximum probability with which any two isolated devic
es, obeying the laws of quan-
tum mechanics, may produce outputs satisfying the CHSH cond
ition. It is not hard to show that
opt
=
(
2/3
)
cos
2
π
/8
+ (
1/3
)
, which is achieved using the following strategy. The device
s are initialized in
a single EPR pair
|
Ψ
i
= (
|
00
i
+
|
11
i
)
/
√
2
, each device holding one qubit. On input
0
,
A
performs a
measurement in the computational basis, and on input
1
it measures in the Hadamard basis. On input
0
,
B
measures in the computational basis rotated by
π
/8
. If
A
gets input
2
, or if
B
gets input
1
, they measure in
the computational basis rotated by
3
π
/8
. The devices may be used repeatedly, and honest devices perf
orm
measurements on a fresh EPR pair at each use.
Parameters.
For convenience, we summarize here the main parameters of th
e key distribution protocol
described in Figures 1 and 2.
•
m
is the total number of rounds in the protocol (in each round, a
n input to each of
A
,
B
is chosen, and
an output is collected).
•
B
are the “Bell rounds”, selected to perform parameter estima
tion. They are chosen uniformly at
random under the constraint that
|
B
|
=
γ
m
, for some
γ
>
0
specified in the protocol.
•
η
is the tolerated error rate: the protocol aborts as soon as th
e fraction of rounds in
B
satisfying the
CHSH condition is lower than
opt
−
η
.
•
C
⊆
[
m
]
are the “check rounds”. Those are rounds in which the inputs t
o
(
A
,
B
)
are
(
2, 1
)
. Since the
inputs are chosen uniformly at random, the number of check ro
unds
|
C
|
is highly concentrated around
m
/6
.
•
The target min-entropy rate
κ
. This is the rate of min-entropy that the users Alice and Bob e
xpect to
be present in the check rounds, provided the protocol did not
abort. Once information reconciliation
and privacy amplification have been performed, a secret key o
f length roughly
(
κ
−
H
(
2
η
))
|
C
|
will
be produced.
•
ε
is the security parameter: the statistical distance from un
iform of the extracted key (conditioned on
the eavesdropper’s side information). Precisely, if
K
denotes the system containing the extracted key,
we will obtain that
k
ρ
K
E
′
−
ρ
U
K
⊗
ρ
E
′
k
1
≤
ε
, where
E
′
is a register containing all the side information
available to an arbitrary quantum eavesdropper in the proto
col, and
ρ
U
K
is the totally mixed state on
as qubits as the key length.
3 Analysis of the key distribution protocol
The analysis of Protocol A, and the proof of Theorem 1, is perf
ormed in two steps. The first, main step
consists in proving a lower bound on the quantum smooth condi
tional min-entropy
H
ε
min
(
B
C
|
XYA
B
B
B
E
)
of the outputs obtained by Bob in the check rounds
C
(conditioned on the protocol not aborting). This lower
bound will depend on the maximal error rate
η
that is tolerated by the users in the sub-protocol B (see Fig-
ures 1 and 2 for a description of protocols A and B respectivel
y). Here the lower bound is taken conditioned
on the state of an arbitrary quantum adversary (whom we will c
all Eve and refer to indiscriminately as “the
7
Protocol A
1. Let
m
and
ε
,
η
>
0
be parameters given as input. Let
C
γ
be the constant from Theorem 8, and set
γ
= (
C
γ
/
η
2
)
ln
(
1/
ε
)
/
m
.
2. Alice and Bob run Protocol B for
m
steps, choosing inputs
x
∈ {
0, 1, 2
}
m
(resp.
y
∈ {
0, 1
}
m
) and
obtaining outcomes
a
∈ {
0, 1
}
m
(resp.
b
∈ {
0, 1
}
m
). Let
B
be the set of rounds that were chosen to
perform parameter estimation.
3. Alice and Bob publicly reveal their choices of inputs. Let
C
be the set of rounds
i
in which
(
x
i
,
y
i
) =
(
2, 1
)
. If
||
C
| −
m
/6
|
>
10
√
m
they abort the protocol.
4. Alice and Bob perform information reconciliation on thei
r outputs in
C
−
B
, which constitute the raw
key. For this, Bob sends a message of
ℓ
≤
H
(
2
η
)
|
C
|
+
log
(
2/
ε
)
bits to Alice.
5. Let
κ
=
κ
(
η
)
be as specified in Theorem 8. Alice and Bob perform privacy amp
lification using e.g.
two-universal hashing, extracting a shared key of length
(
κ
−
H
(
2
η
)
−
O
(
log
(
1/
ε
)
/
m
))
|
C
|
from
the common
(
|
C
| − |
B
|
)
-bit string they obtained at the end of the previous step.
Figure 1: The device-independent key distribution protoco
l, Protocol A
Protocol B
1. Let
m
,
γ
and
η
be parameters given as input.
2. Repeat, for
i
=
1, . . . ,
m
:
2.1 Alice picks
x
i
∈ {
0, 1, 2
}
, and Bob picks
y
i
∈ {
0, 1
}
, uniformly at random. They input
x
i
,
y
i
into their respective device, obtaining outputs
a
i
,
b
i
∈ {
0, 1
}
respectively.
3. Alice chooses a random subset
B
⊆
[
m
]
of size
γ
m
and shares it publicly with Bob. Alice and
Bob announce their input/output pairs in
B
, and compute the fraction of pairs satisfying the CHSH
condition. Let
(
opt
−
η
′
)
be this fraction. If
η
′
>
η
they abort the protocol.
Figure 2: Theorem 8 shows that, at the end of protocol B, the bi
ts
B
C
generated by Bob’s device in the
check rounds
C
both have high smooth min-entropy, conditioned on the adver
sary’s arbitrary quantum side
information.
8
adversary” or “the eavesdropper”) in the protocol, who has a
ccess to the information
X
,
Y
,
A
B
,
B
B
revealed
publicly in the course of the protocol, as well as to a quantum
system
E
which may be correlated with the
systems
A
,
B
of the devices. Such an estimate is stated in Theorem 8 in Sect
ion 3.3 below.
The second step consists in showing that there exists approp
riate protocols for the information reconcil-
iation and privacy amplification steps, Steps 4 and 5 in Proto
col A respectively, such that the lower bound
on the conditional min-entropy from the first step guarantee
s the security (distance from uniform from the
point of view of the adversary) and correctness (Alice and Bo
b should obtain the same key) of the key that
is extracted. This step is standard, and all the ingredients
required already appear in the literature. We
summarize the result as Lemma 4 in Section 3.2 below.
Theorem 1 follows immediately by combining Theorem 8 and Lem
ma 4.
3.1 Probability space
Before stating and proving formally our results, we formall
y define the random variables and events that
will be used in their proof.
Modeling the devices.
Fix a pair of spatially isolated devices
(
A
,
B
)
. Device
A
takes inputs in
{
0, 1, 2
}
,
and device
B
takes inputs in
{
0, 1
}
. Whenever provided an input, each device produces an output
in
{
0, 1
}
.
The devices may be used repeatedly. We will assume that the pa
ir
(
A
,
B
)
can be described by quantum
mechanics: the devices are modeled by a pair of quantum regis
ters; when provided an input each device
performs a measurement on the state contained in the corresp
onding subsystem.
We assume that user Alice holds
A
, and Bob is given
B
. In addition, there is an adversary Eve who
holds an additional quantum register
E
, initialized in a state arbitrarily correlated with that of
A
and
B
. Let
ρ
A
1
B
1
E
be the density matrix describing the joint state of all three
registers at the start of the protocol.
We define the following random variables and events.
X
∈ {
0, 1, 2
}
m
and
Y
∈ {
0, 1
}
m
are two uni-
formly distributed random variables, used to represent the
inputs to
A
,
B
respectively, as chosen in the
protocol.
A
,
B
∈ {
0, 1
}
m
are random variables denoting the outputs produced by the de
vices, when se-
quentially provided their respective inputs
X
,
Y
. We will always use
C
⊆
[
m
]
to denote the set of “check”
rounds, in which
(
X
i
,
Y
i
) = (
2, 1
)
, and
B
⊆
[
m
]
the set of “Bell” rounds chosen by Alice and Bob to
perform parameter estimation.
Let
ρ
A
i
B
i
denote the reduced state of devices
A
and
B
in the
i
-th round of the protocol (before they have
been provided their
i
-th input). Formally,
ρ
A
i
B
i
∝
(
∏
j
<
i
M
A
j
X
j
⊗
N
B
j
Y
j
)
ρ
A
1
B
1
(
∏
j
<
i
(
M
A
j
X
j
)
†
⊗
(
N
B
j
Y
j
)
†
)
,
where
{
M
A
j
X
j
}
and
{
N
B
j
Y
j
}
are the Kraus operators corresponding to the measurement pe
rformed by devices
A
and
B
in round
j
respectively, and
ρ
A
i
B
i
is normalized. Here
ρ
A
1
B
1
=
Tr
E
(
ρ
A
1
B
1
E
)
is the reduced state
of the devices at the start of the protocol. It is important to
note that for any
i
the state
ρ
A
i
B
i
may depend on
a measurement that is performed on system
E
as soon as a particular outcome of that measurement is fixed.
Measuring the CHSH condition.
Given a set
S
⊆
[
m
]
and
δ
>
0
, CHSH
AB
(
S
,
δ
)
is the event that the
tuple
(
X
,
Y
,
A
,
B
)
satisfies the CHSH condition (as described in Section 2) in a f
raction at least
opt
−
δ
of
the rounds indicated by
S
. If
S
is omitted, CHSH
AB
(
δ
) =
CHSH
AB
([
m
]
,
δ
)
. Letting
Z
∈ {
0, 1
}
m
be the
9
indicator random variable of the CHSH condition
not
being satisfied in any given round, we can write
CHSH
AB
(
S
,
δ
)
≡
{
1
|
S
|
∑
i
∈
S
Z
i
≤
(
1
−
opt
) +
δ
}
.
We also define VIOL
AB
(
i
)
, where
i
∈
[
m
]
, to express the expected amount by which the CHSH condition
in round
i
is satisfied:
VIOL
AB
(
i
) =
E
[
Z
i
]
−
(
1
−
opt
)
,
where here the expectation is taken over the choice of inputs
(
X
i
,
Y
i
)
in round
i
, and over the randomness
in the devices’ own measurements in round
i
. Note that VIOL
AB
(
i
)
implicitly depends on the specific state
of the devices in round
i
, which may be affected by previous input and outputs obtaine
d in the protocol
as well as on other events that may be conditioned on. Hence th
e expression
Pr
(
VIOL
AB
(
i
)
<
δ
|
E
)
, for
some event
E
, indicates the average probability, over all possible
e
∈
E
, that the devices satisfy the CHSH
condition in round
i
with probability at least
opt
−
δ
, provided their inputs are distributed according to the
conditional distribution
(
X
i
,
Y
i
)
|
E
=
e
, and when performed on the post-measurement state of
A ⊗ B
in
round
i
conditioned on
E
=
e
. For any
δ
>
0
we let VIOL
AB
(
δ
)
be the event that
(
1/
m
)
∑
i
VIOL
AB
(
i
)
≤
δ
.
The adversary.
We introduce additional random variables that depend on the
adversary Eve, holding the
quantum register
E
. The adversary is described in Lemma 9 below; to understand t
he events below it may
be useful to read that lemma’s statement first.
Let
E
∈ {
0, 1
}
|
C
|
be the random variable that describes the outcome of the meas
urement on
E
described
in Lemma 9. Note that this outcome depends on the “advice” tha
t is given to the adversary. We use
ˆ
X
,
ˆ
Y
to
denote the inputs that are given to the adversary, and
ˆ
A
DV
∈ {
0, 1
}
α
m
to denote the additional advice bits.
These random variables need not equal the actual values
X
,
Y
,
A
DV
: in general, the adversary’s measure-
ment is well-defined for any given advice bits, and
E
is used to denote its outcome irrespective of whether
the advice given was “correct” or not. For any
i
∈
[
m
]
, define GUESS
BE
(
i
)
∈ {
0, 1
}
to be
1
if and only if,
either
i
∈
C
and
E
i
=
B
i
, or
i
/
∈
C
, and let GUESS
BE
=
∧
i
GUESS
BE
(
i
)
.
3.2 Information reconciliation and privacy amplification
For convenience, we let
E
′
:
=
XYA
B
B
B
E
denote the side information available to the eavesdropper.
We
show the following lemma, whose proof follows from standard
arguments in the analysis of QKD protocols
(see e.g. [Ren05]). We provide the relevant details below.
Lemma 4.
Let
γ
,
ε
>
0
. Let
ε
′
=
2
e
−
γ
|
C
|
/400
. Suppose that, after Step 2 of Protocol A, the condition
H
ε
min
(
B
C
|E
′
)
≥
κ
|
C
|
is satisfied. Then with probability at least
1
−
ε
′
, at the end of the protocol Alice and
Bob have a common shared key that is
2
ε
-close to uniform and has length
H
ε
min
(
B
C
|E
′
)
−
H
(
1.1
η
)
|
C
| −
4 log
(
1/
ε
)
.
Information reconciliation.
We first analyze the information reconciliation step. The fo
llowing lemma
states the conditions that are required for there to exist a s
atisfactory information reconciliation procedure.
Lemma 5
(Lemma 6.3.4 in [Ren05])
.
Let
A
,
B
∈ {
0, 1
}
k
be two random variables, and
ε
>
0
. Suppose
Alice holds
A
, and Bob holds
B
. There is an information reconciliation protocol in which B
ob communicates
ℓ
≤
H
ε
max
(
B
|
A
) +
log
(
2/
ε
)
bits of information about
B
to Alice and is such that with probability at least
1
−
ε
Alice and Bob both know
B
at the end of the protocol.
10
To apply Lemma 5 it suffices to prove an upper bound on the condi
tional max-entropy
H
ε
max
(
B
C
|
A
C
)
.
By definition of the rounds
C
, the CHSH condition in those rounds imposes that
A
i
=
B
i
for all
i
∈
C
.
Hence, were it not for errors, we would have
H
ε
max
(
B
|
A
) =
0
. The following claim shows that the bound
on the error rate that results from the estimation performed
in the rounds
B
in Step 3 of Protocol B is enough
to guarantee a good upper bound on the conditional max-entro
py.
Claim 6.
Suppose Alice and Bob do not abort after Step 3 in Protocol B. L
et
C
be the set of check rounds,
as designated in Step 4 of Protocol A. Then
H
ε
′
max
(
B
C
|
C
C
)
≤
H
(
1.1
η
)
|
C
|
, where
ε
′
=
2
e
−
γ
|
C
|
/400
.
Proof.
Fix the set
C
. The set
B
chosen by Alice and Bob to perform parameter estimation cont
ains a fraction
at least
γ
/2
of the rounds in
C
, except with probability at most
e
−
γ
|
C
|
/8
. The protocol is aborted as soon as
more than an
η
fraction of those rounds are such that
a
i
6
=
b
i
. Hence with probability at least
1
−
e
−
γ
|
C
|
/200
the total fraction of errors in
C
is at most
1.1
η
. In particular, with probability at least
1
−
e
−
γ
|
C
|
/400
over
A
C
, with probability at least
1
−
e
−
γ
|
C
|
/400
,
B
C
will take on at most
2
H
(
1.1
η
)
|
C
|
values.
Privacy amplification.
The following lemma states the existence of a good protocol f
or privacy amplifi-
cation.
Lemma 7
(Lemma 6.4.1 in [Ren05])
.
Suppose the information reconciliation protocol requires
at most
ℓ
bits of communication. Then for any
ε
>
0
there is a privacy amplification protocol based on two-unive
rsal
hashing which extracts
H
ε
min
(
B
C
|E
′
)
−
ℓ
−
2 log
(
1/
ε
)
bits of key.
Lemma 4 now follows directly by combining Claim 6 with Lemma 7
and the assumption on the condi-
tional min-entropy placed in the lemma.
3.3 A lower bound on the conditional min-entropy
The main result of this section is a lower bound on the conditi
onal smooth min-entropy H
ε
min
(
B
C
|
XYA
B
B
B
E
)
of the raw key.
Theorem 8.
Let
η
>
0
be given. There exists positive constants
C
ε
,
C
γ
(possibly depending on
η
) such
that the following hold. Let
m
be an integer and
ε
≥
e
−
C
ε
m
be given. Let
γ
= (
C
γ
/
η
2
)
ln
(
1/
ε
)
/
m
be as
specified in Protocol A (Figure 1). Let
κ
be any constant such that
κ
<
(
√
2
−
1
)
/
(
4 ln
(
2
))
−
(
4/ ln
(
2
))
η
.
Suppose that the devices
A
,
B
are such that with probability at least
ε
the protocol does not abort.
Let
E
be an auxiliary system held by an eavesdropper, who may also l
earn
(
X
,
Y
)
and
(
A
B
,
B
B
)
. Then,
conditioned on the protocol not aborting, it holds that
H
ε
min
(
B
C
|
XYA
B
B
B
E
)
≥
κ
|
C
| −
O
(
ln
(
1/
ε
)
)
.
We note that the precise relation between the parameters
κ
and
η
stated in the theorem is the one that
we obtain from our proof; however we have not attempted to opt
imize it fully and it is likely that one may
be able to derive a better dependency. It is also clear from th
e proof that one may trade off the different
constants between each other, depending on whether one is in
terested in the maximum possible key rate in
the presence of very small noise, or to the opposite if one wis
hes to tolerate as much noise as possible.
The proof of Theorem 8 is based on three lemmas. We state the le
mmas first, and derive the theorem
from them below.
11
3.3.1
The reconstruction lemma
Our first lemma states that, if the min-entropy condition in t
he conclusion of the theorem is not satisfied,
then there must exist a measurement on the system
E
, depending on
X
,
Y
,
A
B
and
B
B
, together with some
additional “advice” bits of information about
B
C
, whose outcome
E
∈ {
0, 1
}
|
C
|
agrees with
B
C
with non-
negligible probability.
Lemma 9.
Let
κ
>
0
and suppose that
H
ε
min
(
B
C
|
XYA
B
B
B
E
)
<
κ
|
C
|
. Then there exists an
α
=
κ
|
C
|
/
m
+
2
γ
+
O
(
log
(
m
/
ε
)
/
m
)
and a function
f
:
{
0, 1
}
|
C
|
→ {
0, 1
}
(
α
−
2
γ
)
m
such that, given the bits
A
DV
=
f
A
DV
(
B
C
)
A
B
B
B
∈ {
0, 1
}
α
m
together with the inputs
X
,
Y
, there exists a measurement on
E
that outputs a
string
e
∈ {
0, 1
}
|
C
|
such that with probability (over the randomness in
B
and in the measurement) at least
C
E
(
ε
/
m
)
6
, where
C
E
is a universal constant, the equality
e
=
b
C
holds.
The proof of Lemma 9 is based on a “reconstruction”-type argu
ment from [DPVR12]. A very similar
argument was already used to establish an analogous lemma in
[VV11]. We give the proof of Lemma 9 in
Section 5.
3.3.2
Existence of a good round
Our second lemma states the existence of a “good” round
i
0
∈
[
m
]
in which both the CHSH condition is
satisfied, and the outcome
E
i
0
of the measurement described in Lemma 9 agrees with
B
i
0
, with good prob-
ability. Note also the additional condition (2) in the lemma
, which states that systems
A
and
B
are each
close to being independent from the random variables
X
i
0
,
Y
i
0
describing the choice of inputs in round
i
0
.
This condition is necessary for condition (3), on the CHSH vi
olation, to be of any use: indeed, without (2)
it could in principle be that the conditioning on specific out
comes in previous rounds, including the adver-
sary’s outcomes, completely fixes the choice of inputs in the
i
0
-th round. Conditions (2)–(4) in the lemma
correspond to conditions (i)–(iii) discussed in Section 1.
2.
Eq. (2) implies that the distribution that arises from the de
vices’ measurements on the states
ρ
A
i
0
B
i
0
is,
while not necessarily quantum, still no-signalling, and th
is is all that is required for the application of the
guessing lemma, Lemma 11 below. As explained in the introduc
tion, proving this condition is an impor-
tant point of departure of our proof from previous approache
s, which used an assumption of independence
between the devices or a limitation of the adversary in order
to automatically obtain that (an even stronger
form of) the condition held in all rounds without requiring a
ny conditioning.
We refer to Section 3.1 for a description of the events CHSH
AB
and VIOL
AB
appearing in the statement
of the lemma.
Lemma 10.
Let
ˆ
A
DV
be uniformly distributed in
{
0, 1
}
α
m
, and
η
,
ε
>
0
be such that the following holds:
Pr
(
CHSH
AB
(
η
)
∧
GUESS
BE
|
A
DV
=
ˆ
A
DV
)
≥
ε
,
and let
α
=
|
A
DV
|
/
m
. Then there exists a universal constant
C
ν
>
0
, a
ν
≤
C
ν
√
log
(
1/
ε
)
/
m
, an
i
0
∈
[
m
]
and a set
G
i
0
⊆
(
{
0, 1, 2
} × {
0, 1
} × {
0, 1
}
3
)
i
0
−
1
such that for every
(
x
,
y
,
a
,
b
,
e
)
∈
G
i
0
, there is
12
a choice of
ˆ
x
>
i
0
,
ˆ
y
>
i
0
and an
ˆ
A
DV
consistent with
((
x
,
ˆ
x
>
i
0
)
,
(
y
,
ˆ
y
>
i
0
)
,
a
,
b
)
such that the following hold:
max
{
∥
∥
∥
ρ
A
i
0
X
i
0
Y
i
0
−
ρ
A
i
0
⊗
(
1
6
∑
x
,
y
|
x
,
y
ih
x
,
y
|
)
∥
∥
∥
1
,
∥
∥
∥
ρ
B
i
0
X
i
0
Y
i
0
−
ρ
B
i
0
⊗
(
1
6
∑
x
,
y
|
x
,
y
ih
x
,
y
|
)
∥
∥
∥
1
}
≤
ν
,
(2)
VIOL
AB
(
i
0
)
≤
3
η
+
ν
,
(3)
Pr
(
GUESS
BE
(
i
0
))
≥
1
−
12 ln
(
2
)
α
−
ν
,
(4)
where in
(2)
the state
ρ
A
i
0
B
i
0
X
i
0
Y
i
0
is the (normalized) state of the corresponding systems in ro
und
i
0
, con-
ditioned on
(
x
,
y
,
a
,
b
,
e
)
, and similarly in
(3)
and
(4)
the violation is estimated conditioned on previous
input/outputs to the devices being
(
x
,
y
,
a
,
b
)
, and on Eve making her measurement based on the inputs
(
x
, 2,
ˆ
x
>
i
0
)
and
(
y
, 1,
ˆ
y
>
i
0
)
and advice string
ˆ
A
DV
, and obtaining outcomes
e
as her prediction in rounds
C
∩ {
1, . . . ,
i
0
−
1
}
.
The proof of Lemma 10 in given in Section 4.
3.3.3
The guessing lemma
We state the last lemma required for the proof of Theorem 8. A s
imilar lemma already appeared in [VV11].
Here we give a slightly more general version of the lemma stat
ed in a form that can be directly used in the
proof of the theorem.
Lemma 11
(Guessing lemma)
.
Let
δ
,
ν
,
η
>
0
. Suppose given six bipartite states
ρ
xy
AB
, where
x
∈ {
0, 1, 2
}
,
y
∈ {
0, 1
}
, such that the following hold:
1. If
ρ
A
= (
1/6
)
∑
xy
Tr
B
(
ρ
xy
AB
)
and
ρ
B
= (
1/6
)
∑
xy
Tr
A
(
ρ
xy
AB
)
,
1
6
∑
x
,
y
∥
∥
ρ
A
−
ρ
xy
A
∥
∥
1
≤
ν
and
1
6
∑
x
,
y
∥
∥
ρ
B
−
ρ
xy
B
∥
∥
1
≤
ν
,
(5)
2. There exists observables
A
x
=
A
0
x
−
A
1
x
,
B
y
=
B
0
y
−
B
1
y
on
A
,
B
respectively that satisfy
1
4
(
Tr
(
(
A
0
⊗
B
0
)
ρ
00
AB
)
+
Tr
(
(
A
0
⊗
B
1
)
ρ
01
AB
)
+
Tr
(
(
A
1
⊗
B
0
)
ρ
10
AB
)
−
Tr
(
(
A
1
⊗
B
1
)
ρ
11
AB
)
)
≥
√
2
2
−
η
,
3. Bob’s measurement
B
1
produces outcome
b
1
∈ {
0, 1
}
with probability
1
−
δ
, when performed on his
share of
ρ
21
AB
:
Tr
((
Id
⊗
B
b
1
1
)
ρ
21
AB
)
≥
1
−
δ
.
Then the condition
δ
≥
(
√
2
−
1
2
−
η
)
−
75
ν
must hold.
13
Proof.
For every
(
a
,
b
,
x
,
y
)
∈ {
0, 1
}
2
× {
0, 1, 2
} × {
0, 1
}
let
p
(
a
,
b
|
x
,
y
)
:
=
Tr
((
A
a
x
⊗
B
b
y
)
ρ
xy
AB
)
. Con-
dition (5) implies that the distribution
p
is approximately no-signalling, in the following sense: on
average
over the choice of a uniformly random pair
(
x
,
y
)
, the statistical distance
1
6
∑
x
,
y
∑
a
∣
∣
∣
∑
b
p
(
a
,
b
|
x
,
y
)
−
1
2
∑
y
′
(
∑
b
p
(
a
,
b
|
x
,
y
′
)
)
∣
∣
∣
≤
1
6
∑
x
,
y
∑
a
∣
∣
Tr
(
(
A
a
x
⊗
Id
)(
ρ
xy
AB
−
ρ
x
AB
)
)
∣
∣
≤
1
6
∑
x
,
y
∥
∥
ρ
xy
AB
−
ρ
x
AB
∥
∥
1
≤
2
ν
,
and a similar bound holds for the marginals on
B
. Lemma 9.5 in [Hol09] implies that there exists a distribu-
tion
q
(
a
,
b
|
x
,
y
)
such that
q
is (perfectly) no-signalling, and moreover, on average ove
r
(
x
,
y
)
the statistical
distance
k
p
(
·
,
·|
x
,
y
)
−
q
(
·
,
·|
x
,
y
)
k
1
≤
10
ν
. In particular, the second assumption in the lemma implies t
hat
the distribution
q
must violate the CHSH inequality by at least
√
2/2
−
η
−
15
ν
, and the third assump-
tion implies that
∑
a
q
(
a
, 1
|
2, 1
)
≥
1
−
δ
−
60
ν
. Applying the bound (A.11) derived in the supplementary
information to [PAM
+
10] with
I
/4
=
√
2/2
−
η
−
15
ν
we obtain the inequality claimed in the lemma.
3.3.4
Proof of Theorem 8
We give the proof of Theorem 8, assuming the lemmas stated in t
he three previous subsections.
Proof of Theorem 8.
Let
(
X
,
Y
,
A
,
B
)
be random variables describing Alice and Bob’s choice of inp
uts to
A
and
B
respectively, and the outputs obtained, in an execution of P
rotocol A. Let
E
=
E
(
ˆ
A
DV
)
be the random
variable that describes the outcome of the measurement on
E
described in Lemma 9, when the advice bits
ˆ
A
DV
are selected uniformly at random (independently from
A
and
B
). Denote by A
DV
=
f
A
DV
(
B
C
)
A
B
B
B
the “correct” advice bits.
The proof proceeds by contradiction. Assume that there exis
ted a pair of devices
(
A
,
B
)
such that
Pr
(
CHSH
AB
(
B
,
η
)
)
≥
ε
,
H
ε
min
(
B
C
|
XYA
B
B
B
E
)
<
κ
|
C
|
,
(6)
where
ε
,
η
,
κ
are as in the statement of the theorem. Denote GUESS
BE
(
ˆ
A
DV
)
the event that
E
=
B
C
. Using
Lemma 9, we deduce from (6) that the following must hold:
Pr
(
CHSH
AB
(
B
,
η
)
∧
GUESS
BE
(
ˆ
A
DV
)
|
ˆ
A
DV
=
A
DV
)
=
Pr
(
GUESS
BE
(
ˆ
A
DV
)
|
CHSH
AB
(
B
,
η
)
,
ˆ
A
DV
=
A
DV
)
·
Pr
(
CHSH
AB
(
B
,
η
)
|
ˆ
A
DV
=
A
DV
)
≥
C
E
(
ε
/
m
)
6
·
ε
,
(7)
where
C
E
is the constant from Lemma 9. Since the rounds
B
are chosen uniformly at random, Claim 12
below states that, for any
0
≤
β
≤
1
:
Pr
(
CHSH
AB
((
1
+
β
)
η
)
|
CHSH
AB
(
B
,
η
)
)
≥
1
−
e
−
2
β
2
η
2
γ
m
,
(8)
where
γ
=
|
B
|
/
m
. Choose
β
=
1/3
, and let
η
′
:
=
4
η
/3
. Provided
C
γ
is chosen large enough, the choice of
γ
made in the theorem is such that
γ
≥
log
(
2
m
6
/
C
E
ε
7
)
/
((
2/9
)
η
2
m
)
, so that
e
−
2
β
2
η
2
γ
m
≤
C
E
ε
7
/
(
2
m
6
)
.
Hence we obtain the following by combining (7) and (8):
Pr
(
CHSH
AB
(
η
′
)
∧
GUESS
BE
(
ˆ
A
DV
)
|
ˆ
A
DV
=
A
DV
)
≥
C
E
(
ε
7
/
(
2
m
6
)) =
:
ε
′
.
(9)
14
We may now apply Lemma 10. Let
ν
=
C
ν
√
log
(
1/
ε
′
)
/
m
, and
i
0
∈
[
m
]
be the “good” round that is
promised by the lemma. We proceed to show that the existence o
f such a round leads to a contradiction by
appealing to the guessing lemma, Lemma 11.
Consider the following setup. Alice, Bob and Eve prepare the
ir devices by selecting a random string of
inputs
ˆ
x
,
ˆ
y
for Eve, except that
ˆ
x
i
0
=
2
and
ˆ
y
i
0
=
1
always. Eve guesses the advice bits
ˆ
A
DV
at random
and makes a prediction
E
=
e
. Alice and Bob then use their devices up to round
i
0
−
1
by choosing inputs
(
x
<
i
0
,
y
<
i
0
) = (
ˆ
x
<
i
0
,
ˆ
y
<
i
0
)
. They verify that the resulting outputs
a
<
i
0
,
b
<
i
0
are such that
(
x
<
i
0
,
y
<
i
0
,
a
<
i
0
,
b
<
i
0
,
e
<
i
0
)
∈
G
i
0
;
if not they abort. Upon having succeeded in this conditionin
g they separate and play the guessing game.
Alice holds system
A
, while Bob holds system
B
.
Lemma 10 shows that all conditions in Lemma 11 are satisfied: a
s a result, it must be that
12 ln
(
2
)
α
+
ν
≥
(
√
2
−
1
2
−
6
η
′
−
2
ν
)
−
75
ν
.
By definition, provided the constant
C
ν
is large enough we have
α
≤
κ
/6
+
2
γ
+
ν
, where we used that
|
C
| ≤
m
/6
+
10
√
m
=
m
/6
+
O
(
√
ln
(
1/
ε
))
, as enforced in the protocol, and
η
′
=
4/3
η
. Re-arranging
terms and using the definition of
ν
and
γ
we obtain the condition
κ
>
√
2
−
1
4 ln
(
2
)
−
4
ln
(
2
)
η
−
O
(
log
(
1/
ε
)
η
2
m
)
,
which, given the choice of
κ
made in the theorem, is a contradiction provided
C
ε
is chosen small enough.
Claim 12.
Let
η
,
γ
>
0
. The following holds for any
0
≤
β
≤
1
:
Pr
S
(
CHSH
((
1
+
β
)
η
)
|
CHSH
(
S
,
η
)
)
≥
1
−
e
−
2
β
2
η
2
γ
m
,
where the probability is taken over the choice of a random sub
set
S
⊆
[
m
]
of size
|
S
|
=
γ
m
.
Proof.
Consider a given run of the protocol. Suppose that the fracti
on of rounds in which the CHSH
condition is not satisfied is at least
(
1
−
opt
) + (
1
+
β
)
η
. By a standard Chernoff bound, a randomly
chosen set
S
⊆
[
m
]
will of size
γ
m
will have at least
((
1
−
opt
) +
η
)
γ
m
of its rounds with inputs
corresponding to the CHSH condition being violated, except
with probability at most
e
−
2
β
2
η
2
γ
m
.
4 Proof of Lemma 10
This section is devoted to the proof of Lemma 10. Let
D
be the event CHSH
AB
(
η
)
∧
GUESS
BE
: the main
assumption of the lemma states that
Pr
(
D
|
A
DV
=
ˆ
A
DV
)
≥
ε
. We first prove two preliminary claims which
establish that, provided
ε
is not too small, conditioning on
D
does not affect either the distribution of inputs
(
X
i
,
Y
i
)
or the reduced density matrices of the inner state of each dev
ice’s system in most rounds
i
by too
much.
Claim 13.
Suppose that, in Protocol B, Alice and Bob choose inputs
(
X
,
Y
)
∈ {
0, 1, 2
}
m
× {
0, 1
}
m
uni-
formly at random, obtaining outcomes
A
,
B
∈ {
0, 1
}
m
. Suppose that
E
is measured using Eve’s guessing
15
measurement (as described in Lemma 9) with inputs
(
ˆ
X
,
ˆ
Y
) = (
X
,
Y
)
and advice bits
ˆ
A
DV
=
A
DV
, re-
sulting in an outcome
E
∈ {
0, 1
}
|
C
|
. Let
P
X
i
Y
i
be the marginal distribution of the inputs in the
i
-th round,
conditioned on
(
X
<
i
,
Y
<
i
,
A
<
i
,
B
<
i
,
E
<
i
) = (
x
<
i
,
y
<
i
,
a
<
i
,
b
<
i
,
e
<
i
)
∈
D
<
i
, the projection of
D
on the first
(
i
−
1
)
coordinates. Then the following bound holds on expectation
over
(
x
<
i
,
y
<
i
,
a
<
i
,
b
<
i
,
e
<
i
)
:
1
m
∑
i
∥
∥
P
X
i
Y
i
−
U
3
×
2
∥
∥
1
≤
√
log
(
1/
ε
)
2
m
,
where
U
3
×
2
is the uniform distribution on
{
0, 1, 2
} × {
0, 1
}
.
Proof.
The Shannon entropy
H
(
X
,
Y
) =
log
(
6
)
m
, and conditioned on
D
,
H
(
X
,
Y
|
D
)
≥
log
(
6
)
m
−
log
(
1/
ε
)
. Applying the chain rule,
1
m
∑
i
H
(
X
i
,
Y
i
|
X
<
i
,
Y
<
i
,
D
<
i
)
≥
log
(
6
)
−
log
(
1/
ε
)
m
.
Using the classical Pinsker’s inequality as
k
P
X
i
Y
i
−
U
3
×
2
k
1
≤
√
(
log
(
6
)
−
H
(
X
i
,
Y
i
))
/2
and Jensen’s
inequality we get
1
m
∑
i
∥
∥
P
X
i
Y
i
−
U
3
×
2
∥
∥
1
≤
√
log
(
1/
ε
)
2
m
,
proving the claim.
The fact that
D
depends both on the choice of inputs
(
X
,
Y
)
and on the adversary’s measurement out-
come implies that conditioning on
D
could not only bias the distribution of
(
X
,
Y
)
but also introduce cor-
relations between
(
X
,
Y
)
and the reduced state
ρ
AB
of the devices. The following claim shows that, if
D
is an event with large enough probability, the correlations
introduced by this conditioning do not affect the
reduced state on either
A
or
B
by too much, for most rounds
i
.
Claim 14.
Consider the same situation as described in Claim 13. Let
ρ
A
i
X
i
Y
i
denote the reduced den-
sity of the joint state of systems
A
(in round
i
) and
X
i
,
Y
i
, conditioned on
(
X
<
i
,
Y
<
i
,
A
<
i
,
B
<
i
,
E
<
i
) =
(
x
<
i
,
y
<
i
,
a
<
i
,
b
<
i
,
e
<
i
)
∈
D
<
i
. Then the following holds on expectation over
(
x
<
i
,
y
<
i
,
a
<
i
,
b
<
i
,
e
<
i
)
:
1
m
∑
i
∥
∥
∥
ρ
A
i
X
i
Y
i
−
ρ
A
i
⊗
(
1
6
∑
x
,
y
|
x
,
y
ih
x
,
y
|
)
∥
∥
∥
1
≤
4
√
log
(
1/
ε
)
/
m
.
(10)
Moreover, the same bound holds when
A
i
is replaced by
B
i
.
Proof.
We use Claim 27. Alice’s sequential measurements are taken t
o be the ones performed on
A
, while
Bob’s measurement is the combination of the measurements on
B
, together with Eve’s measurement, on
inputs
X
,
Y
and advice bits
ˆ
A
DV
=
A
DV
obtained from
B
. We set
X
in the claim to be
XY
here, and the
outcomes
B
in the claim to
BE
here. Together with the assumption
Pr
(
D
|
ˆ
A
DV
=
A
DV
)
≥
ε
, the claim
shows that
1
m
∑
i
I
(
A
i
;
X
i
Y
i
|
D
<
i
)
ρ
A
i
X
i
Y
i
≤
log
(
1/
ε
)
m
.
Using Pinsker’s inequality (1) together with Jensen’s ineq
uality,
1
m
∑
i
∥
∥
∥
ρ
A
i
X
i
Y
i
−
ρ
A
i
⊗
(
1
6
∑
xy
|
x
,
y
ih
x
,
y
|
)
∥
∥
∥
1
≤
4
√
log
(
1/
ε
)
/
m
,
where we used Claim 13 to show that the marginal distribution
of
(
X
i
,
Y
i
)
is close to uniform on
{
0, 1, 2
} ×
{
0, 1
}
, even conditioned on
D
<
i
.
16
The following claim replaces the event that the CHSH conditi
on is satisfied in a large fraction of rounds
by the event that their exists many rounds in which the CHSH co
ndition is
likely
to be satisfied (when
evaluated on the state of the devices in that round).
Claim 15.
There exists a set
T
⊆
[
m
]
such that
|
T
| ≥
2
m
/3
, and a subset
D
′
⊆
D
such that
Pr
(
D
′
|
D
)
≥
1/2
and for every
i
∈
T
, conditioned on
ˆ
A
DV
=
A
DV
and on inputs and outputs to the devices in rounds
prior to
i
being in
D
′
, the condition
VIOL
AB
(
i
)
≤
3
η
+
6
√
ln
(
1/
ε
)
/
m
holds.
Proof.
Let
Z
i
∈ {
0, 1
}
be
1
if and only if the CHSH condition is not satisfied in round
i
. By definition,
E
[
Z
i
] = (
1
−
opt
) +
VIOL
AB
(
i
)
. Let
W
i
=
E
[
Z
i
]
−
Z
i
and
W
≤
i
=
W
1
+
· · ·
+
W
i
.
(
W
≤
i
)
i
is a
Martingale, and by Azuma’s inequality, for any
β
>
0
Pr
(
1
m
∑
i
VIOL
AB
(
i
) + (
1
−
opt
)
>
1
m
∑
i
Z
i
+
β
)
=
Pr
(
1
m
∑
i
W
i
>
β
)
≤
e
−
β
2
m
/2
.
Since the string
ˆ
A
DV
is chosen by the adversary uniformly at random, we may furthe
r condition the equa-
tions above on
ˆ
A
DV
=
A
DV
without affecting their validity. Note that the event CHSH
AB
(
η
)
is equivalent
to
1
m
∑
i
Z
i
≤
(
1
−
opt
) +
η
. Choosing
β
=
√
2 ln
(
2/
ε
)
/
m
, so that
e
−
β
2
m
/2
<
ε
/4
, and using the
assumption
Pr
(
D
|
ˆ
A
DV
=
A
DV
)
≥
ε
to further condition on
D
=
CHSH
AB
(
η
)
∧
GUESS
BE
we get
Pr
(
1
m
∑
i
VIOL
AB
(
i
)
>
η
+
β
∣
∣
D
,
ˆ
A
DV
=
A
DV
)
≤
1/2.
The quantity VIOL
AB
(
i
)
is a nonnegative number which only depends on the state of the
devices in round
i
, itself only depending on the string of inputs and outputs ob
served thus far. Applying Markov’s inequality,
the condition above implies that there is a set
T
⊆
[
m
]
of size
|
T
| ≥
2
m
/3
and a subset
D
′
⊆
D
of size
Pr
(
D
′
|
D
)
≥
1/2
such that for every
i
∈
T
it holds that VIOL
AB
(
i
)
≤
3
(
η
+
β
)
, provided previous inputs
and outputs of the devices were in
D
′
.
Proof of Lemma 10.
Let
D
′
be the set from Claim 15. Consider the state of the devices
A
and
B
in an arbi-
trary round
i
of the protocol. By applying Markov’s inequality to the boun
d (10) from Claim 14, we obtain a
set
|
T
′
| ⊆
[
m
]
of size
|
T
′
| ≥
11
m
/12
and a subset
D
′′
⊆
D
′
satisfying
Pr
(
D
′′
|
D
′
)
≥
1/2
such that, for ev-
ery
i
∈
T
′
, conditioned on
ˆ
A
DV
=
A
DV
and
(
X
<
i
,
Y
<
i
,
A
<
i
,
B
<
i
,
E
<
i
) = (
x
<
i
,
y
<
i
,
a
<
i
,
b
<
i
,
e
<
i
)
∈
D
′′
<
i
,
both bounds
∥
∥
∥
ρ
A
i
X
i
Y
i
−
ρ
A
i
⊗
(
1
6
∑
x
,
y
|
x
,
y
ih
x
,
y
|
)
∥
∥
∥
1
≤
200
√
log
(
1/
ε
)
/
m
and the analogous bound where
A
i
is replaced by
B
i
hold. Letting
T
′′
=
T
′
∩
T
, where
T
is the set from
Claim 15, both the bound above and the condition VIOL
AB
(
i
)
≤
3
η
+
6
√
ln
(
1/
ε
)
/
m
hold simultaneously
in the rounds from
T
′′
(conditioned on previous inputs and outputs being in
D
′′
). Furthermore, note that
whether both conditions are satisfied or not only depends on t
he (post-selected) state of the protocol in round
i
, itself only depending on subsequent choices of inputs and o
utputs in the protocol to the extent that the
condition
ˆ
A
DV
=
A
DV
is satisfied. Hence as long as the advice bits
ˆ
A
DV
that Eve uses to select the mea-
surement on her system have a positive probability of being t
he correct advice bits, given the data generated
up to round
i
−
1
, both bounds must hold verbatim. As a consequence, for any fix
ed
(
x
,
y
,
a
,
b
,
e
)
∈
D
′′
<
i
there exists a string
(
ˆ
x
>
i
,
ˆ
y
>
i
,
ˆ
a
>
i
,
ˆ
b
>
i
)
from which advice bits
ˆ
A
DV
>
i
can be computed such that if Eve
17