arXiv:1909.12497v1 [math.CO] 27 Sep 2019
Edge Expansion and Spectral Gap of Nonnegative Matrices
Jenish C. Mehta
California Institute of Technology
jenishc@gmail.com
Leonard J. Schulman
California Institute of Technology
schulman@caltech.edu
Abstract
The classic graphical Cheeger inequalities state that if $M$ is an $n \times n$ symmetric doubly stochastic matrix, then
$$\frac{1-\lambda_2(M)}{2} \;\le\; \phi(M) \;\le\; \sqrt{2\cdot(1-\lambda_2(M))}$$
where
$$\phi(M) \;=\; \min_{S\subseteq[n],\,|S|\le n/2}\;\frac{1}{|S|}\sum_{i\in S,\,j\notin S} M_{i,j}$$
is the edge expansion of $M$, and $\lambda_2(M)$ is the second largest eigenvalue of $M$. We study the relationship between $\phi(A)$ and the spectral gap $1-\operatorname{Re}\lambda_2(A)$ for any doubly stochastic matrix $A$ (not necessarily symmetric), where $\lambda_2(A)$ is a nontrivial eigenvalue of $A$ with maximum real part. Fiedler showed that the upper bound on $\phi(A)$ is unaffected, i.e., $\phi(A) \le \sqrt{2\cdot(1-\operatorname{Re}\lambda_2(A))}$. With regards to the lower bound on $\phi(A)$, there are known constructions with
$$\phi(A) \;\in\; \Theta\!\left(\frac{1-\operatorname{Re}\lambda_2(A)}{\log n}\right),$$
indicating that at least a mild dependence on $n$ is necessary to lower bound $\phi(A)$.
In our first result, we provide an exponentially better construction of $n \times n$ doubly stochastic matrices $A_n$, for which
$$\phi(A_n) \;\le\; \frac{1-\operatorname{Re}\lambda_2(A_n)}{\sqrt{n}}.$$
In fact, all nontrivial eigenvalues of our matrices are 0, even though the matrices are highly non-expanding. We further show that this bound is in the correct range (up to the exponent of $n$), by showing that for any doubly stochastic matrix $A$,
$$\phi(A) \;\ge\; \frac{1-\operatorname{Re}\lambda_2(A)}{35\cdot n}.$$
As a consequence, unlike the symmetric case, there is a (necessary) loss of a factor of $n^{\alpha}$ for $\frac{1}{2}\le\alpha\le 1$ in lower bounding $\phi$ by the spectral gap in the nonsymmetric setting.
Our second result extends these bounds to general matrices $R$ with nonnegative entries, to obtain a two-sided gapped refinement of the Perron-Frobenius theorem. Recall from the Perron-Frobenius theorem that for such $R$, there is a nonnegative eigenvalue $r$ such that all eigenvalues of $R$ lie within the closed disk of radius $r$ about 0. Further, if $R$ is irreducible, which means $\phi(R) > 0$ (for suitably defined $\phi$), then $r$ is positive and all other eigenvalues lie within the open disk, so (with eigenvalues sorted by real part), $\operatorname{Re}\lambda_2(R) < r$. An extension of Fiedler's result provides an upper bound and our result provides the corresponding lower bound on $\phi(R)$ in terms of $r-\operatorname{Re}\lambda_2(R)$, obtaining a two-sided quantitative version of the Perron-Frobenius theorem.
1 Introduction
1.1 Motivation and main result
We study the relationship between edge expansion and second eigenvalue of nonnegative matrices. We restrict to doubly stochastic matrices for exposition in the introduction, since the definitions are simpler and it captures most of the key ideas. The extension to general nonnegative matrices is treated in Section 1.2. Let $A$ be an $n \times n$ doubly stochastic matrix, equivalently interpreted as a bi-regular weighted digraph. The edge expansion of $A$, denoted as $\phi(A)$, is defined as
$$\phi(A) \;=\; \min_{S\subseteq[n],\,|S|\le n/2}\;\frac{\sum_{i\in S,\,j\notin S} A_{i,j}}{|S|}.$$
The fact that $A$ is doubly stochastic implies that $\phi(A) = \phi(A^T)$. $\phi$ is a measure of how much the graph would have to be modified for it to lose strong-connectedness; it also lower bounds how frequently, in steady state, the associated Markov chain switches between the blocks of any bi-partition; thus, it fundamentally expresses the extent to which the graph is connected.
The connection between edge expansion and the second eigenvalue has been of central importance in the case of symmetric doubly stochastic matrices $M$ (equivalently, reversible Markov chains with uniform stationary distribution). For such $M$, let the eigenvalues be $1 = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge -1$. The Cheeger inequalities give two-sided bounds between the edge expansion $\phi(M)$ and the spectral gap $1-\lambda_2(M)$.
Theorem 1. [Dod84, AM85, Alo86] (Cheeger's inequalities for symmetric doubly stochastic matrices) Let $M$ be a symmetric doubly stochastic matrix, then
$$\frac{1-\lambda_2(M)}{2} \;\le\; \phi(M) \;\le\; \sqrt{2\cdot(1-\lambda_2(M))}.$$
This is closely related to earlier versions for Riemannian manifolds [Che70, Bus82]. Notably, the inequalities in Theorem 1 do not depend on $n$. Further, they are tight up to constants – the upper bound on $\phi$ is achieved by the cycle and the lower bound by the hypercube.
The key question we address in this work is whether or to what extent the Cheeger inequalities survive for nonsymmetric doubly stochastic matrices (the question was already asked, e.g., in [MT06]).
Let $A$ be a doubly stochastic matrix, not necessarily symmetric. The eigenvalues of $A$ lie in the unit disk around the origin in the complex plane, with an eigenvalue 1 called the trivial or stochastic eigenvalue corresponding to the eigenvector $\mathbf{1}$ (the all 1's vector), and all other eigenvalues considered nontrivial.
Let the eigenvalues of $A$ be ordered so that $1 = \lambda_1 \ge \operatorname{Re}\lambda_2 \ge \dots \ge \operatorname{Re}\lambda_n \ge -1$. The spectral gap will be defined as $1-\operatorname{Re}\lambda_2(A)$. There are three motivations for this definition: first, the continuous-time Markov chain based on $A$ is $\exp(t\cdot(A-I))$, and this spectral gap specifies the largest norm of any of the latter's nontrivial eigenvalues; second, $\operatorname{Re}\lambda_2(A) = 1$ if and only if $\phi = 0$ (i.e., if the matrix is reducible); finally, this gap lower bounds the distance (in the complex plane) between the trivial and any nontrivial eigenvalue.
It was noted by Fiedler [Fie95] that the upper bound on $\phi$ in Cheeger's inequality (Theorem 1) carries over easily to general doubly stochastic matrices, because for $M = (A+A^T)/2$, $\phi(A) = \phi(M)$ and $\operatorname{Re}\lambda_2(A) \le \lambda_2(M)$. (In fact, Fiedler made a slightly different conclusion with these observations, but they immediately give the upper bound on $\phi$; see Appendix C for an extension and proof.)
Lemma 2. (Fiedler [Fie95]) Let $A$ be any $n \times n$ doubly stochastic matrix, then
$$\phi(A) \;\le\; \sqrt{2\cdot(1-\operatorname{Re}\lambda_2(A))}.$$
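Lemma 2 is easy to check numerically. The following is a minimal sketch (not from the paper; numpy is assumed): it computes $\phi$ by brute force over cuts and compares it against Fiedler's bound on a directed 5-cycle.

```python
import itertools
import numpy as np

def edge_expansion(A):
    """Brute-force phi(A) for a doubly stochastic A (exponential in n)."""
    n = A.shape[0]
    best = np.inf
    for size in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), size):
            Sc = [j for j in range(n) if j not in S]
            best = min(best, A[np.ix_(list(S), Sc)].sum() / size)
    return best

def re_lambda2(A):
    """Largest real part among nontrivial eigenvalues (drop the one closest to 1)."""
    ev = sorted(np.linalg.eigvals(A), key=lambda z: abs(z - 1))
    return max(z.real for z in ev[1:])

# A small nonsymmetric doubly stochastic example: the directed 5-cycle.
n = 5
A = np.roll(np.eye(n), 1, axis=1)
phi, gap = edge_expansion(A), 1 - re_lambda2(A)
assert phi <= np.sqrt(2 * gap) + 1e-9   # Fiedler's upper bound (Lemma 2)
```

For the directed 5-cycle, $\phi = 1/2$ (achieved by a contiguous pair of vertices) while $\sqrt{2(1-\operatorname{Re}\lambda_2)} \approx 1.18$, so the bound holds with room to spare.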
It remained to investigate the other direction, i.e., the lower bound on $\phi(A)$ in terms of the spectral gap $1-\operatorname{Re}\lambda_2(A)$. Towards this end, we define the function

Definition 3.
$$\Gamma(n) \;=\; \min\left\{ \frac{\phi(A)}{1-\operatorname{Re}\lambda_2(A)} \;:\; A \text{ is an } n\times n \text{ doubly stochastic matrix} \right\}.$$
For symmetric doubly stochastic matrices this minimum is no less than $1/2$. However, for doubly stochastic matrices $A$ that are not necessarily symmetric, there are known (but perhaps not widely known) examples demonstrating that it is impossible to have a function $\Gamma$ entirely independent of $n$. These examples, discussed in Section 3.1, show that
$$\Gamma(n) \;\lesssim\; \frac{1}{\log n}. \qquad (1)$$
One reason that $\phi$ and $1-\operatorname{Re}\lambda_2$ are important is their connection to the mixing time $\tau$ – the number of steps after which a random walk starting at any vertex converges to the uniform distribution over the vertices. For the case of symmetric doubly stochastic matrices – or in general reversible Markov chains – it is simple to show that $\tau \le O\!\left(\frac{\log n}{1-\lambda_2}\right)$, and by Cheeger's inequality (Theorem 1), it further gives $\tau \le O\!\left(\frac{\log n}{\phi^2}\right)$, where $n$ is the number of vertices. For the case of general doubly stochastic matrices – or in general not-necessarily-reversible Markov chains – it still holds that $\tau \le O\!\left(\frac{\log n}{\phi^2}\right)$ by a result of Mihail ([Mih89], and see [Fil91]). Depending on the situation, either $\operatorname{Re}\lambda_2$ or $\phi$ may be easier to estimate.
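As an illustration of the reversible bound $\tau \le O\!\left(\frac{\log n}{1-\lambda_2}\right)$, the following sketch (not from the paper; numpy assumed) runs a lazy symmetric walk on a cycle for a small multiple of $\log n/(1-\lambda_2)$ steps and checks convergence to uniform.

```python
import numpy as np

# Lazy symmetric walk on an undirected n-cycle: M = I/2 + (C + C^T)/4.
n = 16
C = np.roll(np.eye(n), 1, axis=1)
M = 0.5 * np.eye(n) + 0.25 * (C + C.T)
lam2 = np.sort(np.linalg.eigvalsh(M))[-2]

# Take a few multiples of log(n)/(1 - lam2) steps and check closeness to uniform.
t = int(np.ceil(np.log(n) / (1 - lam2))) * 4
dist = np.linalg.matrix_power(M, t)[0]       # distribution of a walk started at vertex 0
tv = 0.5 * np.abs(dist - 1.0 / n).sum()      # total variation distance to uniform
assert tv < 0.25
```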
Most often, one is interested in concisely-specified chains on exponential-size sets, i.e., the number of vertices is $n$ but the complexity parameter is $\log n$. In this case, either $\phi$ or $\lambda_2$ (for the reversible case) can be used to estimate $\tau$. However, if the matrix or chain is given explicitly, then one reason the Cheeger inequalities are useful is because it is simpler to estimate $\tau$ using $\lambda_2$, which is computable in P, while computing $\phi$ is NP-hard [GJS74].
From the point of view of applications to Markov Chain Monte Carlo (MCMC) algorithms, a $\log n$ loss in the relation between $\phi$ and $1-\operatorname{Re}\lambda_2$, as implied by (1), is not a large factor. For MCMC algorithms, since "$n$" is the size of the state space being sampled or counted and the underlying complexity parameter is $\log n$, if it were true that $\Gamma(n) \ge \frac{1}{\log^c n}$ for some constant $c$, then the loss in mixing time estimates would be polynomial in the complexity parameter, and thus, the quantity $1-\operatorname{Re}\lambda_2$ could still be used to obtain a reasonable estimate of the mixing time even in the case of nonsymmetric doubly stochastic matrices.
However, the truth is much different. Our main result is that $\Gamma(n)$ does not scale as $\frac{1}{\log^c n}$, but is exponentially smaller.
Theorem 4. (Bounds on $\Gamma$)
$$\frac{1}{35\cdot n} \;\le\; \Gamma(n) \;\le\; \frac{1}{\sqrt{n}}.$$
We give an explicit construction of doubly stochastic matrices $A_n$ for the upper bound on $\Gamma$. This construction of highly nonexpanding doubly stochastic matrices has, in addition, the surprising property that every nontrivial eigenvalue is 0. Thus, for non-reversible Markov chains, the connection between $\phi$ and $\operatorname{Re}\lambda_2$ breaks down substantially, in that the upper bound on mixing time obtained by lower bounding $\phi$ by the spectral gap $1-\operatorname{Re}\lambda_2$ can be exponentially weaker (when the complexity parameter is $\log n$) than the actual mixing time, whereas for reversible chains there is at worst a quadratic loss.
This theorem has a very natural extension to general nonnegative matrices, as we next describe.
1.2 General nonnegative matrices and a two-sided quantitative refinement of the Perron-Frobenius theorem
We extend our results to general nonnegative matrices $R$. By the Perron-Frobenius theorem (see Theorem 8), since $R$ is nonnegative, it has a nonnegative eigenvalue $r$ (called the PF eigenvalue) that is also largest in magnitude amongst all eigenvalues, and the corresponding left and right eigenvectors $u$ and $v$ have all nonnegative entries. Further, if $R$ is irreducible, i.e., the underlying weighted digraph on edges with positive weight is strongly connected (for every $(i,j)$ there is a $k$ such that $R^k(i,j) > 0$), then $r$ is a simple eigenvalue, and $u$ and $v$ have all positive entries. We henceforth assume that nonzero nonnegative $R$ has been scaled (to $\frac{1}{r}R$) so that $r = 1$. Thus we can again write the eigenvalues of $R$ as $1 = \lambda_1(R) \ge \operatorname{Re}\lambda_2(R) \ge \dots \ge \operatorname{Re}\lambda_n(R) \ge -1$. Scaling changes the spectrum but not the edge expansion (which we define next), so this canonical scaling is necessary before the two quantities can be compared.
Defining the edge expansion of $R$ is slightly delicate, so we explain the reasoning after Definition 11, and present only the final definition (see Definition 11) here. Consider $u$ and $v$ as left and right eigenvectors for eigenvalue 1 of $R$. If every such pair $(u,v)$ for $R$ has some $i$ such that $u(i) = 0$ or $v(i) = 0$, then define $\phi(R) = 0$. Otherwise, let $u$ and $v$ be some positive eigenvectors for eigenvalue 1 of $R$, normalized so that $\langle u, v\rangle = 1$, and define the edge expansion of $R$ as
$$\phi(R) \;=\; \min_{S\subseteq[n],\ \sum_{i\in S}u_i\cdot v_i \le \frac{1}{2}}\;\frac{\sum_{i\in S,\,j\in\overline{S}} R_{i,j}\cdot u_i\cdot v_j}{\sum_{i\in S} u_i\cdot v_i}.$$
Given this definition of $\phi(R)$, we show the following.
Theorem 5. Let $R$ be a nonnegative matrix with PF eigenvalue 1, and let $u$ and $v$ be any corresponding left and right eigenvectors. Let $\kappa = \min_i u_i\cdot v_i$, and if $\kappa > 0$, let $u$ and $v$ be normalized so that $\langle u, v\rangle = 1$. Then
$$\frac{1}{30}\cdot\frac{1-\operatorname{Re}\lambda_2(R)}{n+\ln\frac{1}{\kappa}} \;\le\; \phi(R) \;\le\; \sqrt{2\cdot(1-\operatorname{Re}\lambda_2(R))}.$$
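Theorem 5 can be sanity-checked numerically. The sketch below (an illustration, not the paper's proof; numpy assumed) scales a random positive matrix so its PF eigenvalue is 1, computes $\phi(R)$ by brute force according to Definition 11, and verifies both inequalities.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 5
R = rng.random((n, n)) + 0.1                  # strictly positive, hence irreducible

evals, rvecs = np.linalg.eig(R)
k = int(np.argmax(evals.real))
R = R / evals[k].real                         # scale so the PF eigenvalue is 1
v = np.abs(rvecs[:, k].real)                  # positive right eigenvector
wl, lvecs = np.linalg.eig(R.T)
u = np.abs(lvecs[:, int(np.argmax(wl.real))].real)
u = u / (u @ v)                               # normalize so <u, v> = 1

def phi_general(R, u, v):
    # Definition 11: minimize over cuts S with sum_{i in S} u_i v_i <= 1/2.
    n = R.shape[0]
    best = np.inf
    for size in range(1, n):
        for S in itertools.combinations(range(n), size):
            S = list(S)
            Sc = [j for j in range(n) if j not in S]
            mass = (u[S] * v[S]).sum()
            if mass <= 0.5:
                best = min(best, (R[np.ix_(S, Sc)] * np.outer(u[S], v[Sc])).sum() / mass)
    return best

kappa = (u * v).min()
gap = 1 - np.sort(np.linalg.eigvals(R).real)[-2]
p = phi_general(R, u, v)
assert gap / (30 * (n + np.log(1 / kappa))) <= p <= np.sqrt(2 * gap) + 1e-9
```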
The upper bound in Theorem 4 is a straightforward extension of Fiedler's bound (Lemma 2) based on the above-mentioned definition of $\phi$ (Definition 11). Also note that the lower bound in Theorem 4 can be obtained by setting $\kappa = \frac{1}{n}$ in Theorem 5. The upper bound is proven in Appendix C and the lower bound is shown in Section 4.
Since Theorem 5 gives a two-sided relation between the second eigenvalue of nonnegative matrices and their edge expansion, it gives a two-sided quantitative refinement of the Perron-Frobenius theorem. Although the Perron-Frobenius theorem implies that the nontrivial eigenvalues of an irreducible nonnegative matrix $R$ with PF eigenvalue 1 have real part strictly less than 1, it does not give any concrete separation. Further, it also does not provide a qualitative (or quantitative) implication in the other direction – whether a nonnegative matrix $R$ with all nontrivial eigenvalues having real part strictly less than 1 must be irreducible. Theorem 5 remedies this by giving a lower and an upper bound on the spectral gap in terms of $\phi$, a quantitative measure of the irreducibility of $R$. We are not aware of any previous result of this form.
1.3 Mixing time
The third quantity we study is the mixing time of general nonnegative matrices, and we relate it to their singular values, edge expansion, and spectral gap. This helps us obtain new bounds on mixing time, and also obtain elementary proofs for known results. These results are treated in detail in the second part of the paper, in Section 5.
1.4 Perspectives
1.4.1 Matrix perturbations
Let $A$ be a nonnegative matrix with PF eigenvalue 1 and corresponding positive left and right eigenvector $w$ with $\langle w, w\rangle = 1$, and let $\kappa = \min_i w_i^2$. Given such $A$, it is certainly easier to calculate its spectrum than its edge expansion. However, in other cases, e.g., if the matrix is implicit in a nicely structured randomized algorithm (as in the canonical paths method [SJ89]), the edge expansion may actually be easier to bound. From this point of view, a lower bound on $\phi(A)$ in terms of the spectral gap is an eigenvalue perturbation bound.
Specifically, one might write a nonnegative matrix $A$ with small edge expansion $\phi(A)$ as a perturbation of another nonnegative matrix $A_0$, i.e., $A = A_0 + \delta\cdot B$, where $A_0$ has disconnected components $S, S^c$ (for $S$ achieving $\phi(A)$), and $B$ is a matrix such that $\|B\|_2 \le 1$, $Bw = 0$ and $B^Tw = 0$. Due to the conditions on $A$ and $B$, $A_0$ has PF eigenvalue 1 with left and right eigenvector $w$, and since $BD_w\mathbf{1} = 0$, writing $\mathbf{1}_{\overline{S}} = \mathbf{1} - \mathbf{1}_S$, we have
$$\left|\frac{\langle \mathbf{1}_{\overline{S}},\, D_wBD_w\mathbf{1}_S\rangle}{\langle \mathbf{1}_S,\, D_wD_w\mathbf{1}_S\rangle}\right| \;=\; \left|\frac{\langle D_w\mathbf{1}_S,\, BD_w\mathbf{1}_S\rangle}{\langle D_w\mathbf{1}_S,\, D_w\mathbf{1}_S\rangle}\right| \;\le\; \|B\|_2 \;\le\; 1,$$
and it follows then that $\phi(A_0) - \delta \le \phi(A) \le \phi(A_0) + \delta$; since in this case $\phi(A_0) = 0$, we get $\phi(A) \le \delta$.
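The perturbation view can be made concrete on a toy example (hypothetical, not from the paper; numpy assumed): take $A_0$ to be two disconnected complete components and $B = J - A_0$, which satisfies $B\mathbf{1} = B^T\mathbf{1} = 0$ and $\|B\|_2 = 1$.

```python
import numpy as np

# A0 is doubly stochastic but disconnected; B has zero row and column sums,
# so Bw = 0 and B^T w = 0 for w = 1/sqrt(n), and ||B||_2 = 1.
n = 4
J2 = np.full((2, 2), 0.5)
A0 = np.block([[J2, np.zeros((2, 2))], [np.zeros((2, 2)), J2]])  # two components
J4 = np.full((4, 4), 0.25)
B = J4 - A0
delta = 0.1
A = A0 + delta * B               # still doubly stochastic for small delta

S = [0, 1]                       # the cut disconnected in A0
phi_S = A[np.ix_(S, [2, 3])].sum() / len(S)
assert abs(phi_S - delta / 2) < 1e-12   # so phi(A) <= phi(A0) + delta = delta
```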
Edge expansion is therefore stable with respect to perturbation by $B$. What about the spectral gap? $A_0$ has (at least) a double eigenvalue at 1, and $A$ retains a simple eigenvalue at 1, so it is natural to try to apply eigenvalue stability results, specifically Bauer-Fike ([BF60], [Bha97] §VIII), to obtain an upper bound on $|1-\lambda_2(A)|$ (and therefore also on $1-\operatorname{Re}\lambda_2(A)$). However, Bauer-Fike requires $A_0$ to be diagonalizable, and the quality of the bound depends upon the condition number of the diagonalizing change of basis$^1$. There are extensions of Bauer-Fike which do not require diagonalizability, but they deteriorate exponentially in the size of the Jordan block, and the bound still depends on the condition number of the (now Jordan-izing) change of basis ([Saa11], Corollary 3.2). Since there is no a priori (i.e., function of $n$) bound on these condition numbers, these tools unfortunately do not imply any, even weak, result analogous to Theorem 4.
In summary, the lower bound in Theorem 4 should be viewed as a new eigenvalue perturbation bound:
$$1 - \operatorname{Re}\lambda_2(A) \;\le\; 30\cdot\delta\cdot\left(n + \ln\left(\frac{1}{\kappa}\right)\right).$$
A novel aspect of this bound, in comparison with the prior literature on eigenvalue perturbations, is that it does not depend on the condition number (or even the diagonalizability) of $A$ or of $A_0$.
$^1$ The condition number is $\inf\left(\|B\|\cdot\|B^{-1}\|\right)$ over $B$ such that $BAB^{-1}$ is diagonal, with $\|\cdot\|$ being the operator norm.
1.4.2 Relating eigenvalues of a nonnegative matrix and its additive symmetrization
Let $A$ be any nonnegative matrix with PF eigenvalue 1, and corresponding positive left and right eigenvector $w$. Our result helps to give the following bounds between the second eigenvalue of a nonnegative matrix $A$ and that of its additive symmetrization $\frac{1}{2}(A+A^T)$.
Lemma 6. Let $A$ be any nonnegative matrix with PF eigenvalue 1 and corresponding positive left and right eigenvector $w$, let $\kappa = \min_i w_i^2$ and $M = \frac{1}{2}(A+A^T)$. Then
$$\frac{1}{1800}\left(\frac{1-\operatorname{Re}\lambda_2(A)}{n+\ln\frac{1}{\kappa}}\right)^2 \;\le\; 1-\lambda_2(M) \;\le\; 1-\operatorname{Re}\lambda_2(A).$$
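A quick numerical check of Lemma 6 (a sketch, not from the paper; numpy assumed), on a doubly stochastic mix of two cyclic shifts, for which $w = \mathbf{1}/\sqrt{n}$ and $\kappa = 1/n$:

```python
import numpy as np

n = 5
C = np.roll(np.eye(n), 1, axis=1)            # directed n-cycle
A = 0.5 * C + 0.5 * (C @ C)                  # nonsymmetric, doubly stochastic
M = 0.5 * (A + A.T)                          # additive symmetrization

gap_A = 1 - np.sort(np.linalg.eigvals(A).real)[-2]
gap_M = 1 - np.sort(np.linalg.eigvalsh(M))[-2]
kappa = 1.0 / n

lower = (gap_A / (n + np.log(1 / kappa))) ** 2 / 1800
assert lower <= gap_M <= gap_A + 1e-9        # the two-sided bound of Lemma 6
```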
The bounds immediately follow from Theorem 5 and the fact that $\phi$ is unchanged by additive symmetrization. We remark that any improved lower bound on $1-\lambda_2(M)$ in terms of $1-\operatorname{Re}\lambda_2(A)$ will help to improve the lower bound in Theorem 5. As a consequence of Lemma 6, any bound based on the second eigenvalue of symmetric nonnegative matrices can be applied, with dimension-dependent loss, to nonnegative matrices that have identical left and right eigenvector for the PF eigenvalue.
An example application of Lemma 6 is the following. For some doubly stochastic $A$ (not necessarily symmetric), consider the continuous-time Markov chain associated with it, $\exp(t\cdot(A-I))$. It is well-known (for instance, [DSC96]) that for any standard basis vector $x_i$,
$$\left\|\exp(t\cdot(A-I))\,x_i - \frac{\mathbf{1}}{n}\right\|_2^2 \;\le\; \left(1-\frac{1}{n}\right)\cdot\exp\left(-2\cdot(1-\lambda_2(M))\cdot t\right).$$
Thus, using Lemma 6, we immediately get the bound in terms of the second eigenvalue of $A$ itself (instead of its additive symmetrization),
$$\left\|\exp(t\cdot(A-I))\,x_i - \frac{\mathbf{1}}{n}\right\|_2^2 \;\le\; \left(1-\frac{1}{n}\right)\cdot\exp\left(-\frac{1}{1800}\left(\frac{1-\operatorname{Re}\lambda_2(A)}{n+\ln\frac{1}{\kappa}}\right)^2\cdot t\right).$$
1.4.3 The role of singular values
Although in the symmetric case singular values are simply the absolute values of the eigenvalues, the two sets can be much less related in the nonsymmetric case. It is not difficult to show the following (see Appendix A).
Lemma 7. Let $A$ be a nonnegative matrix with PF eigenvalue 1, and let $w$ be the positive vector such that $Aw = w$ and $A^Tw = w$. Then
$$\frac{1-\sigma_2(A)}{2} \;\le\; \phi(A),$$
where $\sigma_2(A)$ is the second largest singular value of $A$.

Proof. The proof is given in Appendix A.
Despite appearances, this tells us little about edge expansion. To see this, consider the directed cycle on $n$ vertices, for which every singular value (and in particular $\sigma_2$) is 1, so Lemma 7 gives a lower bound of 0 for $\phi$, although $\phi = 2/n$. A meaningful lower bound should be 0 if and only if the graph is disconnected, i.e., it should be continuous in $\phi$. An even more striking example is that of de Bruijn graphs (described in Section 3.1), for which half the singular values are 1, although $\phi = \Theta(1/\log n)$.
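The directed-cycle example can be verified directly (a sketch, numpy assumed):

```python
import numpy as np

# The directed n-cycle is a permutation matrix, so every singular value is 1
# and Lemma 7's lower bound (1 - sigma_2)/2 degenerates to 0, while phi = 2/n.
n = 8
P = np.roll(np.eye(n), 1, axis=1)
sv = np.linalg.svd(P, compute_uv=False)
assert np.allclose(sv, 1.0)

S = list(range(n // 2))                     # a contiguous half of the cycle
phi_S = P[np.ix_(S, list(range(n // 2, n)))].sum() / len(S)
assert abs(phi_S - 2.0 / n) < 1e-12         # only one edge leaves S
```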
Eigenvalues, on the other hand, are more informative, since a nonnegative matrix with PF eigenvalue 1 has a multiple eigenvalue at 1 if and only if, as a weighted graph, it is disconnected.
Despite these observations, singular values can be useful to infer information about mixing time, and can be used to recover all known upper bounds on mixing time using $\phi$, as discussed in Section 5 and Lemma 24.
Outline: We state the preliminary definitions, theorems, and notations in Section 2. We give the construction for the upper bound on $\Gamma$ in Theorem 4 in Section 3; the lower bound on $\Gamma$ will follow from the general lower bound on $\phi$ in Theorem 5. We show the upper bound on $\phi$ in Theorem 5 in Appendix C, and show the lower bound on $\phi$ in Section 4. We relate mixing time to singular values, edge expansion, and the spectral gap in Section 5. We defer all proofs to the Appendix.
2 Preliminaries
We consider doubly stochastic matrices $A \in \mathbb{R}^{n\times n}$, which are nonnegative matrices with the entries in every row and column summing to 1. We also consider general nonnegative matrices $R$, and say that $R$ is strongly connected or irreducible if there is a path from $s$ to $t$ for every pair $(s,t)$ of vertices in the underlying digraph on edges with positive weight, i.e., for every $(s,t)$ there exists $k > 0$ such that $R^k(s,t) > 0$. We say $R$ is weakly connected if there is a pair of vertices $(s,t)$ such that there is a path from $s$ to $t$ but no path from $t$ to $s$ in the underlying digraph (on edges with positive weight). We restate the Perron-Frobenius theorem for convenience.
Theorem 8. (Perron-Frobenius theorem [Per07, Fro12]) Let $R \in \mathbb{R}^{n\times n}$ be a nonnegative matrix. Then the following hold for $R$.
1. $R$ has some nonnegative eigenvalue $r$, such that all other eigenvalues have magnitude at most $r$, and $R$ has nonnegative left and right eigenvectors $u$ and $v$ for $r$.
2. If $R$ has some positive left and right eigenvectors $u$ and $v$ for some eigenvalue $\lambda$, then $\lambda = r$.
3. If $R$ is irreducible, then $r$ is positive and simple (unique), $u$ and $v$ are positive and unique, and all other eigenvalues have real part strictly less than $r$.
We denote the all 1's vector by $\mathbf{1}$, and note that for a doubly stochastic matrix $A$, $\mathbf{1}$ is both the left and right eigenvector with eigenvalue 1. We say that 1 is the trivial (or stochastic) eigenvalue, and $\mathbf{1}$ is the trivial (or stochastic) eigenvector, of $A$. All other eigenvalues of $A$ (corresponding to eigenvectors that are not trivial) will be called nontrivial (or nonstochastic) eigenvalues of $A$. Similarly, by Perron-Frobenius (Theorem 8, part 1), a nonnegative matrix $R$ will have a nonnegative eigenvalue $r$ such that all eigenvalues have magnitude at most $r$; it will be called the trivial or PF eigenvalue of $R$, and all other eigenvalues of $R$ will be called nontrivial. The left and right eigenvectors corresponding to $r$ will be called the trivial or PF left eigenvector and trivial or PF right eigenvector. This leads us to the following definition.
Definition 9. (Second eigenvalue) If $A$ is a doubly stochastic matrix, then $\lambda_2(A)$ is the nontrivial eigenvalue of $A$ with the maximum real part, and $\lambda_m(A)$ is the nontrivial eigenvalue that is largest in magnitude. Similarly, if $R$ is any general nonnegative matrix with PF eigenvalue 1, then $\lambda_2(R)$ is the nontrivial eigenvalue with the maximum real part, i.e., $1 = \lambda_1(R) \ge \operatorname{Re}\lambda_2(R) \ge \dots \ge \operatorname{Re}\lambda_n(R) \ge -1$, and $\lambda_m(R)$ is the nontrivial eigenvalue that is largest in magnitude.
We will also consider singular values of nonnegative matrices $A$ with identical positive left and right eigenvector $w$ for PF eigenvalue 1, and denote them as $1 = \sigma_1(A) \ge \sigma_2(A) \ge \dots \ge \sigma_n(A) \ge 0$ (see Lemma 20 for a proof that $\sigma_1(A) = 1$). We denote the $(i,j)$'th entry of $M \in \mathbb{C}^{n\times n}$ by $M(i,j)$ or $M_{i,j}$ depending on the importance of indices in context, denote the conjugate-transpose of $M$ as $M^*$ and the transpose of $M$ as $M^T$. Any $M \in \mathbb{C}^{n\times n}$ has a Schur decomposition (see, e.g., [Lax07]) $M = UTU^*$, where $T$ is an upper triangular matrix whose diagonal entries are the eigenvalues of $M$, and $U$ is a unitary matrix. When we write "vector" we mean by default a column vector. For a vector $v$, we write $v(i)$ or $v_i$ to denote its $i$'th entry. For any two vectors $x, y \in \mathbb{C}^n$, we use the standard inner product $\langle x, y\rangle = \sum_{i=1}^n \overline{x_i}\cdot y_i$, defining the norm $\|x\|_2 = \sqrt{\langle x, x\rangle}$. We write $u \perp v$ to indicate that $\langle u, v\rangle = 0$. Note that $\langle x, My\rangle = \langle M^*x, y\rangle$. We denote the operator norm of $M$ by $\|M\|_2 = \max_{u:\|u\|_2=1}\|Mu\|_2$, and recall that the operator norm is at most the Frobenius norm, i.e., $\|M\|_2 \le \sqrt{\sum_{i,j}|M_{i,j}|^2}$. We write $D_u$ for the diagonal matrix whose diagonal contains the vector $u$. Recall the Courant-Fischer variational characterization of eigenvalues for symmetric real matrices, applied to the second eigenvalue:
$$\lambda_2(M) \;=\; \max_{u \perp v_1} \frac{\langle u, Mu\rangle}{\langle u, u\rangle},$$
where $v_1$ is the eigenvector for the largest eigenvalue of $M$. We will use the symbol $J$ for the all 1's matrix divided by $n$, i.e., $J = \frac{1}{n}\mathbf{1}\cdot\mathbf{1}^T$.
We say that any subset $S \subseteq [n]$ is a cut, denote its complement by $\overline{S}$, and denote the characteristic vector of a cut as $\mathbf{1}_S$, where $\mathbf{1}_S(i) = 1$ if $i \in S$ and 0 otherwise.
Definition 10. (Edge expansion of doubly stochastic matrices) For a doubly stochastic matrix $A$, the edge expansion of the cut $S$ is defined as
$$\phi_S(A) \;:=\; \frac{\langle \mathbf{1}_S,\, A\mathbf{1}_{\overline{S}}\rangle}{\min\{\langle \mathbf{1}_S, A\mathbf{1}\rangle,\ \langle \mathbf{1}_{\overline{S}}, A\mathbf{1}\rangle\}}$$
and the edge expansion of $A$ is defined as
$$\phi(A) \;=\; \min_{S\subseteq[n]} \phi_S(A) \;=\; \min_{S\subseteq[n]} \frac{\langle \mathbf{1}_S,\, A\mathbf{1}_{\overline{S}}\rangle}{\min\{\langle \mathbf{1}_S, A\mathbf{1}\rangle,\ \langle \mathbf{1}_{\overline{S}}, A\mathbf{1}\rangle\}} \;=\; \min_{S,\,|S|\le n/2} \frac{\sum_{i\in S,\,j\in\overline{S}} A_{i,j}}{|S|}.$$
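Definition 10 can be implemented by brute force for small $n$ (a sketch, not part of the paper; numpy assumed):

```python
import itertools
import numpy as np

def phi(A):
    """Edge expansion of a doubly stochastic A by brute force (Definition 10)."""
    n = A.shape[0]
    best = np.inf
    for size in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), size):
            Sc = [j for j in range(n) if j not in S]
            best = min(best, A[np.ix_(list(S), Sc)].sum() / size)
    return best

# Sanity checks: the complete walk J has phi = 1/2 for even n,
# while a disconnected block matrix has phi = 0.
n = 4
J = np.full((n, n), 1.0 / n)
assert abs(phi(J) - 0.5) < 1e-12
blocks = np.block([[np.full((2, 2), 0.5), np.zeros((2, 2))],
                   [np.zeros((2, 2)), np.full((2, 2), 0.5)]])
assert phi(blocks) == 0.0
```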
We wish to extend these notions to general nonnegative matrices $R$. Since eigenvalues and singular values of real matrices remain unchanged whether we consider $R$ or $R^T$, the same should hold of a meaningful definition of edge expansion. However, note that Definition 10 has this independence only if the matrix is Eulerian, i.e., $R\mathbf{1} = R^T\mathbf{1}$. Thus, to define edge expansion for general matrices, we transform $R$ using its left and right eigenvectors $u$ and $v$ for eigenvalue 1 to obtain $D_uRD_v$, which is indeed Eulerian, since
$$D_uRD_v\mathbf{1} = D_uRv = D_uv = D_uD_v\mathbf{1} = D_vD_u\mathbf{1} = D_vu = D_vR^Tu = D_vR^TD_u\mathbf{1}.$$
Since $D_uRD_v$ is Eulerian, we can define the edge expansion of $R$ similar to that for doubly stochastic matrices:
Definition 11. (Edge expansion of nonnegative matrices) Let $R \in \mathbb{R}^{n\times n}$ be a nonnegative matrix with PF eigenvalue 1. If there are no positive (i.e., everywhere positive) left and right eigenvectors $u$ and $v$ for eigenvalue 1, then define the edge expansion $\phi(R) = 0$. Else, let $u$ and $v$ be any (see Lemma 12 for justification) positive left and right eigenvectors for eigenvalue 1, normalized so that $\langle u, v\rangle = 1$. The edge expansion of the cut $S$ is defined as
$$\phi_S(R) \;:=\; \frac{\langle \mathbf{1}_S,\, D_uRD_v\mathbf{1}_{\overline{S}}\rangle}{\min\{\langle \mathbf{1}_S, D_uRD_v\mathbf{1}\rangle,\ \langle \mathbf{1}_{\overline{S}}, D_uRD_v\mathbf{1}\rangle\}} \qquad (2)$$
and the edge expansion of $R$ is defined as
$$\phi(R) \;=\; \min_{S\subseteq[n]} \phi_S(R) \;=\; \min_{S\subseteq[n],\ \sum_{i\in S}u_i\cdot v_i \le \frac{1}{2}} \frac{\langle \mathbf{1}_S,\, D_uRD_v\mathbf{1}_{\overline{S}}\rangle}{\langle \mathbf{1}_S,\, D_uD_v\mathbf{1}\rangle} \;=\; \min_{S\subseteq[n],\ \sum_{i\in S}u_i\cdot v_i \le \frac{1}{2}} \frac{\sum_{i\in S,\,j\in\overline{S}} R_{i,j}\cdot u_i\cdot v_j}{\sum_{i\in S} u_i\cdot v_i}.$$
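The following sketch (not from the paper; numpy assumed) implements Definition 11 and checks both the Eulerian property of $D_uRD_v$ and the fact that a diagonal similarity of a doubly stochastic matrix preserves edge expansion under this definition.

```python
import itertools
import numpy as np

def phi_general(R, u, v):
    # Definition 11: minimize over cuts S with sum_{i in S} u_i v_i <= 1/2.
    n = R.shape[0]
    best = np.inf
    for size in range(1, n):
        for S in itertools.combinations(range(n), size):
            S = list(S)
            Sc = [j for j in range(n) if j not in S]
            mass = (u[S] * v[S]).sum()
            if mass <= 0.5:
                best = min(best, (R[np.ix_(S, Sc)] * np.outer(u[S], v[Sc])).sum() / mass)
    return best

# Diagonal similarity of the complete walk J: the spectrum and, per
# Definition 11, the edge expansion are preserved; u = x/n, v = 1/x, <u,v> = 1.
n = 4
A = np.full((n, n), 1.0 / n)
x = np.array([1.0, 2.0, 3.0, 4.0])
R = np.diag(1 / x) @ A @ np.diag(x)
u, v = x / n, 1 / x

E = np.diag(u) @ R @ np.diag(v)
assert np.allclose(E @ np.ones(n), E.T @ np.ones(n))   # D_u R D_v is Eulerian
assert abs(phi_general(R, u, v) - 0.5) < 1e-12         # phi(R) = phi(J) = 1/2
```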
Lemma 12. Let $R \in \mathbb{R}^{n\times n}$ be a nonnegative matrix with PF eigenvalue 1. Then the value of $\phi(R)$ according to Definition 11 is independent of the choice of the specific (left and right) eigenvectors $u$ and $v$ for eigenvalue 1 of $R$.
Proof. Let $G$ be the underlying unweighted directed graph for $R$, where there is an edge $(u,v)$ in $G$ if and only if $R_{u,v} > 0$. We prove the lemma based on the structure of $G$. Let $G$ be maximally partitioned into $k$ weakly connected components.
1. If $G$ has some weakly connected component which does not have eigenvalue 1, then for any pair of left and right eigenvectors $u$ and $v$ for eigenvalue 1, there will be at least one entry $i$ such that $u_i = 0$ or $v_i = 0$. For such matrices $R$, from Definition 11, $\phi(R) = 0$.
2. If all weakly connected components of $G$ have eigenvalue 1, but there is some weakly connected component $S$ that is not strongly connected, then there is no positive pair of eigenvectors $u$ and $v$ for $R$, or even for $S$. Observe that
$$S = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$$
with $B \neq 0$, else $G$ is not maximally partitioned. For the sake of contradiction, let $x$ and $y$ be positive left and right eigenvectors for eigenvalue 1 of $S$. From $Sy = y$ and $S^Tx = x$, we get that $Ay_1 + By_2 = y_1$ and $A^Tx_1 = x_1$, with $x$ and $y$ partitioned into $x_1, x_2$ and $y_1, y_2$ based on the sizes of $A, B, C$. Thus,
$$\langle x_1, y_1\rangle = \langle x_1, Ay_1 + By_2\rangle = \langle A^Tx_1, y_1\rangle + \langle x_1, By_2\rangle = \langle x_1, y_1\rangle + \langle x_1, By_2\rangle,$$
implying $\langle x_1, By_2\rangle = 0$, which is a contradiction since $x_1$ and $y_2$ are positive and there is some entry in $B$ which is not 0. Thus, since every pair of eigenvectors $x$ and $y$ for eigenvalue 1 of $S$ has some entry $i$ with $x_i = 0$ or $y_i = 0$, every pair of (left and right) eigenvectors $u$ and $v$ for $R$ (for eigenvalue 1) has some entry $i$ which is 0, and so by Definition 11, $\phi(R) = 0$.
3. If $G$ consists of one strongly connected component (i.e., $k = 1$), then by Perron-Frobenius (Theorem 8, part 3), there is a unique (up to scaling) pair of positive eigenvectors $u$ and $v$ for eigenvalue 1.
4. The remaining case is that $G$ has $k \ge 2$ strongly connected components, each with eigenvalue 1. By Perron-Frobenius (Theorem 8), there is some pair of positive left and right eigenvectors $u$ and $v$ for eigenvalue 1 (obtained by concatenating the positive left and right eigenvectors for each component individually). Choose the set $S$ to be one of the components (the one for which the denominator of equation (2) is at most $\frac{1}{2}$); then the numerator of equation (2) will be 0, and thus $\phi(R) = 0$ even in this case, corresponding to the existence of a strict subset of vertices with no expansion.

Thus, Definition 11 for $\phi(R)$ does not depend on the specific choice of $u$ and $v$.
The Perron-Frobenius theorem (Theorem 8, part 3) can now be restated in terms of $\phi(R)$ and $\operatorname{Re}\lambda_2(R)$ as follows.
Lemma 13. (Perron-Frobenius, part 3 of Theorem 8, restated) Let $R$ be a nonnegative matrix with PF eigenvalue 1. If $\phi(R) > 0$, then $\operatorname{Re}\lambda_2(R) < 1$.
Further, we obtain the following converse of Lemma 13.
Lemma 14. (Converse of Lemma 13) Let $R$ be a nonnegative matrix with PF eigenvalue 1. If $\operatorname{Re}\lambda_2(R) < 1$, and there exists a pair of positive left and right eigenvectors $u$ and $v$ for eigenvalue 1 of $R$, then $\phi(R) > 0$.
Proof. We show the contrapositive. Let $R$ be as stated and let $\phi(R) = 0$. From Definition 11, if there are no positive eigenvectors $u$ and $v$ for eigenvalue 1, the lemma holds. So assume there are some positive $u$ and $v$ for eigenvalue 1 of $R$. Since $\phi(R) = 0$, there is some set $S$ for which $\phi_S(R) = 0$, or $\sum_{i\in S,\,j\in\overline{S}} R_{i,j}\cdot u_i\cdot v_j = 0$. But since $u_i > 0$ and $v_i > 0$ for each $i$, and since $R$ is nonnegative, it implies that $R_{i,j} = 0$ for each $i \in S, j \in \overline{S}$. Further, since $D_uRD_v$ is Eulerian, i.e., $D_uRD_v\mathbf{1} = D_vR^TD_u\mathbf{1}$, it implies that $\sum_{i\in\overline{S},\,j\in S} R_{i,j}\cdot u_i\cdot v_j = 0$, further implying that $R_{i,j} = 0$ for each $i \in \overline{S}, j \in S$. As a consequence, $v$ can be rewritten as two vectors $v_S$ and $v_{\overline{S}}$, where $v_S(i) = v(i)$ if $i \in S$ and $v_S(i) = 0$ otherwise, and similarly $v_{\overline{S}}$. Similarly, split $u$ into $u_S$ and $u_{\overline{S}}$. Note that $v_S$ and $v_{\overline{S}}$ are linearly independent (in fact, orthogonal), and both are right eigenvectors for eigenvalue 1 (similarly $u_S$ and $u_{\overline{S}}$ as left eigenvectors). Thus, eigenvalue 1 for $R$ has multiplicity at least 2, and so $\lambda_2(R) = 1$, as required.
The upper and lower bounds on $\phi(R)$ in Theorem 5 are quantitative versions of Lemmas 13 and 14 respectively.
We note that Cheeger's inequalities hold not only for any symmetric doubly stochastic matrix, but also for any nonnegative matrix $R$ which satisfies detailed balance. We say that a nonnegative matrix $R$ with positive left and right eigenvectors $u$ and $v$ for PF eigenvalue 1 satisfies detailed balance if $D_uRD_v$ is symmetric, which generalizes the usual definition of detailed balance (or reversibility) for stochastic matrices. We first note that if $R$ satisfies the condition of detailed balance, then $R$ has all real eigenvalues. To see this, let $W = D_u^{\frac{1}{2}}D_v^{-\frac{1}{2}}$, where the inverses are well-defined since we assume $u$ and $v$ are positive (else $\phi = 0$ by definition), and let $A = WRW^{-1}$, so that $A$ has the same eigenvalues as $R$. For $w = D_u^{\frac{1}{2}}D_v^{\frac{1}{2}}\mathbf{1}$, which is both the left and right eigenvector of $A$ for eigenvalue 1, the detailed balance condition translates to $D_wAD_w = D_wA^TD_w$, which implies $A = A^T$, which further implies that all eigenvalues of $A$ (and $R$) are real. We can thus state Cheeger inequalities for nonnegative matrices satisfying detailed balance.
Theorem 15. (Cheeger's inequalities for nonnegative matrices satisfying detailed balance) Let $R$ be a nonnegative matrix with PF eigenvalue 1 and positive left and right eigenvectors $u$ and $v$ (else $\phi(R) = 0$ by definition), and let $R$ satisfy the condition of detailed balance, i.e., $D_uRD_v = D_vR^TD_u$. Then
$$\frac{1-\lambda_2(R)}{2} \;\le\; \phi(R) \;\le\; \sqrt{2\cdot(1-\lambda_2(R))}.$$
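Theorem 15 can be checked numerically on a matrix built to satisfy detailed balance (a sketch, not from the paper; numpy assumed): conjugating a symmetric doubly stochastic $M$ by a positive diagonal gives a nonsymmetric $R$ with $D_uRD_v$ symmetric.

```python
import itertools
import numpy as np

# R = D^{-1} M D for symmetric doubly stochastic M satisfies detailed balance:
# its spectrum is that of M (hence real), and phi(R) = phi(M) under Definition 11.
n = 6
C = np.roll(np.eye(n), 1, axis=1)
M = 0.5 * (C + C.T)                         # symmetric walk on the n-cycle
x = np.linspace(1.0, 2.0, n)                # any positive diagonal
R = np.diag(1 / x) @ M @ np.diag(x)

ev = np.linalg.eigvals(R)
assert np.allclose(ev.imag, 0.0)            # detailed balance => real spectrum

lam2 = np.sort(ev.real)[-2]
phi = min(M[np.ix_(S, [j for j in range(n) if j not in S])].sum() / len(S)
          for size in range(1, n // 2 + 1)
          for S in itertools.combinations(range(n), size))
assert (1 - lam2) / 2 <= phi <= np.sqrt(2 * (1 - lam2)) + 1e-9
```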
3 Construction of doubly stochastic matrices with small edge expansion and large spectral gap
As discussed, it might have been tempting to think that $\Gamma(n)$ (see Definition 3) should be independent of the matrix size $n$, since this holds for the symmetric case, and also for the upper bound on $\phi$ for the general nonsymmetric doubly stochastic case (Lemma 2). However, two known examples showed that a mild dependence on $n$ cannot be avoided.
3.1 Known Constructions
Klawe-Vazirani construction: The first is a construction of Klawe [Kla84] – these families of $d$-regular undirected graphs have edge expansion $(\log n)^{-\gamma}$ for various $0 < \gamma < 1$. However, there is a natural way in which to direct the edges of these graphs and obtain a $(d/2)$-in, $(d/2)$-out regular graph, and it was noted by U. Vazirani [Vaz17] in the 1980s that for this digraph $A$, which shares the same edge expansion as Klawe's, all eigenvalues (except the stochastic eigenvalue) have norm at most $1/2$. Specifically, one construction is as follows: let $n$ be an odd prime, and create the graph on $n$ vertices with in-degree and out-degree 2 by connecting every vertex $v \in \mathbb{Z}/n$ to the two vertices $1 + v$ and $2v$. Dividing the adjacency matrix of the graph by 2 gives a doubly stochastic matrix $A_{KV}$. It is simple to see (by transforming to the Fourier basis over $\mathbb{Z}/n$) that the characteristic polynomial of $A_{KV}$ is
\[
x(x-1)\frac{(2x)^{n-1} - 1}{2x - 1},
\]
so apart from the trivial eigenvalue 1, $A_{KV}$ has $n-2$ nontrivial eigenvalues $\lambda$ such that $|\lambda| = \frac{1}{2}$ and one eigenvalue 0, and thus $\operatorname{Re}\lambda_2(A_{KV}) \le \frac{1}{2}$. Further, upper bounding the edge expansion $\phi$ by the vertex expansion bound (Theorem 2.1 in [Kla84]), it follows that for some constant $c$,
\[
\Gamma(n) \;\le\; \frac{\phi(A_{KV})}{1 - \operatorname{Re}\lambda_2(A_{KV})} \;\le\; c \cdot \left( \frac{\log\log n}{\log n} \right)^{1/5}.
\]
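The spectrum of this construction can be verified directly. The sketch below (ours, for illustration; NumPy assumed) builds $A_{KV}$ for the odd prime $n = 13$ and checks that besides the trivial eigenvalue 1 and a single 0, every eigenvalue has norm exactly $1/2$:

```python
import numpy as np

n = 13                               # any odd prime
A = np.zeros((n, n))
for v in range(n):
    A[(1 + v) % n, v] += 0.5         # edge v -> 1 + v
    A[(2 * v) % n, v] += 0.5         # edge v -> 2v

# adjacency / 2 is doubly stochastic (both maps are bijections on Z/n)
assert np.allclose(A.sum(axis=0), 1) and np.allclose(A.sum(axis=1), 1)

eigs = sorted(np.linalg.eigvals(A), key=abs)
assert abs(eigs[0]) < 1e-8           # one eigenvalue 0
assert np.isclose(abs(eigs[-1]), 1)  # the trivial eigenvalue
assert np.allclose([abs(z) for z in eigs[1:-1]], 0.5)  # n - 2 of norm 1/2
```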
de Bruijn construction: A second example is the de Bruijn digraph [Bru46]. This is if anything even more striking: the doubly stochastic matrix (again representing a random walk along an Eulerian digraph) has edge expansion $\Theta(1/\log n)$ [DT98], yet all the nontrivial eigenvalues are 0. More specifically, define a special case of de Bruijn [Bru46] graphs as follows: let $n = 2^k$ for some integer $k$, and create the graph of degree 2 on $n$ vertices by directing edges from each vertex $v = (v_1, v_2, \ldots, v_k) \in \{0,1\}^k$ to the two vertices $(v_2, v_3, \ldots, v_k, 0)$ and $(v_2, v_3, \ldots, v_k, 1)$. Dividing the adjacency matrix of the graph by 2 gives a doubly stochastic matrix $A_{dB}$. Since this random walk completely forgets its starting point after $k$ steps, every nontrivial eigenvalue of $A_{dB}$ is 0 (and each of its Jordan blocks is of dimension at most $k$). Further, it was shown in [DT98] that $\phi(A_{dB}) \in \Theta(1/k)$, and thus
\[
\Gamma(n) \;\le\; \frac{\phi(A_{dB})}{1 - \operatorname{Re}\lambda_2(A_{dB})} = O\!\left(\frac{1}{k}\right) = O\!\left(\frac{1}{\log n}\right).
\]
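The "forgets its start after $k$ steps" property has a crisp matrix form: $A_{dB}^k$ equals the uniform matrix $\frac{1}{n}\mathbf{1}\mathbf{1}^T$ exactly, which forces every nontrivial eigenvalue to be 0. A short check (ours; NumPy assumed):

```python
import numpy as np

# de Bruijn walk: (v1,...,vk) -> (v2,...,vk,b); as k-bit integers, v -> (2v + b) mod n
k = 4
n = 2 ** k
A = np.zeros((n, n))
for v in range(n):
    A[(2 * v) % n, v] += 0.5
    A[(2 * v + 1) % n, v] += 0.5

assert np.allclose(A.sum(axis=0), 1) and np.allclose(A.sum(axis=1), 1)

# after k steps the state is exactly the k freshly appended bits, so
# A^k = (1/n) * all-ones, hence all nontrivial eigenvalues are 0
assert np.allclose(np.linalg.matrix_power(A, k), np.ones((n, n)) / n)

# the zero eigenvalue is defective (Jordan blocks up to size k), so numerically
# the nontrivial eigenvalues are only near 0; a loose tolerance is needed
eigs = sorted(np.linalg.eigvals(A), key=abs)
assert max(abs(z) for z in eigs[:-1]) < 1e-3 and np.isclose(abs(eigs[-1]), 1)
```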
Other literature – Feng-Li construction: We round out this discussion by recalling that Alon and Boppana [Alo86, Nil91] showed that for any infinite family of $d$-regular undirected graphs, the adjacency matrices, normalized to be doubly stochastic and with eigenvalues $1 = \lambda_1 \ge \lambda_2 \ge \cdots \ge -1$, have $\lambda_2 \ge \frac{2\sqrt{d-1}}{d} - o(1)$. Feng and Li [Li92] showed that undirectedness is essential to this bound: they provide a construction of cyclically-directed $r$-partite ($r \ge 2$) $d$-regular digraphs (with $n = kr$ vertices for $k > d$, $\gcd(k, d) = 1$), whose normalized adjacency matrices have (apart from $r$ "trivial" eigenvalues) only eigenvalues of norm $1/d$. The construction is of an affine-linear nature quite similar to the preceding two, and to our knowledge does not give an upper bound on $\Gamma$ any stronger than those above.
3.2 A new construction
To achieve the upper bound in Theorem 4, we need a construction that is exponentially better than the known examples. We give an explicit construction of $n \times n$ doubly stochastic matrices $A_n$ (for every $n$) that are highly nonexpanding, since they contain sets with edge expansion less than $1/\sqrt{n}$, even though every nontrivial eigenvalue is 0. The construction might seem nonintuitive, but in Appendix F we give some explanation of how to arrive at it.
Theorem 16. Let $m = \sqrt{n}$, and let
\[
a_n = \frac{m^2 + m - 1}{m(m+2)}, \quad
b_n = \frac{m + 1}{m(m+2)}, \quad
c_n = \frac{1}{m(m+1)}, \quad
d_n = \frac{m^3 + 2m^2 + m + 1}{m(m+1)(m+2)}, \quad
e_n = \frac{1}{m(m+1)(m+2)}, \quad
f_n = \frac{2m + 3}{m(m+1)(m+2)},
\]
and define the $n \times n$ matrix
\[
A_n =
\begin{pmatrix}
a_n & b_n & 0 & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & c_n & d_n & e_n & e_n & e_n & e_n & \cdots & e_n \\
0 & c_n & e_n & d_n & e_n & e_n & e_n & \cdots & e_n \\
0 & c_n & e_n & e_n & d_n & e_n & e_n & \cdots & e_n \\
0 & c_n & e_n & e_n & e_n & d_n & e_n & \cdots & e_n \\
0 & c_n & e_n & e_n & e_n & e_n & d_n & \cdots & e_n \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & c_n & e_n & e_n & e_n & e_n & e_n & \cdots & d_n \\
b_n & f_n & c_n & c_n & c_n & c_n & c_n & \cdots & c_n
\end{pmatrix}.
\]
Then the following hold for $A_n$:
1. $A_n$ is doubly stochastic.
2. Every nontrivial eigenvalue of $A_n$ is 0.
3. The edge expansion is bounded as
\[
\frac{1}{6\sqrt{n}} \;\le\; \phi(A_n) \;\le\; \frac{1}{\sqrt{n}}.
\]
4. As a consequence of 1, 2, and 3,
\[
\phi(A_n) \;\le\; \frac{1 - \operatorname{Re}\lambda_2(A_n)}{\sqrt{n}}
\quad \text{and thus} \quad
\Gamma(n) \;\le\; \frac{1}{\sqrt{n}}.
\]
Proof. The proof is given in Appendix B.
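The properties of Theorem 16 can be verified numerically for small $n$; the sketch below (ours, not part of the proof; NumPy assumed) builds $A_{25}$ and checks all three claims. Since the zero eigenvalue is highly defective, raw eigenvalue computation is unreliable; instead we use the robust equivalent condition $\operatorname{tr}(A^k) = \sum_i \lambda_i^k = 1$ for $k = 1, \ldots, n$, which holds iff all nontrivial eigenvalues vanish:

```python
import numpy as np

def construct_An(m):
    """Build the n x n matrix A_n of Theorem 16, with n = m^2."""
    n = m * m
    a = (m**2 + m - 1) / (m * (m + 2))
    b = (m + 1) / (m * (m + 2))
    c = 1 / (m * (m + 1))
    d = (m**3 + 2 * m**2 + m + 1) / (m * (m + 1) * (m + 2))
    e = 1 / (m * (m + 1) * (m + 2))
    f = (2 * m + 3) / (m * (m + 1) * (m + 2))
    A = np.zeros((n, n))
    A[0, 0], A[0, 1] = a, b
    for i in range(1, n - 1):   # middle rows: 0, c_n, then e_n's with d_n shifted
        A[i, 1] = c
        A[i, 2:] = e
        A[i, i + 1] = d
    A[-1, 0], A[-1, 1] = b, f
    A[-1, 2:] = c
    return A

m = 5
A = construct_An(m)             # n = 25
n = A.shape[0]

# 1. doubly stochastic
assert np.allclose(A.sum(axis=0), 1) and np.allclose(A.sum(axis=1), 1)

# 2. trace(A^k) = 1 for k = 1..n certifies that all nontrivial eigenvalues are 0
P = np.eye(n)
for _ in range(n):
    P = P @ A
    assert np.isclose(np.trace(P), 1)

# 3. the singleton S = {0} already certifies phi(A_n) <= 1/sqrt(n):
# the only mass leaving row 0 is b_n = (m+1)/(m(m+2)) <= 1/m
assert A[0, 1:].sum() <= 1 / np.sqrt(n)
```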
This completes the proof of the upper bound in Theorem 4. We remark that for the matrices $A_n$ constructed in Theorem 16, it holds that
\[
\phi(A_n) \le \frac{1 - |\lambda_i(A_n)|}{\sqrt{n}}
\]
for any $i \ne 1$, giving a stronger guarantee than that required for Theorem 4.
We also remark that it would be unlikely to arrive at such a construction by algorithmic simulation, since the eigenvalues of the matrices $A_n$ are extremely sensitive. Although $\lambda_2(A_n) = 0$, if we shift only $O(1/\sqrt{n})$ of the mass in the matrix $A_n$ to create a matrix $A'_n$, by replacing $a_n$ with $a'_n = a_n + b_n$, $b_n$ with $b'_n = 0$, and $f_n$ with $f'_n = f_n + b_n$, keeping $c_n$, $d_n$, $e_n$ the same, then $\lambda_2(A'_n) = 1$. Thus, since perturbations of $O(1/\sqrt{n})$ (which is tiny for large $n$) cause the second eigenvalue to jump from 0 to 1 (and the spectral gap from 1 to 0), it would not be possible to make tiny changes to random matrices to arrive at a construction satisfying the required properties in Theorem 16.
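This sensitivity is easy to reproduce: after the shift, row and column 0 of $A'_n$ both become $\mathbf{1}_{\{0\}}$, so vertex 0 disconnects and the eigenvalue 1 acquires multiplicity 2. A self-contained sketch (ours; NumPy assumed), rebuilding the matrix of Theorem 16 and applying the $O(1/\sqrt{n})$ shift:

```python
import numpy as np

def construct_An(m):
    # the matrix of Theorem 16, rebuilt here so the snippet is self-contained
    n = m * m
    a = (m**2 + m - 1) / (m * (m + 2))
    b = (m + 1) / (m * (m + 2))
    c = 1 / (m * (m + 1))
    d = (m**3 + 2 * m**2 + m + 1) / (m * (m + 1) * (m + 2))
    e = 1 / (m * (m + 1) * (m + 2))
    f = (2 * m + 3) / (m * (m + 1) * (m + 2))
    A = np.zeros((n, n))
    A[0, 0], A[0, 1] = a, b
    for i in range(1, n - 1):
        A[i, 1] = c
        A[i, 2:] = e
        A[i, i + 1] = d
    A[-1, 0], A[-1, 1] = b, f
    A[-1, 2:] = c
    return A

m = 5
n = m * m
b = (m + 1) / (m * (m + 2))
Ap = construct_An(m)
Ap[0, 0] += b        # a_n -> a_n + b_n (now exactly 1)
Ap[0, 1] = 0.0       # b_n -> 0 in the first row ...
Ap[-1, 1] += b       # f_n -> f_n + b_n
Ap[-1, 0] = 0.0      # ... and in the last row; total mass moved is 2 b_n = O(1/sqrt(n))

assert np.allclose(Ap.sum(axis=0), 1) and np.allclose(Ap.sum(axis=1), 1)

# vertex 0 is disconnected, so eigenvalue 1 has multiplicity 2: lambda_2 = 1
top = sorted(np.linalg.eigvals(Ap).real)
assert np.isclose(top[-1], 1) and np.isclose(top[-2], 1)
```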
4 Lower bound on the edge expansion $\phi$ in terms of the spectral gap
In this section, we prove the lower bound on $\phi$ in Theorem 5; the lower bound on $\phi$ in Theorem 4 will follow as a special case. The proof is a result of a sequence of lemmas that we state next. The first lemma states that $\phi$ is sub-multiplicative in the following sense.
Lemma 17. Let $R \in \mathbb{R}^{n \times n}$ be a nonnegative matrix with left and right eigenvectors $u$ and $v$ for the PF eigenvalue 1. Then
\[
\phi(R^k) \;\le\; k \cdot \phi(R).
\]
Proof. The proof is given in Appendix D.1.
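For doubly stochastic matrices (where $u$ and $v$ are both the all-ones vector, suitably normalized, and $\phi$ reduces to the combinatorial definition $\min_{|S| \le n/2} \frac{1}{|S|}\sum_{i \in S, j \notin S} A_{i,j}$), the sub-multiplicativity can be spot-checked by brute force on small instances. A sketch of ours, with NumPy assumed:

```python
import itertools
import numpy as np

def phi_ds(A):
    """Brute-force edge expansion of a doubly stochastic matrix:
    min over S with |S| <= n/2 of (1/|S|) * sum_{i in S, j not in S} A[i, j]."""
    n = A.shape[0]
    best = np.inf
    for size in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), size):
            out = [j for j in range(n) if j not in S]
            best = min(best, A[np.ix_(list(S), out)].sum() / size)
    return best

# random doubly stochastic matrix as an average of permutation matrices
rng = np.random.default_rng(1)
A = sum(np.eye(6)[rng.permutation(6)] for _ in range(4)) / 4

# powers of A are again doubly stochastic, and phi(A^k) <= k * phi(A)
for k in (2, 3, 4):
    assert phi_ds(np.linalg.matrix_power(A, k)) <= k * phi_ds(A) + 1e-9
```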
For the case of symmetric doubly stochastic matrices $R$, Lemma 17 follows from a theorem of Blakley and Roy [BR65]. (It does not fall into the framework of an extension of that result to the nonsymmetric case [Pat12].) Lemma 17 helps to lower bound $\phi(R)$ by taking powers of $R$, which is useful since we can take sufficient powers in order to make the matrix simple enough that its edge expansion is easily calculated. The next two lemmas follow by technical calculations.
Lemma 18. Let $T \in \mathbb{C}^{n \times n}$ be an upper triangular matrix with $\|T\|_2 = \sigma$ and $|T_{i,i}| \le \beta$ for every $i$. Then
\[
\|T^k\|_2 \;\le\; n \cdot \sigma^n \cdot \binom{k+n}{n} \cdot \beta^{k-n}.
\]
Proof. The proof is given in Appendix D.2.
Using Lemma 18, we can show the following lemma for the special case of upper triangular matrices with operator norm at most 1.
Lemma 19. Let $T \in \mathbb{C}^{n \times n}$ be an upper triangular matrix with $\|T\|_2 \le 1$ and $|T_{i,i}| \le \alpha < 1$ for every $i$. Then $\|T^k\|_2 \le \varepsilon$ for
\[
k \;\ge\; \frac{4n + 2\ln\left(\frac{n}{\varepsilon}\right)}{1 - \alpha}.
\]
Proof. The proof is given in Appendix D.3.
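A quick numerical spot check of Lemma 19 (our sketch, with arbitrary parameter values; NumPy assumed): generate a random upper triangular contraction with small diagonal, and verify the claimed power decay.

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha, eps = 8, 0.5, 1e-6

# random upper triangular matrix with |T_ii| <= alpha, then scale so ||T||_2 <= 1
# (scaling down can only shrink the diagonal, so |T_ii| <= alpha still holds)
T = np.triu(rng.standard_normal((n, n)))
np.fill_diagonal(T, alpha * rng.uniform(-1, 1, n))
T /= max(1.0, np.linalg.norm(T, 2))

# k from Lemma 19: k >= (4n + 2 ln(n / eps)) / (1 - alpha)
k = int(np.ceil((4 * n + 2 * np.log(n / eps)) / (1 - alpha)))
assert np.linalg.norm(np.linalg.matrix_power(T, k), 2) <= eps
```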
Given Lemmas 17 and 19, we can lower bound $\phi(R)$ in terms of $1 - |\lambda_m(R)|$ (where $\lambda_m$ is the nontrivial eigenvalue that is maximum in magnitude). Our aim is to lower bound $\phi(R)$ by $\phi(R^k)$, but since the norm of $R^k$ increases by powering, we cannot use the lemmas directly, since we do not want a dependence on $\sigma(R)$ in the final bound. To handle this, we transform $R$ to $A$, such that $\phi(R) = \phi(A)$ and the eigenvalues of $R$ and $A$ are the same, but $\sigma(A) = \|A\|_2 = 1$ irrespective of the norm of $R$.
Lemma 20. Let $R$ be a nonnegative matrix with positive (left and right) eigenvectors $u$ and $v$ for the PF eigenvalue 1, normalized so that $\langle u, v \rangle = 1$. Define $A = D_u^{1/2} D_v^{-1/2} \, R \, D_u^{-1/2} D_v^{1/2}$. Then the following hold for $A$:
1. $\phi(A) = \phi(R)$.
2. For every $i$, $\lambda_i(A) = \lambda_i(R)$.
3. $\|A\|_2 = 1$.
Proof. The proof is given in Appendix D.4.
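The transform is easy to exercise numerically. The sketch below (ours; NumPy assumed) generates a strictly positive matrix so the Perron-Frobenius theorem applies, rescales it to have PF eigenvalue 1, and confirms that $A$ has the same spectrum as $R$ with $\|A\|_2 = 1$:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.random((5, 5)) + 0.1                  # strictly positive: PF applies

# PF eigenvalue and positive right/left eigenvectors
wr, right = np.linalg.eig(M)
r = wr.real.max()                             # PF eigenvalue of M
v = np.abs(right[:, np.argmax(wr.real)].real)
wl, left = np.linalg.eig(M.T)
u = np.abs(left[:, np.argmax(wl.real)].real)
u /= u @ v                                    # normalize so <u, v> = 1

R = M / r                                     # now PF eigenvalue is 1: Rv = v, u^T R = u^T
A = np.diag(np.sqrt(u / v)) @ R @ np.diag(np.sqrt(v / u))

assert np.isclose(np.linalg.norm(A, 2), 1)    # ||A||_2 = 1 (Lemma 20, part 3)
assert np.allclose(np.sort(np.linalg.eigvals(A)),   # same spectrum (part 2)
                   np.sort(np.linalg.eigvals(R)))
```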
Given Lemma 20, we lower bound $\phi(A)$ using $\phi(A^k)$ in terms of $1 - |\lambda_m(A)|$, to obtain the corresponding bounds for $R$.
Lemma 21. Let $R$ be a nonnegative matrix with positive (left and right) eigenvectors $u$ and $v$ for the PF eigenvalue 1, normalized so that $\langle u, v \rangle = 1$. Let $\lambda_m$ be the nontrivial eigenvalue of $R$ that is maximum in magnitude, and let $\kappa = \min_i u_i \cdot v_i$. Then
\[
\frac{1}{20} \cdot \frac{1 - |\lambda_m|}{n + \ln\left(\frac{1}{\kappa}\right)} \;\le\; \phi(R).
\]
Proof. The proof is given in Appendix D.5.
Given Lemma 21, we use the trick of lazy random walks to get a bound on $1 - \operatorname{Re}\lambda_2(R)$ from a bound on $1 - |\lambda_m(R)|$.
Lemma 22. Let $R$ be a nonnegative matrix with positive (left and right) eigenvectors $u$ and $v$ for the PF eigenvalue 1, normalized so that $\langle u, v \rangle = 1$. Let $\kappa = \min_i u_i \cdot v_i$. Then
\[
\frac{1}{30} \cdot \frac{1 - \operatorname{Re}\lambda_2(R)}{n + \ln\left(\frac{1}{\kappa}\right)} \;\le\; \phi(R).
\]
For any doubly stochastic matrix $A$,
\[
\frac{1 - \operatorname{Re}\lambda_2(A)}{35 \cdot n} \;\le\; \phi(A),
\]
and thus
\[
\frac{1}{35 \cdot n} \;\le\; \Gamma(n).
\]
Proof. The proof is given in Appendix D.6.
This completes the proof of the lower bound on $\phi$ in Theorem 5; the upper bound on $\phi$ in Theorem 5 is shown in Appendix C. Combined with Theorem 16, this also completes the proof of Theorem 4.
5 Mixing time
We now study the mixing time of nonnegative matrices, and relate it to all the quantities we have studied so far. To motivate the definition of mixing time for general nonnegative matrices, we first consider the mixing time of doubly stochastic matrices. The mixing time of a doubly stochastic matrix $A$ (i.e., of the underlying Markov chain) is the worst-case number of steps required for a random walk starting at any vertex to reach a distribution approximately uniform over the vertices. To avoid complications of periodic chains, we assume that $A$ is $\frac{1}{2}$-lazy, meaning that $A_{i,i} \ge \frac{1}{2}$ for every $i$. Given any doubly stochastic matrix $A$, it can easily be converted to the lazy random walk $\frac{1}{2}I + \frac{1}{2}A$. This is still doubly stochastic, and in the conversion both $\phi(A)$ and the spectral gap are halved. The mixing time will be finite provided only that the chain is connected. Consider the indicator vector $\mathbf{1}_{\{i\}}$ for any vertex $i$. We want to find the smallest $\tau$ such that $A^\tau \mathbf{1}_{\{i\}} \approx \frac{1}{n}\mathbf{1}$, or $A^\tau \mathbf{1}_{\{i\}} - \frac{1}{n}\mathbf{1} \approx 0$, which can further be written as $\left(A^\tau - \frac{1}{n}\mathbf{1}\cdot\mathbf{1}^T\right)\mathbf{1}_{\{i\}} \approx 0$. Concretely, for any $\varepsilon$, we want to find $\tau = \tau_\varepsilon(A)$ such that for any $i$,
\[
\left\| \left( A^\tau - \tfrac{1}{n}\mathbf{1}\cdot\mathbf{1}^T \right) \mathbf{1}_{\{i\}} \right\|_1 \le \varepsilon.
\]
Given such a value of $\tau$, for any vector $x$ such that $\|x\|_1 = 1$, we get
\[
\left\| \left( A^\tau - \tfrac{1}{n}\mathbf{1}\cdot\mathbf{1}^T \right) x \right\|_1
= \left\| \sum_i \left( A^\tau - \tfrac{1}{n}\mathbf{1}\cdot\mathbf{1}^T \right) x_i \mathbf{1}_{\{i\}} \right\|_1
\le \sum_i |x_i| \left\| \left( A^\tau - \tfrac{1}{n}\mathbf{1}\cdot\mathbf{1}^T \right) \mathbf{1}_{\{i\}} \right\|_1
\le \sum_i |x_i| \, \varepsilon = \varepsilon.
\]
Thus, the mixing time $\tau_\varepsilon(A)$ is the smallest number $\tau$ for which $\|(A^\tau - J) \cdot x\|_1 \le \varepsilon$ for any $x$ such that $\|x\|_1 = 1$.
We want to extend this definition to any nonnegative matrix $R$ with PF eigenvalue 1 and corresponding positive left and right eigenvectors $u$ and $v$. Note that if $R$ is reducible (i.e., $\phi(R) = 0$), then the mixing time is infinite. Further, if $R$ is periodic, then the mixing time is again ill-defined. Thus, we again assume that $R$ is irreducible and $\frac{1}{2}$-lazy, i.e. $R_{i,i} \ge \frac{1}{2}$ for every $i$. Let $x$ be any nonnegative vector for the sake of exposition, although our final definition will not require nonnegativity and will hold for any $x$. We want to find $\tau$ such that $R^\tau x$ is about the same as the component of $x$ along the direction of $v$. Further, since we are right-multiplying and want convergence to the right eigenvector $v$, we will define the $\ell_1$-norm using the left eigenvector $u$. Thus, for the starting vector $x$, instead of requiring $\|x\|_1 = 1$ as in the doubly stochastic case, we will require $\|D_u x\|_1 = 1$. Since $x$ is nonnegative, $\|D_u x\|_1 = \langle u, x \rangle = 1$. Thus, we want to find $\tau$ such that $R^\tau x \approx v$, or $\left( R^\tau - v \cdot u^T \right) x \approx 0$. Since we measured the norm of the starting vector $x$ with respect to $u$, we will also measure the norm of the final vector $\left( R^\tau - v \cdot u^T \right) x$ with respect to $u$. Thus we arrive at the following definition.
Definition 23. (Mixing time of general nonnegative matrices $R$) Let $R$ be a $\frac{1}{2}$-lazy, irreducible nonnegative matrix with PF eigenvalue 1, with $u$ and $v$ as the corresponding positive left and right eigenvectors, where $u$ and $v$ are normalized so that $\langle u, v \rangle = \|D_u v\|_1 = 1$. Then the mixing time $\tau_\varepsilon(R)$ is the smallest number $\tau$ such that
\[
\left\| D_u \left( R^\tau - v \cdot u^T \right) x \right\|_1 \le \varepsilon
\]
for every vector $x$ with $\|D_u x\|_1 = 1$.
We remark that, similar to the doubly stochastic case, using the triangle inequality it is sufficient to find the mixing time of the standard basis vectors $\mathbf{1}_{\{i\}}$. Let $y_i = \mathbf{1}_{\{i\}} / \|D_u \mathbf{1}_{\{i\}}\|_1$; then $y_i$ is nonnegative with $\|D_u y_i\|_1 = \langle u, y_i \rangle = 1$, and for any $x$ such that $\|D_u x\|_1 = 1$, we can write
\[
x = \sum_i c_i \mathbf{1}_{\{i\}} = \sum_i c_i \left\| D_u \mathbf{1}_{\{i\}} \right\|_1 y_i
\]
with
\[
\|D_u x\|_1 = \left\| D_u \sum_i c_i \mathbf{1}_{\{i\}} \right\|_1 = \sum_i |c_i| \left\| D_u \mathbf{1}_{\{i\}} \right\|_1 = 1.
\]
Thus, if $\left\| D_u \left( R^\tau - v \cdot u^T \right) y_i \right\|_1 \le \varepsilon$ for every $i$, then
\[
\left\| D_u \left( R^\tau - v \cdot u^T \right) x \right\|_1
= \left\| D_u \left( R^\tau - v \cdot u^T \right) \sum_i c_i \left\| D_u \mathbf{1}_{\{i\}} \right\|_1 y_i \right\|_1
\le \sum_i |c_i| \left\| D_u \mathbf{1}_{\{i\}} \right\|_1 \left\| D_u \left( R^\tau - v \cdot u^T \right) y_i \right\|_1
\le \varepsilon.
\]
Thus, it is sufficient to find the mixing time for every nonnegative $x$ with $\|D_u x\|_1 = \langle u, x \rangle = 1$, and it will hold for all $x$.
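Definition 23 translates directly into a computation. The sketch below (ours; NumPy assumed) finds $\tau_\varepsilon(R)$ by checking exactly the scaled basis vectors $y_i$ from the remark above, and sanity-checks it on a $\frac{1}{2}$-lazy doubly stochastic walk, where $u = v = \frac{1}{\sqrt{n}}\mathbf{1}$ satisfies the normalization $\langle u, v \rangle = \|D_u v\|_1 = 1$:

```python
import numpy as np

def mixing_time(R, u, v, eps=1e-3, tmax=10**4):
    """Smallest tau with ||D_u (R^tau - v u^T) x||_1 <= eps for all x with
    ||D_u x||_1 = 1; it suffices to check y_i = 1_{i} / ||D_u 1_{i}||_1."""
    n = R.shape[0]
    P = np.eye(n)
    for tau in range(1, tmax + 1):
        P = P @ R
        E = np.diag(u) @ (P - np.outer(v, u))
        # column i of E, divided by u_i = ||D_u 1_{i}||_1, is D_u (R^tau - v u^T) y_i
        if max(np.abs(E[:, i]).sum() / u[i] for i in range(n)) <= eps:
            return tau
    return None

n = 4
R = 0.5 * np.eye(n) + 0.5 * np.full((n, n), 1 / n)   # 1/2-lazy doubly stochastic
u = v = np.ones(n) / np.sqrt(n)                       # <u, v> = ||D_u v||_1 = 1
tau = mixing_time(R, u, v)
assert tau is not None and tau <= 20                  # mixes in a handful of steps
```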
For the case of reversible nonnegative matrices $M$ with PF eigenvalue 1, the mixing time is well understood, and it is easily shown that
\[
\tau_\varepsilon(M) \;\le\; \frac{\ln\left(\frac{n}{\kappa \cdot \varepsilon}\right)}{1 - \lambda_2(M)}
\;\overset{\text{(Theorem 1)}}{\le}\; \frac{2 \cdot \ln\left(\frac{n}{\kappa \cdot \varepsilon}\right)}{\phi^2(M)}. \tag{3}
\]
We will give corresponding bounds for the mixing time of general nonnegative matrices.
5.1 Mixing time and singular values
We first show a simple lemma relating the mixing time of nonnegative matrices to the second singular value. This lemma is powerful enough to recover the bounds obtained by Fill [Fil91] and Mihail [Mih89] in an elementary way. Since the largest singular value of a general nonnegative matrix $R$ with PF eigenvalue 1 could be much larger than 1, the relation between mixing time and second singular value makes sense only for nonnegative matrices with the same left and right eigenvector for eigenvalue 1, which have largest singular value 1 by Lemma 20.
Lemma 24. (Mixing time and second singular value) Let $A$ be a nonnegative matrix (not necessarily lazy) with PF eigenvalue 1, such that $Aw = w$ and $A^T w = w$ for some $w$ with $\langle w, w \rangle = 1$, and let $\kappa = \min_i w_i^2$. Then for every $c > 0$,
\[
\tau_\varepsilon(A) \;\le\; \frac{c \cdot \ln\left(\frac{\sqrt{n}}{\sqrt{\kappa} \cdot \varepsilon}\right)}{1 - \sigma_2^c(A)}
\;\le\; \frac{c \cdot \ln\left(\frac{n}{\kappa \cdot \varepsilon}\right)}{1 - \sigma_2^c(A)}.
\]
Proof. The proof is given in Appendix E.1.
For the case of $c = 2$, Lemma 24 was obtained by Fill [Fil91], but we find our proof simpler.
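The decay driving Lemma 24 can be seen directly: when $Aw = w$ and $A^T w = w$ with $\langle w, w \rangle = 1$, one has $A^\tau - ww^T = (A - ww^T)^\tau$, so $\|A^\tau - ww^T\|_2 \le \sigma_2(A)^\tau$. A numerical illustration of ours (NumPy assumed), on a lazy symmetric doubly stochastic walk on the 6-cycle:

```python
import numpy as np

n = 6
C = np.roll(np.eye(n), 1, axis=1)           # cyclic shift permutation matrix
A = 0.5 * np.eye(n) + 0.25 * (C + C.T)      # lazy symmetric doubly stochastic
w = np.ones(n) / np.sqrt(n)                 # Aw = w = A^T w, <w, w> = 1

sigma2 = np.sort(np.linalg.svd(A, compute_uv=False))[-2]

# since A w w^T = w w^T = w w^T A and (w w^T)^2 = w w^T, induction gives
# (A - w w^T)^tau = A^tau - w w^T, hence the sigma_2^tau decay
for tau in (1, 5, 10):
    gap = np.linalg.norm(np.linalg.matrix_power(A, tau) - np.outer(w, w), 2)
    assert gap <= sigma2 ** tau + 1e-9
```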
5.2 Mixing time and edge expansion
We now relate the mixing time of a general nonnegative matrix $R$ to its edge expansion $\phi(R)$. The upper bound for row-stochastic matrices $R$ in terms of $\phi(R)$ was obtained by Mihail [Mih89] and simplified by Fill [Fil91] using Lemma 24 for $c = 2$. Thus, the following lemma is not new, but we prove it in Appendix E.2 for completeness, since our proof is simpler and holds for any nonnegative matrix $R$.