arXiv:1909.12497v1 [math.CO] 27 Sep 2019
Edge Expansion and Spectral Gap of Nonnegative Matrices
Jenish C. Mehta
California Institute of Technology
jenishc@gmail.com
Leonard J. Schulman
California Institute of Technology
schulman@caltech.edu
Abstract
The classic graphical Cheeger inequalities state that if $M$ is an $n \times n$ symmetric doubly stochastic matrix, then
$$\frac{1-\lambda_2(M)}{2} \;\le\; \phi(M) \;\le\; \sqrt{2\cdot(1-\lambda_2(M))}$$
where $\phi(M) = \min_{S\subseteq[n],\,|S|\le n/2} \frac{1}{|S|}\sum_{i\in S,\, j\notin S} M_{i,j}$ is the edge expansion of $M$, and $\lambda_2(M)$ is the second largest eigenvalue of $M$. We study the relationship between $\phi(A)$ and the spectral gap $1-\operatorname{Re}\lambda_2(A)$ for any doubly stochastic matrix $A$ (not necessarily symmetric), where $\lambda_2(A)$ is a nontrivial eigenvalue of $A$ with maximum real part. Fiedler showed that the upper bound on $\phi(A)$ is unaffected, i.e., $\phi(A) \le \sqrt{2\cdot(1-\operatorname{Re}\lambda_2(A))}$. With regards to the lower bound on $\phi(A)$, there are known constructions with
$$\phi(A) \in \Theta\!\left(\frac{1-\operatorname{Re}\lambda_2(A)}{\log n}\right),$$
indicating that at least a mild dependence on $n$ is necessary to lower bound $\phi(A)$.
In our first result, we provide an exponentially better construction of $n\times n$ doubly stochastic matrices $A_n$, for which
$$\phi(A_n) \le \frac{1-\operatorname{Re}\lambda_2(A_n)}{\sqrt{n}}.$$
In fact, all nontrivial eigenvalues of our matrices are 0, even though the matrices are highly non-expanding. We further show that this bound is in the correct range (up to the exponent of $n$), by showing that for any doubly stochastic matrix $A$,
$$\phi(A) \ge \frac{1-\operatorname{Re}\lambda_2(A)}{35\cdot n}.$$
As a consequence, unlike the symmetric case, there is a (necessary) loss of a factor of $n^{\alpha}$ for $\frac{1}{2}\le\alpha\le 1$ in lower bounding $\phi$ by the spectral gap in the nonsymmetric setting.
Our second result extends these bounds to general matrices $R$ with nonnegative entries, to obtain a two-sided gapped refinement of the Perron-Frobenius theorem. Recall from the Perron-Frobenius theorem that for such $R$, there is a nonnegative eigenvalue $r$ such that all eigenvalues of $R$ lie within the closed disk of radius $r$ about 0. Further, if $R$ is irreducible, which means $\phi(R) > 0$ (for suitably defined $\phi$), then $r$ is positive and all other eigenvalues lie within the open disk, so (with eigenvalues sorted by real part), $\operatorname{Re}\lambda_2(R) < r$. An extension of Fiedler's result provides an upper bound and our result provides the corresponding lower bound on $\phi(R)$ in terms of $r-\operatorname{Re}\lambda_2(R)$, obtaining a two-sided quantitative version of the Perron-Frobenius theorem.
1 Introduction
1.1 Motivation and main result
We study the relationship between edge expansion and second eigenvalue of nonnegative matrices. We restrict to doubly stochastic matrices for exposition in the introduction, since the definitions are simpler and it captures most of the key ideas. The extension to general nonnegative matrices is treated in Section 1.2. Let $A$ be an $n\times n$ doubly stochastic matrix, equivalently interpreted as a bi-regular weighted digraph. The edge expansion of $A$, denoted as $\phi(A)$, is defined as
$$\phi(A) = \min_{S\subseteq[n],\,|S|\le n/2} \frac{\sum_{i\in S,\, j\notin S} A_{i,j}}{|S|}.$$
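To make the definition concrete, the following brute-force Python sketch (assuming NumPy; the helper edge_expansion is introduced only for illustration) evaluates $\phi(A)$ directly by enumerating all sets $S$ with $|S|\le n/2$, and is therefore practical only for very small $n$:

```python
import itertools
import numpy as np

def edge_expansion(A):
    """Brute-force phi(A) = min over S with |S| <= n/2 of (weight leaving S) / |S|."""
    n = A.shape[0]
    best = np.inf
    for size in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), size):
            Sc = [j for j in range(n) if j not in S]
            cut = A[np.ix_(S, Sc)].sum()      # total weight of edges from S to its complement
            best = min(best, cut / size)
    return best

# Example: the lazy walk on the directed 4-cycle, a doubly stochastic matrix.
P = np.roll(np.eye(4), 1, axis=1)             # cyclic shift (permutation matrix)
A = 0.5 * np.eye(4) + 0.5 * P
print(edge_expansion(A))                      # 0.25, attained e.g. by S = {0, 1}
```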
The fact that $A$ is doubly stochastic implies that $\phi(A) = \phi(A^{T})$. $\phi$ is a measure of how much the graph would have to be modified for it to lose strong-connectedness; it also lower bounds how frequently, in steady state, the associated Markov chain switches between the blocks of any bi-partition; thus, it fundamentally expresses the extent to which the graph is connected.
The connection between edge expansion and the second eigenvalue has been of central importance in the case of symmetric doubly stochastic matrices $M$ (equivalently, reversible Markov chains with uniform stationary distribution). For such $M$, let the eigenvalues be $1 = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge -1$. The Cheeger inequalities give two-sided bounds between the edge expansion $\phi(M)$ and the spectral gap $1-\lambda_2(M)$.
Theorem 1. [Dod84, AM85, Alo86] (Cheeger's inequalities for symmetric doubly stochastic matrices) Let $M$ be a symmetric doubly stochastic matrix, then
$$\frac{1-\lambda_2(M)}{2} \;\le\; \phi(M) \;\le\; \sqrt{2\cdot(1-\lambda_2(M))}.$$
This is closely related to earlier versions for Riemannian manifolds [Che70, Bus82]. Notably, the inequalities in Theorem 1 do not depend on $n$. Further, they are tight up to constants – the upper bound on $\phi$ is achieved by the cycle and the lower bound by the hypercube.
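As a small numerical illustration of these bounds, the following sketch (assuming NumPy, and reusing edge_expansion from the sketch above) builds the lazy symmetric random walk on the $n$-cycle and checks that $\phi$ lies between the two sides of Theorem 1:

```python
import numpy as np
# edge_expansion() as defined in the earlier sketch

n = 12
M = np.zeros((n, n))
for i in range(n):                          # lazy symmetric walk on the n-cycle
    M[i, i] = 0.5
    M[i, (i + 1) % n] = 0.25
    M[i, (i - 1) % n] = 0.25

lam2 = np.sort(np.linalg.eigvalsh(M))[-2]   # second largest eigenvalue
gap = 1.0 - lam2
phi = edge_expansion(M)
print(gap / 2 <= phi <= np.sqrt(2 * gap))   # True: both Cheeger inequalities hold
print(gap / 2, phi, np.sqrt(2 * gap))
```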
The key question we address in this work is whether or to what extent the Cheeger inequalities survive for nonsymmetric doubly stochastic matrices (the question was already asked, e.g., in [MT06]). Let $A$ be a doubly stochastic matrix, not necessarily symmetric. The eigenvalues of $A$ lie in the unit disk around the origin in the complex plane, with an eigenvalue 1 called the trivial or stochastic eigenvalue corresponding to the eigenvector $\mathbf{1}$ (the all 1's vector), and all other eigenvalues considered nontrivial.
Let the eigenvalues of $A$ be ordered so that $1 = \lambda_1 \ge \operatorname{Re}\lambda_2 \ge \dots \ge \operatorname{Re}\lambda_n \ge -1$. The spectral gap will be defined as $1-\operatorname{Re}\lambda_2(A)$. There are three motivations for this definition: first, the continuous-time Markov chain based on $A$ is $\exp(t\cdot(A-I))$, and this spectral gap specifies the largest norm of any of the latter's nontrivial eigenvalues; second, $\operatorname{Re}\lambda_2(A) = 1$ if and only if $\phi = 0$ (i.e., if the matrix is reducible); finally, this gap lower bounds the distance (in the complex plane) between the trivial and any nontrivial eigenvalue.
It was noted by Fiedler [Fie95] that the upper bound on $\phi$ in Cheeger's inequality (Theorem 1) carries over easily to general doubly stochastic matrices, because for $M = (A+A^{T})/2$, $\phi(A) = \phi(M)$ and $\operatorname{Re}\lambda_2(A) \le \lambda_2(M)$. (In fact, Fiedler made a slightly different conclusion with these observations, but they immediately give the upper bound on $\phi$; see Appendix C for an extension and proof.)
Lemma 2. (Fiedler [Fie95]) Let $A$ be any $n\times n$ doubly stochastic matrix, then
$$\phi(A) \le \sqrt{2\cdot(1-\operatorname{Re}\lambda_2(A))}.$$
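The following sketch checks Lemma 2, along with the two observations behind it, on a nonsymmetric doubly stochastic matrix obtained as a random convex combination of permutation matrices (an illustrative example of our own choosing; it assumes NumPy and edge_expansion from the earlier sketch):

```python
import numpy as np
rng = np.random.default_rng(0)

n = 8
perms = [np.eye(n)[rng.permutation(n)] for _ in range(4)]
A = sum(w * P for w, P in zip(rng.dirichlet(np.ones(4)), perms))   # doubly stochastic

re_lam2 = np.sort(np.linalg.eigvals(A).real)[-2]     # second largest real part
M = 0.5 * (A + A.T)                                  # Fiedler's symmetrization
lam2_M = np.sort(np.linalg.eigvalsh(M))[-2]
phi_A = edge_expansion(A)

print(phi_A <= np.sqrt(2 * (1 - re_lam2)) + 1e-9)    # Lemma 2
print(re_lam2 <= lam2_M + 1e-9)                      # Re lambda_2(A) <= lambda_2(M)
print(np.isclose(phi_A, edge_expansion(M)))          # phi(A) = phi(M)
```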
It remained to investigate the other direction, i.e., the lower bound on $\phi(A)$ in terms of the spectral gap $1-\operatorname{Re}\lambda_2(A)$. Towards this end, we define the function

Definition 3.
$$\Gamma(n) = \min\left\{ \frac{\phi(A)}{1-\operatorname{Re}\lambda_2(A)} \;:\; A \text{ is an } n\times n \text{ doubly stochastic matrix} \right\}.$$
For symmetric doubly stochastic matrices this minimum is no less than $1/2$. However, for doubly stochastic matrices $A$ that are not necessarily symmetric, there are known (but perhaps not widely known) examples demonstrating that it is impossible to have a function $\Gamma$ entirely independent of $n$. These examples, discussed in Section 3.1, show that
$$\Gamma(n) \le \frac{1}{\log n}. \tag{1}$$
One reason that $\phi$ and $1-\operatorname{Re}\lambda_2$ are important is their connection to mixing time $\tau$ – the number of steps after which a random walk starting at any vertex converges to the uniform distribution over the vertices. For the case of symmetric doubly stochastic matrices – or in general reversible Markov chains – it is simple to show that $\tau \in O\!\left(\frac{\log n}{1-\lambda_2}\right)$, and by Cheeger's inequality (Theorem 1), it further gives $\tau \in O\!\left(\frac{\log n}{\phi^2}\right)$ where $n$ is the number of vertices. For the case of general doubly stochastic matrices – or in general not-necessarily-reversible Markov chains – it still holds that $\tau \in O\!\left(\frac{\log n}{\phi^2}\right)$ by a result of Mihail ([Mih89], and see [Fil91]). Depending on the situation, either $\operatorname{Re}\lambda_2$ or $\phi$ may be easier to estimate. Most often, one is interested in concisely-specified chains on exponential-size sets, i.e., the number of vertices is $n$ but the complexity parameter is $\log n$. In this case, either $\phi$ or $\lambda_2$ (for the reversible case) can be used to estimate $\tau$. However, if the matrix or chain is given explicitly, then one reason the Cheeger inequalities are useful is that it is simpler to estimate $\tau$ using $\lambda_2$, which is computable in P, while computing $\phi$ is NP-hard [GJS74].
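To place these quantities side by side, the following illustrative sketch (assuming NumPy and edge_expansion from the earlier sketch; the helper mixing_time and the threshold $1/4$ are arbitrary choices made here for illustration) estimates the mixing time of the lazy cycle walk empirically and prints it next to the two bound expressions, which are meaningful only up to the constants hidden in the $O(\cdot)$:

```python
import numpy as np
# edge_expansion() as defined in the earlier sketch

def mixing_time(A, eps=0.25):
    """Smallest t with max_i TV-distance(e_i A^t, uniform) <= eps; illustrative only."""
    n = A.shape[0]
    uniform = np.full(n, 1.0 / n)
    P, t = np.eye(n), 0
    while 0.5 * np.abs(P - uniform).sum(axis=1).max() > eps:
        P = P @ A
        t += 1
    return t

n = 12
M = np.zeros((n, n))                          # lazy walk on the n-cycle, as before
for i in range(n):
    M[i, i], M[i, (i + 1) % n], M[i, (i - 1) % n] = 0.5, 0.25, 0.25

lam2 = np.sort(np.linalg.eigvalsh(M))[-2]
print(mixing_time(M))                         # empirical mixing time
print(np.log(n) / (1 - lam2))                 # reversible spectral bound, up to constants
print(np.log(n) / edge_expansion(M) ** 2)     # Cheeger/Mihail-style bound, up to constants
```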
From the point of view of applications to Markov Chain Monte Carlo (MCMC) algorithms, a $\log n$ loss in the relation between $\phi$ and $1-\operatorname{Re}\lambda_2$, as implied by (1), is not a large factor. For MCMC algorithms, since "$n$" is the size of the state space being sampled or counted and the underlying complexity parameter is $\log n$, if it were true that $\Gamma(n) \ge \log^{-c} n$ for some constant $c$, then the loss in mixing time estimates would be polynomial in the complexity parameter, and thus the quantity $1-\operatorname{Re}\lambda_2$ could still be used to obtain a reasonable estimate of the mixing time even in the case of nonsymmetric doubly stochastic matrices.
However, the truth is much different. Our main result is that $\Gamma(n)$ does not scale as $\log^{-c} n$, but is exponentially smaller.

Theorem 4. (Bounds on $\Gamma$)
$$\frac{1}{35\cdot n} \;\le\; \Gamma(n) \;\le\; \frac{1}{\sqrt{n}}.$$
We give an explicit construction of doubly stochastic matrices $A_n$ for the upper bound on $\Gamma$. This construction of highly nonexpanding doubly stochastic matrices has, in addition, the surprising property that every nontrivial eigenvalue is 0. Thus, for non-reversible Markov chains, the connection between $\phi$ and $\operatorname{Re}\lambda_2$ breaks down substantially, in that the upper bound on mixing time obtained by lower bounding $\phi$ by the spectral gap $1-\operatorname{Re}\lambda_2$ can be exponentially weaker (when the complexity parameter is $\log n$) than the actual mixing time, whereas for reversible chains there is at worst a quadratic loss.
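As a quick sanity check of the lower bound in Theorem 4 (a numerical check, not a proof), the following sketch verifies $\phi(A) \ge \frac{1-\operatorname{Re}\lambda_2(A)}{35\cdot n}$ on small random doubly stochastic matrices, assuming NumPy and edge_expansion from the earlier sketch:

```python
import numpy as np
# edge_expansion() as defined in the earlier sketch

n = 6
for seed in range(5):
    rng = np.random.default_rng(seed)
    perms = [np.eye(n)[rng.permutation(n)] for _ in range(3)]
    A = sum(w * P for w, P in zip(rng.dirichlet(np.ones(3)), perms))
    re_lam2 = np.sort(np.linalg.eigvals(A).real)[-2]
    assert edge_expansion(A) >= (1 - re_lam2) / (35 * n) - 1e-9
print("lower bound of Theorem 4 holds on these examples")
```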
This theorem has a very natural extension to general nonnegative matrices, as we next describe.
1.2 General nonnegative matrices and a two-sided quantitative refinement of the Perron-Frobenius theorem
We extend our results to general nonnegative matrices $R$. By the Perron-Frobenius theorem (see Theorem 8), since $R$ is nonnegative, it has a nonnegative eigenvalue $r$ (called the PF eigenvalue) that is also largest in magnitude amongst all eigenvalues, and the corresponding left and right eigenvectors $u$ and $v$ have all nonnegative entries. Further, if $R$ is irreducible, i.e., the underlying weighted digraph on edges with positive weight is strongly connected (for every $(i,j)$ there is a $k$ such that $R^k(i,j) > 0$), then $r$ is a simple eigenvalue, and $u$ and $v$ have all positive entries. We henceforth assume that nonzero nonnegative $R$ has been scaled (to $\frac{1}{r}R$) so that $r = 1$. Thus we can again write the eigenvalues of $R$ as $1 = \lambda_1(R) \ge \operatorname{Re}\lambda_2(R) \ge \dots \ge \operatorname{Re}\lambda_n(R) \ge -1$. Scaling changes the spectrum but not the edge expansion (which we define next), so this canonical scaling is necessary before the two quantities can be compared.
Defining the edge expansion of $R$ is slightly delicate, so we explain the reasoning after Definition 11, and present only the final definition (see Definition 11) here. Consider $u$ and $v$ as left and right eigenvectors for eigenvalue 1 of $R$. If every such pair $(u, v)$ for $R$ has some $i$ such that $u(i) = 0$ or $v(i) = 0$, then define $\phi(R) = 0$. Otherwise, let $u$ and $v$ be some positive eigenvectors for eigenvalue 1 of $R$, normalized so that $\langle u, v\rangle = 1$, and define the edge expansion of $R$ as
$$\phi(R) = \min_{S\subseteq[n],\;\sum_{i\in S} u_i\cdot v_i \le \frac{1}{2}} \frac{\sum_{i\in S,\, j\notin S} R_{i,j}\cdot u_i\cdot v_j}{\sum_{i\in S} u_i\cdot v_i}.$$
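Mirroring the brute-force sketch for the doubly stochastic case, the following illustration (assuming NumPy, positive left and right eigenvectors $u$ and $v$ for PF eigenvalue 1 with $\langle u, v\rangle = 1$, and small $n$) computes $\phi(R)$ directly from this definition:

```python
import itertools
import numpy as np

def weighted_edge_expansion(R, u, v):
    """phi(R): minimize sum_{i in S, j not in S} R[i,j] u[i] v[j] over sum_{i in S} u[i] v[i],
    among sets S with sum_{i in S} u[i] v[i] <= 1/2.  Brute force, small n only."""
    n = R.shape[0]
    best = np.inf
    for size in range(1, n):
        for S in itertools.combinations(range(n), size):
            mass = sum(u[i] * v[i] for i in S)
            if mass > 0.5:
                continue
            cut = sum(R[i, j] * u[i] * v[j] for i in S for j in range(n) if j not in S)
            best = min(best, cut / mass)
    return best

# For doubly stochastic A, taking u = v = (1/sqrt(n)) * ones recovers the earlier phi(A).
n = 4
A = 0.5 * np.eye(n) + 0.5 * np.roll(np.eye(n), 1, axis=1)   # lazy directed cycle
w = np.full(n, 1.0 / np.sqrt(n))
print(weighted_edge_expansion(A, w, w))                     # 0.25, matching the earlier sketch
```

Indeed, for a doubly stochastic $A$ with $u = v = \frac{1}{\sqrt{n}}\mathbf{1}$ we have $u_i v_i = 1/n$, so the constraint $\sum_{i\in S} u_i v_i \le 1/2$ becomes $|S| \le n/2$ and the ratio becomes $\frac{1}{|S|}\sum_{i\in S,\, j\notin S} A_{i,j}$, recovering the definition from Section 1.1.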
Given this definition of $\phi(R)$, we show the following.

Theorem 5. Let $R$ be a nonnegative matrix with PF eigenvalue 1, and let $u$ and $v$ be any corresponding left and right eigenvectors. Let $\kappa = \min_i u_i\cdot v_i$, and if $\kappa > 0$, let $u$ and $v$ be normalized so that $\langle u, v\rangle = 1$. Then
$$\frac{1}{30}\cdot\frac{1-\operatorname{Re}\lambda_2(R)}{n+\ln