Diverse engineered heme proteins enable stereodivergent cyclopropanation of unactivated alkenes
Anders M. Knight¹, S. B. Jennifer Kan², Russell D. Lewis¹, Oliver F. Brandenberg², Kai Chen², Frances H. Arnold¹,²*
¹Division of Biology and Bioengineering and ²Division of Chemistry and Chemical Engineering, California Institute of Technology, 1200 East California Boulevard, MC 210-41, Pasadena, CA 91125, United States
*Corresponding author: frances@cheme.caltech.edu
Key Words: Stereodivergence, Biocatalysis, Carbene Transfer, Heme Protein, Cyclopropanation, Directed Evolution
Abstract
Stereodivergent syntheses leading to the different stereoisomers of a product are useful in the discovery and testing of drugs and agrochemicals. A longstanding challenge in catalysis, developing sets of stereodivergent catalysts, is often solved for enzymes by screening Nature’s diversity for biocatalysts with complementary stereoselectivities. Here, Nature’s protein diversity has been leveraged to develop stereodivergent catalysts for a reaction not known in biology, cyclopropanation via carbene transfer. By screening diverse native and engineered heme proteins, we identified globins and serine-ligated cytochromes P450 with promiscuous activity for cyclopropanation of unactivated alkene substrates. Their activities and stereoselectivities were enhanced by directed evolution: 1-3 rounds of site-saturation mutagenesis and screening generated enzymes that catalyze the stereodivergent cyclopropanation of unactivated and electron-deficient alkenes, forming each of the four cyclopropane stereoisomers with up to 5,400 total turnovers and 98% enantiomeric excess. These fully genetically encoded biocatalysts function in whole E. coli cells under mild, aqueous conditions and provide the first example of enantioselective, intermolecular iron-catalyzed cyclopropanation of unactivated alkenes via carbene transfer.
Introduction
The biological world is a marvelous ensemble of chiral molecules. From the amino acid and nucleoside building blocks that form proteins and DNA to intricate natural products produced by living organisms, chirality dictates how molecules interact with living systems.¹ Many modern medicines draw inspiration from natural products.² Because alternate stereoisomers can have very different biological effects,³ characterization of novel bioactive compounds during drug candidate screening should include testing each stereoisomer.⁴ Developing stereodivergent syntheses, where a set of complementary catalysts can generate every possible stereoisomer of the product, is therefore useful and is actively sought after in catalysis.⁵
Enzymes are green, sustainable options for stereoselective catalysis, and stereo-complementary enzymes can often be found in nature’s diversity: lipases⁶, ketoreductases⁷, and transaminases⁸ chosen using genome mining⁹, for example, have all afforded products with different stereoselectivities. We show here that natural protein diversity can be leveraged in a similar fashion to achieve stereodivergence for a new, non-natural enzyme-catalyzed reaction, cyclopropanation of unactivated alkenes.
Previous work from this group and others has shown that iron-porphyrin (heme) proteins can be engineered to catalyze the cyclopropanation of styrenyl alkenes with ethyl diazoacetate (EDA, 1).¹⁰⁻¹³ This new-to-nature carbene transfer reaction has been applied in the synthesis of key pharmaceutical intermediates such as levomilnacipran¹⁴, ticagrelor¹⁵⁻¹⁶, and tasimelteon.¹⁶ Thus far, however, alkene cyclopropanation by heme proteins with the native iron cofactor has been limited to styrenyl and other activated alkenes.
Unactivated, aliphatic alkenes are attractive feedstocks for chemical synthesis, but their transformation to higher-value chiral products is challenging due to their inert nature, high degree of conformational flexibility, and limited steric and electronic bias to guide stereocontrol.¹⁷ State-of-the-art methods for unactivated alkene cyclopropanation often rely on noble metals¹⁸⁻²⁰ (Supplemental Table S1); no iron-based catalyst for the enantioselective intermolecular cyclopropanation of unactivated alkenes has been reported. However, directed evolution of heme proteins has previously enabled biocatalysis to access reactions performed with noble-metal catalysts, such as carbon–silicon bond formation²¹ and intermolecular C–H amination²². We therefore set out to create a genetically encoded catalyst with the native heme cofactor that could cyclopropanate unactivated alkenes. Furthermore, we wished to take advantage of the natural diversity of heme proteins to identify suitable starting points for engineering stereodivergent biocatalysts.
Results and Discussion
We collected a panel of eleven heme proteins from thermophilic and hyperthermophilic bacteria and archaea to test for unactivated alkene cyclopropanation (Supplemental Table S2). Thermostable proteins can better withstand the destabilizing effects of mutations and are therefore more ‘evolvable’.²³ They are also often easier to work with and better tolerate polar organic solvents used to solubilize substrates. Wild-type Aeropyrum pernix protoglobin (ApePgb WT, UniProt ID: Q9YFF4) and wild-type Rhodothermus marinus nitric oxide dioxygenase (RmaNOD WT, UniProt ID: D0MGT2) were found to have low but measurable cyclopropanation activity on 1-octene 2a, catalyzing the reaction with 18 and 27 total turnovers per enzyme active site (TTN). Notably, ApePgb WT and RmaNOD WT displayed complementary diastereoselectivity, preferentially producing cis (1R,2S)-3a and trans (1S,2S)-3a, respectively.
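As a rough illustration of how a TTN value relates to measured quantities, the sketch below divides the amount of product formed by the amount of enzyme present. The function name and the example concentrations are hypothetical and chosen only for illustration; they are not measurements from this work.

```python
def total_turnover_number(product_conc_uM: float, enzyme_conc_uM: float) -> float:
    """Return TTN: moles of product formed per mole of enzyme active site.

    Both concentrations must be expressed in the same units (here micromolar).
    """
    if enzyme_conc_uM <= 0:
        raise ValueError("enzyme concentration must be positive")
    return product_conc_uM / enzyme_conc_uM


# Hypothetical example values (assumed, not from the paper):
# 54 uM cyclopropane product formed by 2 uM heme protein.
ttn = total_turnover_number(product_conc_uM=54.0, enzyme_conc_uM=2.0)
print(f"TTN = {ttn:.0f}")
```

In practice, the product concentration would come from a calibrated GC or HPLC measurement and the enzyme concentration from a heme protein quantification assay.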
In addition to searching natural heme protein diversity for this novel reactivity, we also investigated heme proteins obtained in previous directed evolution studies. A panel of 36 variants of a Bacillus megaterium cytochrome P450 (BM3) engineered for other non-natural carbene and nitrene transfer reactions¹³ was tested for the ability to cyclopropanate 1-octene 2a and 4-phenyl-1-butene 2b. Substrate 2b was chosen for library screening because its UV-visible phenyl group enables screening by HPLC-UV. BM3 variant P411-CIS L437F T438Q L75Y L181I (P411-UA, sequence in Supporting Information) showed significant activity and selectivity for production of cis (1S,2R)-3a, the third of the four possible isomers. This variant of a serine-ligated “P411” (P411-CIS²⁴) had been engineered for cyclopropanation reactivity on electron-rich, non-styrenyl alkenes such as N-vinyl amides (Brandenberg et al., unpublished results).
Site-saturation mutagenesis libraries were generated and screened to increase the activities and selectivities of the different enzymes. Because crystal structures of ApePgb and RmaNOD have not been reported, homology models were built to help us identify residues within the putative distal heme pocket, where carbenoid formation and substrate binding are predicted to take place (Supplemental Figure S2). P411-UA residues were selected based on the crystal structure of its P411-CIS predecessor (PDB ID: 4H23). Individual site-saturation libraries were screened for increased activity and diastereoselectivity using 4-phenyl-1-butene (2b) and EDA 1 as substrates. Variants with enhanced diastereoselectivity in the production of 3b were regrown on a larger scale, and their activities were tested in cyclopropanation of 4-phenyl-1-butene (2b) and 1-octene (2a) with EDA. Enzyme variants with the greatest overall selectivity enhancements for 3a and 3b were used as parents in the next rounds of site-saturation mutagenesis and screening.
A single mutation (Q52V) gave RmaNOD near-perfect stereoselectivity for producing trans (1S,2S)-3a. Three mutations (W59A Y60G F145W, or “AGW”) gave ApePgb the ability to make cis (1R,2S)-3a with 89:11 diastereomeric ratio (d.r.) and 99% enantiomeric excess (e.e.). During screening to increase P411-UA’s cis diastereoselectivity, a single mutation, V87F, was found to completely invert the diastereoselectivity from 89:11 cis (1S,2R)-3a to 4:96 trans (1R,2R)-3a, affording the fourth and final stereoisomer we needed. Residue 87 is known to modulate the stereoselectivity of P450 BM3 for oxygenation of various substrates.²⁵
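The d.r. and e.e. values quoted above follow the standard definitions, which can be sketched in a few lines. This is a minimal illustration assuming that integrated chromatographic peak areas are proportional to the amount of each stereoisomer; the function names and the peak areas are hypothetical and were chosen only so that the output reproduces an 89:11 d.r. and a 99% e.e.

```python
from typing import Tuple


def diastereomeric_ratio(cis_area: float, trans_area: float) -> Tuple[float, float]:
    """Return the cis:trans ratio normalized so the two parts sum to 100."""
    total = cis_area + trans_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * cis_area / total, 100.0 * trans_area / total


def enantiomeric_excess(major_area: float, minor_area: float) -> float:
    """Return e.e. (percent) for the two enantiomers of one diastereomer."""
    total = major_area + minor_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * (major_area - minor_area) / total


# Hypothetical peak areas (arbitrary units), assumed for illustration only.
cis_major, cis_minor = 8855.5, 44.5   # the two cis enantiomers
trans_total = 1100.0                  # both trans enantiomers combined

cis_part, trans_part = diastereomeric_ratio(cis_major + cis_minor, trans_total)
ee = enantiomeric_excess(cis_major, cis_minor)
print(f"d.r. = {cis_part:.0f}:{trans_part:.0f} cis:trans, e.e. = {ee:.0f}%")
```

Note that e.e. is computed within a single diastereomer, so a catalyst can show a modest d.r. while still giving a high e.e. for its major diastereomer.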
With initial screening of 11 new and 36 previously engineered proteins, followed by just one to three rounds of site-saturation mutagenesis, we discovered four protein variants capable of cyclopropanating unactivated alkenes (RmaNOD Q52V, ApePgb W59A Y60G F145W (= ApePgb AGW), P411-UA-V87C, and P411-UA-V87F), each of which produced a distinct stereoisomer of the desired product 3a with 89:11 to <1:99 d.r. and 96 to >99% e.e. (Figure 1). The enzyme activities against unactivated alkenes are comparable to the state-of-the-art catalysts, with 100-490 TTN for 3a and as high as 2,400 TTN for 3b, the substrate against which the enzymes were screened.
The system is straightforward and easy to use: the protein-expressing bacterial cells need only be resuspended to the desired concentration and the alkene and diazo ester added directly under an anaerobic atmosphere. When the reaction is complete, the product is extracted into organic solvents for analysis or purification. While these enzymes were optimized for use in whole cells, they also function to some degree in lysates and as purified proteins (Supporting Information).
The four engineered biocatalysts were tested on a range of alkenes. Their activities and selectivities were high on unbranched aliphatic alkenes similar to those for which they were engineered, but their substrate scope extends to sterically hindered and electron-deficient alkenes as well (Figure 2). Though activity and stereoselectivity differed on different substrates, each catalyst accepted most of the substrates tested. It is likely that activity on specific substrates can be optimized further, if desired, as has been shown in many other directed evolution studies.²⁶,²⁷
The small-molecule catalyzed enantioselective preparation of cyclopropyl esters from electron-deficient alkenes has previously been limited to making the trans-cyclopropanes,²⁸ whereas strategies to directly access 1-keto,2-ester or 1,2-diester cis-cyclopropanes (or their corresponding carboxylates) via enantioselective cyclopropanation are unknown. The biocatalysts, in contrast, enable access to the cis-1-keto,2-ester and cis-1,2-diester products in a single, intermolecular step using an E. coli-based platform ((1R,2S)- and (1S,2R)-3c, (1R,2S)-3g, Figure 2). Some of these products are precursors to valuable compounds: cyclopropyl esters of unbranched, aliphatic alkenes are used in fragrances, for example, including the essential odorants in frankincense.²⁹ Notably, the enzymes catalyze the reaction on 2-vinylpyridine (2h), which is a difficult substrate for many catalysts due to pyridine’s propensity to coordinate to and inhibit metal centers. This cyclopropanation product is a precursor for an orphan GPR88 agonist.³⁰
Enzymes are chemoselective and can generate desired products without additional steps to protect and deprotect other reactive functional groups on the same molecule. As shown in Figure 3, the enzymes described here, for example, can selectively cyclopropanate terminal alkenes in the presence of alcohol and carboxylic acid functional groups, which often undergo competitive O–H insertion reactions with small-molecule carbene transfer catalysts like rhodium acetate dimer.³¹ ApePgb AGW performed particularly well with unprotected 7-octen-1-ol (2i) and 7-octen-1-oic acid (2j), yielding products (1R,2S)-3i and (1R,2S)-3j at 77% and 64% isolated yield, respectively, in preparative-scale reactions.
Some functional groups cannot be protected easily, and chemo- and regioselectivity is even more important in these cases. In the cases of 1,3-(E)-pentadiene (2k) and 1,3-(Z)-pentadiene (2l), all four engineered proteins cyclopropanate the terminal alkene with perfect regioselectivity, likely due to the steric constraints in each enzyme’s active site that direct catalysis to the more accessible double bond. The diastereoselectivity varied for 3k and 3l, though the enantioselectivity for the major isomer remained high. As the electronic properties of 2k and 2l are similar, the difference in stereoselectivity likely reflects steric constraints of the enzyme active sites.
Citing the need for a greater reactivity of the metal center to cyclopropanate unactivated alkenes, Hartwig, Clark, and coworkers showed that heme proteins could bind an artificial iridium cofactor in place of iron heme and perform carbene transfer chemistry.¹⁹ They showed that a protein’s active site can confer selectivity to noble-metal, small-molecule catalysts that can already catalyze the reaction.¹⁹,²⁰,³² Use of an artificial iridium cofactor (Ir(Me)PIX) required the lysis, purification, and in vitro metalation of the apoprotein with the Ir(Me)PIX, all of which add time and cost to catalyst preparation. Though it may be possible to perform these metalations in vivo,³³ the synthetic, noble-metal catalyst is far more expensive than the native heme cofactor, which is manufactured by the cell and loaded into the catalyst during protein expression in vivo. The use of iridium is also not ideal due to the negative impact mining and refining precious metals has on the environment.³⁴
A noble-metal catalyst is not necessary, however, for these reactions. Two decades ago, Woo and coworkers showed that iron meso-tetrakis(pentafluorophenyl)porphyrin chloride (Fe(PFP)Cl) can catalyze the reaction of 2-ethyl-1-butene and EDA with 390 TTN; they reported the formation of cyclopropane products using 1-decene as well.³⁵ In fact, we observed that iron heme in aqueous buffer, with no protein, can catalyze the formation of 3a, albeit with only 0.4 TTN. This basal activity is greatly enhanced and stereoselectivity is enforced by the protein environment, allowing the heme proteins described here to cyclopropanate a range of alkenes from electron-rich conjugated dienes to electron-deficient vinyl ketones and acrylates with high diastereo- and enantioselectivity.
The primary factor in determining activity appears to be the binding of the alkene in a productive configuration: the heme’s local protein environment can be molded to enhance activity and selectivity by optimizing the substrate binding modes. Different local heme environments can be accessed by screening natural and engineered protein diversity. Directed evolution then fine-tunes these features.
Metalloporphyrin catalysts have been used in synthetic chemistry for decades, but nature has used them for millions of years. Present in all forms of life on Earth, heme-binding proteins have diverse functions as well as promiscuous activities for which they were never selected, such as the ability to form reactive carbene intermediates. We have taken advantage of this natural diversity to find catalysts for reactions not known to be catalyzed in biology, but that are synthetically useful and are driven by a synthetic carbene precursor (EDA).
While biocatalysts often possess very high selectivity, this selectivity can be synthetically limiting. A single enzyme may make only a single isomer, but access to other isomers may be equally important. Natural diversity can be leveraged effectively for this challenge. A combination of natural diversity and directed evolution let us realize the stereodivergent cyclopropanation of unactivated and electron-deficient alkenes in mild, aqueous conditions with a fully genetically encoded heme protein expressed in bacteria. This set of biocatalysts can serve as starting points for green, sustainable synthesis of valuable cyclopropanated products.
Supporting Information
The materials and experimental methods, detailed protein engineering strategies for each variant, and compound characterization are available in the Supporting Information.
Acknowledgments
This work was supported by the National Science Foundation Division of Molecular and Cellular Biosciences (grant MCB-1513007) and the Office of Chemical, Bioengineering, Environmental and Transport Systems SusChEM Initiative (grant CBET-1403077). The authors thank Dr. Nathan Dalleska, Aurapat Ngamnithiporn, and Dr. Scott C. Virgil for analytical chiral GC support, and Dr. Stephan C. Hammer and Dr. Xiongyi Huang for helpful discussions and critical reading of the manuscript. A.M.K. gratefully acknowledges support from Caltech’s Center for Environmental Microbial Interactions and the NSF Graduate Research Fellowship (Grant No. 1144469). R.D.L. is supported by an NIH National Research Service Award training grant (5 T32 GM07616). O.F.B. acknowledges support from the Deutsche Forschungsgemeinschaft (Grant No. BR 5238/1-1) and the Swiss National Science Foundation (Grant No. P300PA-171225). A provisional patent application has been filed through the California Institute of Technology based on the results presented here.
Table of contents graphic.
Figure 1. Stereoselective enzymatic cyclopropanation of the aliphatic alkene 1-octene 2a and EDA 1 to obtain each of four stereoisomers of cyclopropane product 3a with diastereoselectivities from 89:11 to <99:1 d.r. and enantioselectivities from 96% to >99% e.e. Reaction conditions: whole E. coli cells in M9-N buffer, 25 mM glucose, 10 mM 1-octene 2a, direct addition of 20 mM EDA 1 under anaerobic conditions, 5% ethanol cosolvent. Catalysts used: rhodium acetate dimer (Rh₂(OAc)₄) to form the racemic authentic standard, two variants of the engineered, serine-ligated cytochrome P450-BM3 (P411-UA V87C and P411-UA V87F), Aeropyrum pernix protoglobin W59A Y60G F145W (ApePgb AGW), and Rhodothermus marinus nitric oxide dioxygenase Q52V (RmaNOD Q52V). Protein sequences are available in the Supporting Information.