of 10
Exploring
PROTAC
Cooperativity
with
Coarse-Grained
Alchemical
Methods
Huanghao
Mai,
*
Matthew
H. Zimmer,
and Thomas
F.MillerIII
Cite
This:
J. Phys.
Chem.
B
2023,
127,
446−455
Read
Online
ACCESS
Metrics
& More
Article
Recommendations
*
Supporting
Information
ABSTRACT:
Proteolysis
targeting
chimera
(PROTAC)
is a novel
drug
modality
that facilitates
the degradation
of a target
protein
by inducing
proximity
with
an E3 ligase.
In this work,
we present
a new
computational
framework
to
model
the cooperativity
between
PROTAC
E3
binding
and PROTAC
target
binding
principally
through
protein
protein
interactions
(PPIs)
induced
by the
PROTAC.
Due to the scarcity
and low resolution
of experimental
measurements,
the physical
and
chemical
drivers
of these
non-native
PPIs
remain
to be
elucidated.
We develop
a coarse-grained
(CG)
approach
to model
interactions
in
the target
PROTAC
E3
complexes,
which
enables
converged
thermodynamic
estimations
using
alchemical
free
energy
calculation
methods
despite
an
unconventional
scale
of perturbations.
With
minimal
parametrization,
we
successfully
capture
fundamental
principles
of cooperativity,
including
the
optimality
of intermediate
PROTAC
linker
lengths
that
originates
from
configurational
entropy.
We qualitatively
characterize
the dependency
of cooperativity
on PROTAC
linker
lengths
and protein
charges
and shapes.
Minimal
inclusion
of sequence-
and conformation-specific
features
in our current
force
field,
however,
limits
quantitative
modeling
to reproduce
experimental
measurements,
but further
development
of the CG model
may allow
for efficient
computational
screening
to optimize
PROTAC
cooperativity.
INTRODUCTION
Proteolysis
targeting
chimera
(PROTAC)
has emerged
as a
promising
drug
modality
that
elicits
protein
degradation
by
hijacking
the ubiquitin
proteasome
system
(UPS),
a major
regulatory
component
of cells.
In the UPS
pathway,
E3 ligases
transfer
ubiquitins
onto
aberrant
proteins
to mark
them
for
degradation
by proteasomes.
A PROTAC
molecule
exploits
this pathway
with
two binding
moieties
that tether
the target
protein
and an E3 ligase
together.
The tethered
target
protein
thus
becomes
a neo-substrate
of the
E3 ligase
and
is
subsequently
ubiquitinated
for proteasomal
degradation.
PROTACs
require
a lower
dose
than
conventional
small-
molecule
inhibitors
because
of their
catalytic
nature
and they
have
the potential
to target
the undruggable
proteome.
1,2
Since
the first
proof-of-concept
in 2001,
3
the number
of proteins
successfully
degraded
by PROTACs
has grown
rapidly,
and
examples
of such
proteins
include
kinases
and gene
regulators
that are implicated
in cancer.
As of 2021,
at least
13 PROTACs
are in or approaching
clinical
trials.
4
Despite
increasing
applications,
there
is a lack of guidance
on designing
PROTACs
due to the unique
mode
of action.
5
7
In particular,
a critical
step
in the degradation
process
is the
formation
of the ternary
complex
of target
PROTAC
E3.
The
ternary
complex
involves
molecular
interactions
beyond
the binary
bindings
between
the two warheads
of a PROTAC
and the two proteins.
The selectivity
8
10
and stability
11
14
of
the ternary
complex
can both
be improved
through
favorable
protein
protein
interactions
(PPIs)
between
the target
protein
and the E3 ligase.
For certain
targets,
the degradation
outcome
can be very different
depending
on whether
cereblon
(CRBN)
or von Hippel
Lindau
(VHL),
the two most
heavily
used
E3
ligases,
more
efficiently
and
selectively
form
a productive
complex
with
the target.
11,15
17
As more
warheads
for E3
ligases
are designed,
18
21
choosing
which
of the more
than
600
E3 ligases
in humans
22
optimally
interact
with
the target
protein
will become
important.
23,24
While
PPIs
depend
on the
sequences
and the structures
of the proteins,
PROTACs
can
also modulate
the PPIs
by restricting
the distance
and relative
orientation
between
the target
and the E3 ligase,
effectively
changing
the entropic
component
of PPIs.
Because
of this three-body
interplay
and the transient
nature
of the ternary
complex,
a complete
characterization
of the PPIs
as a function
of the PROTAC,
the target
protein,
and the E3
ligase
is intractable.
A few proteomics
studies
16,17,25
on kinase
degradation
have
used
PROTACs
with
promiscuous
warheads
such
that the PROTAC-induced
PPIs
differentially
affect
the
Received:
August
12, 2022
Revised:
December
18, 2022
Published:
January
6,
2023
Article
pubs.acs.org/JPCB
© 2023
The Authors.
Published
by
American
Chemical
Society
446
https://doi.org/10.1021/acs.jpcb.2c05795
J. Phys.
Chem.
B
2023,
127,
446
455
Downloaded via CALIFORNIA INST OF TECHNOLOGY on March 17, 2023 at 15:19:06 (UTC).
See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.
degradation
outcome
of hundreds
of proteins.
These
studies
reported
the fold
change
of protein
abundance
due
to
PROTAC
treatment,
but analysis
can
be complicated
by
secondary
interactions
24
and numerous
other
factors
such
as
the permeability
of the PROTAC,
half-lives
of the target
proteins,
cellular
localization,
and
reactions
downstream
of
ternary
complex
formation.
26
Other
studies
8,9,27
30
have
focused
on specific
target-E3
pairs
and examined
the effect
of changing
PROTAC
properties
such
as the linker
length.
They
measured
the difference
in the strength
of PROTACs
binding
to the target
or the E3 ligase
due to the presence
of the
other
protein.
This
difference,
termed
binding
cooperativity,
reflects
the strength
of PROTAC-mediated
PPIs.
However,
few
generalizable
patterns
have
emerged
and
systematic
exper-
imental
characterizations
remain
scarce.
Computational
modeling
based
on docking
or atomistic
molecular
dynamics
(MD)
has complemented
experimental
work
9,29
and displayed
promising
future
prospects,
but there
are several
limitations
to current
methodologies.
Although
standard
docking
protocols
do not
handle
three-body
problems,
several
workflows
have
been
adapted
ad hoc for
PROTAC.
31
35
Docking
studies
rank
ternary
complex
conformations
by scoring
functions
biased
for naturally
evolved
PPIs
and benchmark
against
the few crystal
structures
of PROTAC-induced
ternary
complexes.
36
38
The results
can
be inaccurate
as PROTAC-induced
PPIs
are non-native
and
exhibit
plasticity.
9,39
In contrast,
atomistic
MD
is physically
grounded
to capture
non-native
PPIs.
However,
the size of the
ternary
complex
modeled
at an atomistic
resolution
signifi-
cantly
limits
the time
scale
of simulations,
such
that
naively
simulating
PPIs
can
be prohibitively
slow.
Sophisticated
enhanced
sampling
techniques
and distributed
computing
are
needed
to sample
an ensemble
of low-energy
conformations
that
are consistent
with
experimental
data.
40
Due
to the
difficulties
in modeling
the ternary
complex,
direct
calculation
of the binding
cooperativities
was not attempted
until
two
recent
studies
41,42
that explored
the molecular
mechanics
with
the generalized
Born
and surface
area
continuum
solvation
(MM/GBSA).
Here,
we seek an orthogonal
approach
that combines
coarse-
grained
MD
(CGMD)
and alchemical
free energy
calculation
methods
to study
PROTAC
cooperativities.
On the spectrum
of computational
tools,
docking
and
atomistic
MD
are
positioned
at the empirical
and first-principle
ends,
respec-
tively,
and finding
a compromise
in the middle
of this spectrum
is a promising
direction.
Compared
to atomistic
modeling,
coarse-graining
reduces
the effective
size of the model
and
smoothens
the energy
surface,
enabling
simulations
at a much
longer
time
scale
necessary
for the
PROTAC-mediated
complexes.
While
CGMD
may
struggle
to recapitulate
the
molecular
basis
of lock-and-key
bindings,
such
strong
and
specific
interactions
are less imperative
in non-native
PPIs
induced
by PROTACs.
Moreover,
PROTAC
binding
reduces
the ways
proteins
can interact
with
each
other,
differentiating
and simplifying
the problem
studied
here
from
the formidable
task of modeling
general
protein
protein
binding.
In docking,
such
constraints
are challenging
to incorporate
into the scoring
functions
and are approximated
through
separate
steps
to filter
compatible
PPI
poses
and
PROTAC
geometries.
While
CGMD
excludes
many
degrees
of freedom
from
the PROTAC,
proteins,
and
solvent
entropy,
this effect
of configurational
entropy
on PPIs
from
PROTAC
mediation
can be directly
captured.
Finally,
we calculate
binding
energies
using
alchemical
methods,
which
circumvents
the computational
challenge
of directly
sampling
binding
and unbinding
events
between
the PROTAC
and
proteins.
We demonstrate
the
computational
amenity
of an unconventional
application
of
alchemical
methods
motivated
by the PROTAC
systems,
and
take advantage
of the physical
interpretability
of the CGMD
+
alchemical
approach
to explore
the principles
of PROTAC
binding
cooperativity.
METHODS
CGMD
Setup
of
PROTAC
Protein
Complexes.
The
binary
and ternary
PROTAC-protein
complexes
are coarse-
grained
at two
resolutions
to efficiently
sample
complex
conformational
changes
while
retaining
sufficient
details
for
structural
insight.
Specifically,
a major
focus
of this work
is to
Figure
1.
Schematic
of the simulation
setup
for PROTAC-mediated
complexes.
(a) Target
PROTAC
E3
ternary
complex
is initialized
with
a
fully
extended
PROTAC
as drawn.
The
proteins
are coarse-grained
at the resolution
of three
amino
acids
per bead,
approximately
0.8 nm.
PROTAC
warhead
beads
are represented
by beads
of the same
size, whereas
the linker
is coarse-grained
at a higher
resolution.(b)
PPIs
affect
how
cooperative
target
PROTAC
and PROTAC
E3
bindings
are and are reflected
in the free energy
difference
between
PROTAC
E3
binding
with
and without
the target
G
G
(
)
EP
EP
binary
ter
nary
. This
free energy
difference,
ΔΔ
G
,
can also be obtained
by comparing
target
PROTAC
binding
with
and without
the E3
G
G
(
)
TP
TP
binary
ter
nary
as shown
by the thermodynamic
cycle.
Under
the alchemical
setup,
ΔΔ
G
can be alternatively
obtained
by the free energy
difference
between
the red vertical
processes,
which
represent
coupling
the target
(
_
_
G
G
T
T
couple
binary
couple
ternary
in (c))
or the E3 (
_
_
G
G
E
E
couple
binary
couple
ternary
in (d))
to the PROTAC
and the PROTAC
prebound
to the other
protein.
In the initial
states
in (c) and (d), the dotted
lines
represent
the target
or the E3 whose
interactions
with
the rest of the system
are turned
off except
for the harmonic
constraints
(black
lines)
to
the PROTAC
warhead.
The
Journal
of
Physical
Chemistry
B
pubs.acs.org/JPCB
Article
https://doi.org/10.1021/acs.jpcb.2c05795
J. Phys.
Chem.
B
2023,
127,
446
455
447
characterize
the entropic
effect
of the length
of PROTACs
on
the strength
of induced
PPIs,
necessitating
modeling
the
PROTAC
linker
at a higher
resolution
than
the rest of the
system.
Proteins
are coarse-grained
by mapping
every
three
amino
acids
onto
a large
bead
of
σ
= 0.8 nm diameter,
which
is
approximately
the Kuhn
length
of polypeptides.
43
46
Binding
moieties
at the two ends
of a PROTAC
are each
represented
by a large
bead,
whereas
the linker
region
is modeled
as a
Gaussian
chain
at the resolution
of a PEG
unit
(
σ
s
= 0.35
nm)
47
or three
heavy
atoms.
Several
experimental
works
that
used
flexible
linear
linkers
motivate
our modeling
approach
for
the PROTAC
linker,
including
Chan
et al.
28
where
an alkane
linker
was varied
in step sizes
of our linker
beads
and Zorba
et
al.
29
where
a PEG
linker
is modified
at smaller
length
steps
such
that linker
lengths
ranging
from
1 to 6
σ
s
in our modeling
correspond
to the PROTAC
(1), (3), (5), (6), (8), and (10).
A minimal
force
field
is used
to describe
the internal
and
interactive
forces,
and a full description
can be found
in the
Supporting
Information
(Section
S1).
The
three-dimensional
structure
of a protein
is maintained
by a bottom-up
fitted
elastic
network
model
(Figure
S2),
which
allows
conforma-
tional
flexibility.
48,49
Protein
beads
can
have
additional
properties
to describe
PPIs
beyond
volume
exclusion
(Figure
S1).
When
modeling
electrostatic
interactions,
for example,
a
protein
bead
has the net charges
of the triplet
of residues
that it
is coarse-grained
from.
PROTACs
are modeled
as Gaussian
polymers
with
volume
exclusion,
and the warhead
beads
are
attached
to the binding
pockets
of proteins
through
harmonic
springs.
Modeling
PROTAC
interactions
beyond
warhead
binding
is out of the scope
of this work.
Thus,
under
current
setup,
PROTAC
beads
have
0 charge
and no affinity
to any
other
beads.
The orientation
between
the E3 ligase
and the target
protein
is initialized
such
that the two binding
pockets
face each
other,
with
a fully
extended
PROTAC
tethering
in between
(Figure
1a). The binding
moiety
beads
of PROTAC
are placed
at the
center
of each
binding
pocket,
which
is defined
by the residues
within
4 or 5 Å from
the PROTAC
warhead
in experimental
structures.
Thus,
setting
up the initial
coordinates
of a ternary
complex
requires
the following
inputs:
structures
of each
protein,
residues
at the two PROTAC
binding
pockets,
and the
length
of the PROTAC
linker.
To calculate
the difference
in
PROTAC
binding
energies
due to PPIs,
simulations
of binary
target/E3-PROTAC
complexes
are also
needed.
Binary
complexes
are prepared
by removing
a protein
from
the
initialized
ternary
complex.
Thermodynamic
Framework
of
Alchemical
Perturba-
tion.
The binding
cooperativity
of a PROTAC
is mathemati-
cally
defined
as exp(
ΔΔ
G
/
RT
),
where
R
is the gas constant,
T
here
refers
to the temperature
in the context
of an energetic
scale
and
refers
to the target
protein
elsewhere,
ΔΔ
G
=
Δ
G
TP
binary
−Δ
G
TP
ternary
, and
Δ
G
TP
ternary
and
Δ
G
TP
binary
are the free
energies
of the PROTAC
(
P
) binding
to the target
protein
(
T
)
with
and without
the presence
of the E3 ligase
(
E
). Because
of
the thermodynamic
cycle
(Figure
1b),
the same
ΔΔ
G
can be
obtained
from
Δ
G
EP
binary
− Δ
G
EP
ternary
. Favorable
PPIs
stabilize
the
ternary
complex
and
facilitate
PROTAC
binding
to both
proteins.
Thus,
they
lower
Δ
G
TP
ternary
and
Δ
G
EP
ternary
, which
leads
to larger
ΔΔ
G
and more
positive
cooperativity.
Alchemical
free
energy
calculation
methods
exploit
alter-
native
thermodynamic
cycles
to obtain
ΔΔ
G
without
simulating
binding
and unbinding
processes.
For simplicity,
in this work,
all
ΔΔ
G
s
are calculated
using
the cycle
in Figure
1c, which
we describe
in detail
here,
but one should
arrive
at
the same
result
using
the mirroring
cycle
in Figure
1d. By the
definition
of a thermodynamic
cycle,
we
have
=
_
_
G
G
G
G
EP
EP
T
T
binary
ter
nary
couple
binary
couple
ternary
,
where
_
G
T
couple
binary
and
_
G
T
couple
ternary
represent
the free
energies
of
coupling
T
to
P
and to the target
PROTAC
bound
complex
EP
. In the initial
states
of both
coupling
processes
(vertical
processes
in red in Figure
1c),
T
is bound
to
P
but is a dummy
molecule
at an ideal
state.
Specifically,
multiple
harmonic
springs
connect
the binding
pocket
beads
in
T
to the warhead
bead
of
P
, and
T
itself
is an elastic
network
model
consisting
of
only
harmonic
springs.
All other
interactions
between
T
and
the rest
of the system,
whether
P
or
EP
, are turned
off.
Coupling
T
simply
means
turning
on these
intermolecular
interactions,
while
the
binding
pocket
springs
remain
unperturbed.
Attaching
a dummy
T
instead
of having
T
dissociated
results
in a systematic
error
in the horizontal
free
energies
of
EP
binding
(
G
EP
binary
and
G
EP
ternary
in Figure
1c) such
that
the
ΔΔ
G
is unaffected.
This
is because
the attachment
of dummy
T
occurs
via only
one bead
on
P
, except
which
there
are no
other
force
field
terms
involving
both
physically
present
beads
and dummy
beads.
In the configurational
partition
function,
energy
terms
describing
the geometries
of the physically
present
part of the system
can therefore
be separated
from
the
term
involving
the dummy
T
and the attachment
junction.
The
latter
term
is the same
whether
the physically
present
part is
P
or
EP
, such
that
the unphysical
contribution
from
attaching
dummy
T
cancels
out in
ΔΔ
G
.
Free
Energy
Calculations.
Alchemically
changing
a
protein
from
a dummy
state
to full coupling
involves
turning
on the interaction
potentials
between
the protein
and the rest
of the system
in the force
field.
The interactions
are turned
on
in stages
by sequentially
scaling
each
kind
of interaction
potential
using
a coupling
parameter
λ
. Intramolecular
potentials
(e.g.,
the elastic
network
model
of each
protein)
and
intermolecular
potentials
not perturbed
at the current
stage
are unaffected
by the
λ
scaling.
For the electrostatic
potential,
the start
state
(no electrostatics)
and the end state
(full
electrostatics)
correspond
to
λ
elec
= 0 and 1 respectively.
Intermediate
states
are interpolated
such
that the potential
is
defined
as
=
+
=
_
U
U
U
U
(1
)
elec
no
ele
c
elec
elec
elec
elec
elec
.
For
numerical
stability,
the electrostatic
potential
is only
perturbed
in the presence
of volume
exclusion,
50,51
which
is
modeled
by Weeks
Chandler
Andersen
(WCA)
potential.
To turn
on Lennard-Jones
(LJ)
or variants
of LJ potentials
(e.g.,
WCA),
a soft-core
scaling
52
with
λ
LJ
is used
for numerical
stability:
i
k
j
j
j
j
j
j
j
j
j
j
j
j
j
j
j
j
j
j
j
j
i
k
j
j
j
j
j
i
k
j
j
j
y
{
z
z
z
y
{
z
z
z
z
z
i
k
j
j
j
y
{
z
z
z
y
{
z
z
z
z
z
z
z
z
z
z
z
z
z
z
z
z
z
z
z
z
=
+
+
U
r
(
)
4
1
(1
)
1
(1
)
ij
r
r
LJ
LJ
6
2
LJ
6
ij
ij
ij
ij
LJ
where
α
= 0.5,
r
ij
is the distance
between
beads
i
and
j
, and
σ
ij
is the sum
of the radii
of beads
i
and
j
. The
number
of
intermediate
states
and the spacing
of the coupling
parameter
values
depend
on the difficulty
to obtain
converged
free energy
calculations.
For the electrostatic
potential,
a linear
pathway
where
λ
elec
ranges
from
0 to 1 with
a step
size of 0.125
is a
simple
and effective
approach.
For LJ and related
potentials,
The
Journal
of
Physical
Chemistry
B
pubs.acs.org/JPCB
Article
https://doi.org/10.1021/acs.jpcb.2c05795
J. Phys.
Chem.
B
2023,
127,
446
455
448
because
most
of the free energy
changes
occur
near
the start
state
of
λ
LJ
= 0 (Figure
2b,c),
we introduce
intermediate
states
at
λ
LJ
= 0.005,
0.01,
0.015,
0.02,
0.04,
0.06,
0.08,
0.1, 0.2, 0.3,
0.5, 0.7, and 0.9.
The
Δ
G
of turning
on each
kind
of interaction
is calculated
using
thermodynamic
integration
(TI),
53
Bennett
acceptance
ratio
(BAR)
method,
54
and
the multistate
BAR
(MBAR)
method.
55
TI and BAR/MBAR
are distinct
formulations
for
free
energy
calculations,
and
we verify
that
these
methods
converge
to similar
values.
The
system
in CGMD
is evolved
using
overdamped
Langevin
dynamics
with
a diffusion
coefficient
of 253 nm
2
/s and a time
step
of 30 ns for stable
integration.
At each
state,
at least
64 trajectories
of 6 s long
are
generated
to sample
the conformations
of the complexes.
After
collecting
the
samples
from
trajectories,
postprocessing
involves
calculating
U
/
∂λ
and
Δ
U
ij
for all
i
,
j
= 1, 2, ...,
K
states
as inputs
for TI, BAR,
and MBAR.
RESULTS
AND
DISCUSSION
Alchemical
Perturbation
of
Protein
Domains
Is
Feasible
with
CGMD.
The
binding
cooperativity
of
PROTAC
due to PPIs
is a unique
challenge
that calls
for an
unconventional
application
of alchemical
free
energy
calcu-
lation
methods.
Alchemical
methods
are mainly
used
to
determine
the
binding
energies
between
small-molecule
ligands
and proteins,
and typically
no more
than
10 heavy
atoms
are perturbed
for efficient
and accurate
calculations.
In
protein
protein
binding,
recent
applications
and development
focus
on quantifying
the relative
free
energy
changes
from
small-scale
perturbations
such
as mutations
of single
residues.
56
60
To our knowledge,
the only
case
that alchemi-
cally
calculates
PPIs
in a three-body
setting
compares
how
analogs
of inhibitors
change
aberrant
multimerization
of the
HIV-1
integrase.
61
Their
proposed
thermodynamic
framework
involves
calculating
the relative
free
energy
difference
by
perturbing
small
molecules
that directly
participate
at a fixed
PPI interface.
This
framework
is more
readily
extendable
to
molecular
glues
that
modulate
PPIs
in a similar
way.
PROTACs,
however,
due
to a more
modular
design,
are
typically
larger
linear
molecules.
The flexibility
of the linker
is
often
nontrivial,
such
that
the two proteins
cannot
be kept
bound
at a fixed
interface.
This
configurational
entropic
concern
necessitates
an unusually
large
perturbation
at the
scale
of a protein
rather
than
a small
molecule
to calculate
the
binding
cooperativity,
testing
the computational
limit
of
alchemical
methods.
To explore
the feasibility
of the CG alchemical
approach,
we
calculate
the free energy
of turning
on the steric
repulsions
between
a target
protein
and
a PROTAC
E3
complex
(
Δ
G
ternary(sterics)
) in the
absence
of other
intermolecular
Figure
2.
Calculation
of
Δ
G
ternary(sterics)
by alchemical
perturbation
of BTK
in the ternary
complex
of BTK-PROTAC
(10)-CRBN.
(a) TI and
MBAR
both
reach
apparent
convergence
in the time-forward
and time-reversed
directions
with
no pathological
signs.
The gray band
in each
panel
represents
the final estimation
using
100%
data
±
0.1
kT
as a threshold
for error
tolerance,
where
k
is the Boltzmann
constant.
(b) TI estimation
is
shown
as the blue area under
the curve
of
⟨∂
U
/
λ
.
(c) TI, BAR,
and MBAR
agree
for all intermediate
Δ
G
s
between
adjacent
states.
All error
bars
of computational
results
here
and in subsequent
figures
represent
±
1 std. Color
coding
for TI, BAR,
and MBAR
results
are the same
in subsequent
figures
unless
otherwise
stated.
The
Journal
of
Physical
Chemistry
B
pubs.acs.org/JPCB
Article
https://doi.org/10.1021/acs.jpcb.2c05795
J. Phys.
Chem.
B
2023,
127,
446
455
449