Fast, accurate ranking of engineered proteins by target-binding propensity using structure modeling

Original Article


Xiaozhe Ding, Xinhong Chen, Erin E. Sullivan, Timothy F. Shay, and Viviana Gradinaru

Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA

Deep-learning-based methods for protein structure prediction have achieved unprecedented accuracy, yet their utility in the engineering of protein-based binders remains constrained due to a gap between the ability to predict the structures of candidate proteins and the ability to prioritize proteins by their potential to bind to a target. To bridge this gap, we introduce Automated Pairwise Peptide-Receptor Analysis for Screening Engineered proteins (APPRAISE), a method for predicting the target-binding propensity of engineered proteins. After generating structural models of engineered proteins competing for binding to a target using an established structure prediction tool such as AlphaFold-Multimer or ESMFold, APPRAISE performs a rapid (under 1 CPU second per model) scoring analysis that takes into account biophysical and geometrical constraints. As proof-of-concept cases, we demonstrate that APPRAISE can accurately classify receptor-dependent vs. receptor-independent adeno-associated viral vectors and diverse classes of engineered proteins such as miniproteins targeting the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike, nanobodies targeting a G-protein-coupled receptor, and peptides that specifically bind to transferrin receptor or programmed death-ligand 1 (PD-L1). APPRAISE is accessible through a web-based notebook interface using Google Colaboratory (https://tiny.cc/APPRAISE). With its accuracy, interpretability, and generalizability, APPRAISE promises to expand the utility of protein structure prediction and accelerate protein engineering for biomedical applications.

INTRODUCTION

Many protein-based biologics rely on precise targeting. As a result, protein engineers have devoted considerable effort to creating specific binders, using methods such as directed evolution and rational design. Currently, the costly experimental evaluation of candidate binders using in vitro and in vivo assays presents a bottleneck, which can be eased using computational prioritization.

Two strategies are employed to predict protein functions: end-to-end sequence-function and two-step sequence-structure/structure-function. End-to-end sequence-function models can predict complex functions such as enzyme activities or ion channel conductivity, which are challenging to calculate using physical principles. However, such specialized models require domain-specific, high-quality training datasets for accurate prediction. In comparison, the two-step sequence-structure/structure-function strategy offers a more generalizable solution, particularly for functions with well-understood biophysical mechanisms such as protein-protein binding.

The rapid development of deep-learning-based methods has brought unprecedented accuracy to the first step of the sequence-structure/structure-function strategy. Since AlphaFold2 (AF2)'s outstanding performance in CASP14 in 2020, several new deep-learning-based structure prediction tools have been released, providing a diverse toolset for generating protein models with atomic-level precision. While the original AF2 can predict protein-protein complexes, there are enhanced versions such as AlphaFold-Multimer that can model multi-chain complexes with greater accuracy. Importantly, these structure prediction tools allow the generation of models in less than one GPU hour each, a level of throughput that experimental methods cannot match.

The second step, ranking target-binding propensities based on structure predictions, has received less attention than the first. Structure prediction tools generate confidence scores for predicted multimer models, such as predicted local distance difference test (pLDDT) and predicted template modeling (pTM) scores (used by AF2), and interface pTM scores (used by AF-Multimer), which have been used off label as metrics to evaluate the probability of binding. However, previous reports and our experience revealed that these scores alone are, in some cases, not reflective of binding propensities, particularly when the interaction is weak or transient. Extracting additional information stored in the 3D coordinates using biophysical principles may help improve the accuracy of binder ranking.

Received 17 October 2023; accepted 3 April 2024; https://doi.org/10.1016/j.ymthe.2024.04.003

Correspondence: Xiaozhe Ding, Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA. E-mail: xding@caltech.edu

Correspondence: Viviana Gradinaru, Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA. E-mail: viviana@caltech.edu

Molecular Therapy Vol. 32 No 6 June 2024 © 2024 The Author(s).

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Please cite this article in press as: Ding et al., Fast, accurate ranking of engineered proteins by target-binding propensity using structure modeling, Molecular Therapy (2024), https://doi.org/10.1016/j.ymthe.2024.04.003

Ranking the binding probability of engineered proteins through modeled structures presents unique challenges. A frequent challenge is imposed by the high sequence similarity between candidate molecules. Engineered protein variants are often constructed by modifying a short variable region in a common scaffold. Due to this similarity, the energy difference between the candidate binders can be very small, sometimes buried in the error of the energy function used for candidate ranking. This problem is compounded by structure prediction methods that rely heavily on co-evolutionary information or homology, causing them to generate similar binding poses for the candidate proteins. Another major challenge is assessing a large number of predicted structure models efficiently. Direct quantification of protein-protein interface energy using interpretable, physics-based methods trades off between accuracy and speed. For instance, molecular dynamics simulation methods can cost more than 10 CPU hours per model. Faster, less rigorous methods with better-than-random ability to predict the impact of interface mutations still require 1 CPU minute to 1 CPU hour per non-antibody-antigen model. In the post-AlphaFold era, an interpretable and efficient method of predicting the target binding of a large number of models would greatly accelerate protein engineering efforts.

Recently, Chang and Perez utilized competitive modeling with AF-Multimer to demonstrate a correlation between competition results and peptide-binding affinities. However, the study's method of assessing the competition results necessitates a comparison of the modeled structures to an experimentally solved "native" structure, which is not available for many engineered proteins.

To bridge the remaining gap between structure prediction and protein engineering, here we present Automated Pairwise Peptide-Receptor Analysis for Screening Engineered proteins (APPRAISE), a readily interpretable and generalizable method for ranking the target-binding propensity of engineered proteins based on competitive structure modeling and fast physics-informed structure analysis.

RESULTS

The workflow of APPRAISE (Figure 1) comprises four main steps. In the first step, pairs of peptides from the candidate protein molecules are modeled in complex with a target receptor using a state-of-the-art structure prediction method such as AF-Multimer. In the second step, a simplified energetic binding score is calculated for each peptide (i.e., the peptide of interest [POI] and its competitor). In the third, optional step, geometrical constraints for effective binding are applied to these scores. Finally, the result of each competition is decided using the score difference between the POI and the competitor, and the peptides are ranked based on the matrix of competition results.

APPRAISE can accurately classify receptor-mediated brain transduction of viral vectors

We first developed APPRAISE to predict the binding propensities of engineered adeno-associated virus (AAV) capsids for brain receptors. Recombinant AAVs are widely used as delivery vectors for gene therapy due to their relative safety as well as their broad and engineerable tropism. In vivo selections from libraries of randomized peptide-displaying AAV variants have yielded capsids that can transduce the animal brain, an organ tightly protected by the blood-brain barrier (BBB). Widely known examples among these capsids are AAV-PHP.B and AAV-PHP.eB, two AAV9-based variants displaying short (7–9 amino acids) surface peptides. The two variants can efficiently deliver genetic cargo to the brains of a subset of rodent strains. Genetic and biophysical studies have revealed that the BBB receptor for PHP.B/PHP.eB in these strains is LY6A, a GPI-anchored membrane protein. A dataset comprising peptide-displaying AAV capsids that were engineered in a similar way as PHP.B/eB was collected in order to train the APPRAISE method (Figure S1). Although binding between the AAV and the LY6A receptor is dynamic and therefore challenging to measure quantitatively, we could infer the binary LY6A-binding profiles of AAV capsids from their differential brain transduction profiles in mouse strains with and without the receptor, producing a training set of peptide-displaying AAV capsids (Figure S1).

Figure 1. Workflow of APPRAISE

First, engineered protein candidates or peptides from the protein candidates' target-binding region are modeled in competing pairs with the target receptor using tools such as AF-Multimer or ESMFold. Second, a non-negative energetic binding score based on atom counting is calculated for each peptide. Third, in APPRAISE 1.1+, additional geometrical constraints critical for peptide binding, including the binding angle and pocket depth, are considered. Finally, a relative score for each match is calculated by taking the difference between the scores for the two peptides. The averaged relative scores form a matrix that determines the final ranking.

One challenge for modeling AAV capsids is that they are huge complexes made of 30,000+ amino acids (aa). In order to reduce computational costs for structure modeling and avoid complications arising from non-specific interactions, we modeled each AAV capsid variant using a single peptide spanning the engineered region (Figure 2A). This peptide (residues 587–594 in the VP1 sequence) includes seven inserted residues and eight contextual residues flanking the insertion. All of these residues are surface exposed and may make direct contact with the receptor in the assembled capsid. Modeling this surface peptide (15 aa) is far less computationally intensive than modeling the entire capsid or even an asymmetric capsid subunit (500+ aa). In addition, compared to the latter, it may improve accuracy by eliminating competing interactions of residues normally buried in inter-subunit interfaces.

To discriminate relatively small differences in receptor-binding propensities of candidate peptides, we modeled the peptides pairwise in competition for the target receptor. To evaluate the competition results efficiently, we designed a score based on simple atom counting as a rough estimate of the interface free energy between the POI and the receptor in a structure model (Figure 2B). This score, which we term the energetic binding score ($B^{\mathrm{POI}}_{\mathrm{energetic}}$, simplified as $B^{\mathrm{POI}}$), is a non-negative value calculated from the numbers of contacting and clashing atoms at the interface (Equation 1). Upon analyzing the distribution of $B^{\mathrm{POI}}$ for PHP.eB and AAV9 in our LY6A-binding AAV dataset, we observed an expected disparity in the distribution of the two variants. Specifically, LY6A binder PHP.eB consistently obtained higher $B^{\mathrm{POI}}$ compared to non-binder AAV9 in our competitive modeling results (Figure S2). We describe the detailed rationale behind this score in the section "materials and methods."

$$B^{\mathrm{POI}}_{\mathrm{energetic}} = \max\left(N^{\mathrm{POI}}_{\mathrm{contact}} - 10^{3} \cdot N^{\mathrm{POI}}_{\mathrm{clash}},\ 0\right) \qquad \text{(Equation 1)}$$
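As a minimal sketch of the atom-counting score described above, the function below takes pre-computed counts of contacting and clashing atoms for one peptide and returns a non-negative score. The function name, argument names, and the default clash penalty weight are illustrative assumptions rather than the published implementation, and the distance cutoffs that define a contact or a clash are not reproduced here.

```python
def energetic_binding_score(n_contact: int, n_clash: int,
                            clash_weight: float = 1000.0) -> float:
    """Return a non-negative atom-counting score for one peptide.

    n_contact, n_clash: numbers of peptide atoms that contact or clash with
    the receptor in a single structure model (counted elsewhere).
    clash_weight: assumed penalty per clashing atom (hypothetical default).
    """
    if n_contact < 0 or n_clash < 0:
        raise ValueError("atom counts must be non-negative")
    raw = n_contact - clash_weight * n_clash
    # Rectify so that heavily clashing poses score zero rather than negative.
    return max(raw, 0.0)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(energetic_binding_score(85, 0))  # contacts only
    print(energetic_binding_score(85, 2))  # clashes dominate the score
```

Keeping the clash weight as a parameter makes it easy to check how sensitive a ranking is to this choice.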

To take full advantage of the information encoded in the competitive models, we further derived a "relative binding score," inspired by the "specificity strategy" for protein-protein interface design. The relative score takes the difference between the absolute scores for the POI and the competitor peptide (Equation 2), rewarding POIs that destabilize the binding of competing peptides.

$$\Delta B^{\mathrm{POI,competitor}} = B^{\mathrm{POI}} - B^{\mathrm{competitor}} \qquad \text{(Equation 2)}$$
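The relative score can be illustrated with a short sketch: for each replicate model of a POI-competitor pair, the competitor's absolute score is subtracted from the POI's, and the differences are averaged across replicates. The function names and the example values are hypothetical.

```python
from statistics import mean


def relative_binding_scores(poi_scores, competitor_scores):
    """Per-model differences between POI and competitor absolute scores."""
    if len(poi_scores) != len(competitor_scores):
        raise ValueError("each model must provide a score for both peptides")
    return [p - c for p, c in zip(poi_scores, competitor_scores)]


def mean_relative_score(poi_scores, competitor_scores):
    """Average relative score across replicate models of one pairing."""
    return mean(relative_binding_scores(poi_scores, competitor_scores))


if __name__ == "__main__":
    # Hypothetical absolute scores from three replicate models.
    poi = [80.0, 60.0, 0.0]
    competitor = [10.0, 0.0, 30.0]
    print(relative_binding_scores(poi, competitor))
    print(mean_relative_score(poi, competitor))
```

By construction the score is antisymmetric: swapping the POI and the competitor flips its sign.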

An engineered protein must meet certain geometrical constraints to effectively bind to a membrane receptor (Figure 2C). To utilize this geometrical information, which is likely unused by structure prediction tools, we incorporated two essential constraints for effective binding: the binding angle and the binding depth (Figures 2C–2E).

The first constraint comes from the angle a binding protein can make (Figures 2C and 2D). In modeling a peptide-receptor complex using the extracellular domain of the membrane receptor (e.g., LY6A), most structure predictors (e.g., AF-Multimer) would consider the whole surface of the domain to be accessible by the peptide. However, in biological conditions, the membrane-facing side of the target receptor is inaccessible to the engineered peptide. This polarity of accessibility is a general property of any target receptor that is closely anchored to a larger complex. To account for the potentially huge energy cost of an engineered peptide binding these inaccessible locations, we used a steep polynomial term to penalize peptides that bind to the anchor-facing part of the receptor (Figure 2D, defined in the section "materials and methods," Equation 6). $B^{\mathrm{POI}}$ is adjusted by this geometrical constraint term, rectified to be non-negative, and $\Delta B^{\mathrm{POI,competitor}}$ is also re-calculated accordingly, yielding new scores $B_1^{\mathrm{POI}}$ and $\Delta B_1^{\mathrm{POI,competitor}}$ (Equation 3).

$$\Delta B_1^{\mathrm{POI,competitor}} = B_1^{\mathrm{POI}} - B_1^{\mathrm{competitor}} = \max\left(B^{\mathrm{POI}}_{\mathrm{energetic}} + B^{\mathrm{POI}}_{\mathrm{angle}},\ 0\right) - \max\left(B^{\mathrm{competitor}}_{\mathrm{energetic}} + B^{\mathrm{competitor}}_{\mathrm{angle}},\ 0\right) \qquad \text{(Equation 3)}$$

Figure 2. Binary classification of receptor-binding AAV capsids using physical and geometrical principles

(A) A structure model of AAV-PHP.eB, highlighting the site for inserting the displayed peptide (orange) and the peptide used for APPRAISE modeling (yellow or orange). The left image shows the AAV capsid of 60 structurally identical subunits. The two images on the top right show a top view and a side view around the 3-fold axis, respectively. The three subunits that make the trimer are colored blue, cyan, and white. The sequence corresponding to the peptides is shown in the bottom right. (B) An example showing the calculation process of a relative energetic binding score. The number of contacting atoms and the number of clashing atoms for each peptide in the competition are counted, and an absolute energetic binding score is calculated based on the counts according to Equation 1. A difference between the two numbers, or the relative energetic binding score, is then calculated. The competition result between two peptides is determined using the average of relative binding scores across replicates. The matrix of the mean scores is then used to rank the peptides of interest (POIs). (C) A simplified geometrical representation of a peptide-receptor model, where the hull of the receptor is represented by an ellipsoid (blue). Point O, the center of mass of the receptor; point A, the receptor's terminus attached to an anchor; segment OB, the minor axis of the ellipsoid receptor hull; point C, the deepest point on the candidate peptide (orange). The binding angle and the binding pocket depth of the peptide are also indicated. (D) The angle constraint function. Three representative scenarios with different binding angles are highlighted. (E) The depth constraint function. Three representative scenarios with different binding depths are highlighted. (F) Comparison of the averaged relative binding energy scores before geometry-based adjustments vs. after adjustments. (G–I) Heatmaps representing the matrix of mean scores of 22 AAV9-based capsid variants, including (G) mean absolute binding scores, (H) mean relative binding scores, and (I) mean relative binding scores that have considered both angle and depth constraints. All heatmap matrices were sorted by point-based round-robin tournaments (section "materials and methods"). Bracketed numbers in the row labels are LY6A-binding profiles of the capsids inferred from experimental evidence (Figure S1). Each block in the heatmap represents the mean score measured from 10 independent models generated by AlphaFold-Multimer. (J and K) Comparison of different ranking methods used as binary classifiers to predict the LY6A-binding profile of 22 AAV9-based capsid variants. (J) Comparison between rankings given by different versions of APPRAISE scores using AF-Multimer as the structure prediction tool. (K) Comparison between rankings given by confidence scores of AF-Multimer versus rankings given by APPRAISE 1.2 using either AF-Multimer or ESMFold as prediction engines. The sequence and shape parameters of LY6A used for the modeling and analyses are included in Table S1.

The second constraint concerns the binding pocket depth (Figures 2C and 2E). We hypothesized that peptides binding to a deeper pocket on the receptor surface might benefit from longer residence time, which is vital for the efficacy of many therapeutics. Based on this hypothesis, we included a pocket depth consideration in APPRAISE's scoring function. We used a relative pocket depth measurement instead of an absolute peptide-receptor distance measurement to avoid possible bias caused by the sizes of different target receptors. We then used an odd polynomial term to reward peptides that insert into deep pockets on the receptor while penalizing peptides that attach to surface humps (Figure 2E, defined in the section "materials and methods," Equation 7). The addition of the depth term gives us an adjusted score $\Delta B_2^{\mathrm{POI,competitor}}$ (Equation 4).

$$\Delta B_2^{\mathrm{POI,competitor}} = B_2^{\mathrm{POI}} - B_2^{\mathrm{competitor}} = \max\left(B^{\mathrm{POI}}_{\mathrm{energetic}} + B^{\mathrm{POI}}_{\mathrm{angle}} + B^{\mathrm{POI}}_{\mathrm{depth}},\ 0\right) - \max\left(B^{\mathrm{competitor}}_{\mathrm{energetic}} + B^{\mathrm{competitor}}_{\mathrm{angle}} + B^{\mathrm{competitor}}_{\mathrm{depth}},\ 0\right) \qquad \text{(Equation 4)}$$
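The way the geometrical terms enter the score can be sketched as follows. The angle and depth terms are treated here as pre-computed inputs, because their functional forms are defined in the materials and methods (Equations 6 and 7) and are not reproduced in this section. The function names and the example values are hypothetical.

```python
def adjusted_score(b_energetic: float, b_angle: float = 0.0,
                   b_depth: float = 0.0) -> float:
    """Sum the energetic, angle, and depth terms and rectify at zero.

    With b_depth left at 0.0 this corresponds to the angle-only adjustment;
    with both geometrical terms it corresponds to the angle-plus-depth one.
    """
    return max(b_energetic + b_angle + b_depth, 0.0)


def relative_adjusted_score(poi_terms, competitor_terms) -> float:
    """Difference between the adjusted scores of the POI and its competitor.

    Each argument is a tuple (b_energetic, b_angle, b_depth).
    """
    return adjusted_score(*poi_terms) - adjusted_score(*competitor_terms)


if __name__ == "__main__":
    # Hypothetical terms: the competitor binds the anchor-facing side and
    # therefore receives a large negative angle term.
    poi = (80.0, 0.0, 12.0)
    competitor = (70.0, -200.0, 5.0)
    print(relative_adjusted_score(poi, competitor))
```

Rectifying each peptide's score before taking the difference keeps a heavily penalized competitor from inflating the POI's relative score beyond the POI's own adjusted score.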

We compared different versions of scoring methods based on competitive modeling results using AF-Multimer modeling (Figures 2G–2I). Individual matching scores with statistical significance were used to determine wins and losses, and the total matching points in a tournament were used to rank all candidate proteins (section "materials and methods"). We found that the simple atom-counting-based $B^{\mathrm{POI}}$ can already differentiate LY6A-binding peptides from non-binders (Figures 2G and 2J). Compared to $B^{\mathrm{POI}}$ alone, the relative score $\Delta B^{\mathrm{POI,competitor}}$ showed improved prediction power, with a receiver operating characteristic (ROC) area under the curve (AUC) of 0.800 and an area under the precision-recall curve (AUPRC) of 0.756 for the training dataset (Figures 2–2K). Adding both geometrical terms, $B_{\mathrm{angle}}$ and $B_{\mathrm{depth}}$, into consideration further improved the prediction accuracy of the binding score (Figures 2–2K), yielding an ROC AUC of 0.838 and an AUPRC of 0.845 (Figures 2J and 2K). Importantly, the improvement in ROC AUC mainly came from the low-false-positive-rate segment of the ROC curve, which is crucial for in silico screening of engineered proteins. For clarity, we name the version that considers only the angle constraint (through score $\Delta B_1$) APPRAISE 1.1 (Figure S3A) and the version that considers both angle and depth constraints (through score $\Delta B_2$) APPRAISE 1.2 (Figure 2I).
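A point-based round-robin ranking of this kind can be sketched as below. The significance test (a one-sample t statistic on the replicate relative scores compared against a fixed critical value) and the point scheme (+1 for a win, -1 for a loss, 0 for a draw) are assumptions made for illustration; the actual procedure is specified in the materials and methods. All names and example values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def match_result(rel_scores, t_crit):
    """Return +1 (POI wins), -1 (POI loses), or 0 (draw).

    rel_scores: per-replicate relative scores (POI minus competitor).
    t_crit: assumed critical value of the t statistic for calling a result.
    """
    m = mean(rel_scores)
    sd = stdev(rel_scores)
    if sd == 0.0:
        # All replicates agree exactly; decide on the sign of the mean.
        return (m > 0) - (m < 0)
    t = m / (sd / sqrt(len(rel_scores)))
    if t > t_crit:
        return 1
    if t < -t_crit:
        return -1
    return 0


def rank_by_tournament(names, rel_scores_by_pair, t_crit):
    """Rank candidates by total points over all pairwise matches."""
    points = {name: 0 for name in names}
    for a, b in combinations(names, 2):
        outcome = match_result(rel_scores_by_pair[(a, b)], t_crit)
        points[a] += outcome
        points[b] -= outcome
    # sorted() is stable, so tied candidates keep their input order.
    ranking = sorted(names, key=lambda name: points[name], reverse=True)
    return ranking, points


if __name__ == "__main__":
    # Hypothetical relative scores from four replicate models per pair.
    pairs = {
        ("A", "B"): [30.0, 28.0, 35.0, 31.0],
        ("A", "C"): [50.0, 45.0, 55.0, 52.0],
        ("B", "C"): [5.0, -4.0, 2.0, -3.0],
    }
    # 3.182 is the two-sided 5% critical value for 3 degrees of freedom.
    print(rank_by_tournament(["A", "B", "C"], pairs, t_crit=3.182))
```

Requiring significance before awarding points means that noisy pairings contribute nothing to the ranking instead of adding spurious wins.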

We then compared AF-Multimer-based APPRAISE 1.2 with other structure-based peptide affinity ranking methods on the AAV dataset (Figure 2K). With this particular dataset, the model confidence scores pLDDT, pTM, and interface pTM failed to differentiate whether an AAV variant is an LY6A binder, producing worse-than-random prediction (ROC AUC < 0.5). This is possibly due to the dynamic nature of the interaction between LY6A-binding AAV variants and the receptor, which causes the confidence scores of the complex models to be generally low. Meanwhile, APPRAISE 1.2 utilizing ESMFold as the structure prediction engine performed at a comparable level to AF-Multimer-APPRAISE 1.2 (Figure S3B), with an ROC AUC of 0.895 and AUPRC of 0.818 (Figure 2K).

AF-Multimer-APPRAISE 1.2 ranking outperformed all other ranking methods at the low-false-positive-rate end of the ROC curve, with a true-positive rate of 0.714 and no false-positive predictions. The performance with stringent cutoff values is particularly relevant for protein engineering applications, where the goal is typically to identify a few positive binders from many negative, non-binding candidates. The superiority of AF-Multimer-APPRAISE 1.2 in dealing with this kind of imbalanced library is also shown by its highest AUPRC. Because of this, we chose to characterize AF-Multimer-APPRAISE 1.2 further. In the following text, "APPRAISE" will be used to refer to AF-Multimer-APPRAISE 1.2 unless otherwise specified.
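The two classifier metrics discussed above, the ROC AUC and the true-positive rate achievable without any false positives, can be computed from ranking scores and binary labels as in the sketch below. The function names and the example data are hypothetical and do not correspond to the AAV dataset.

```python
def _split_by_label(scores, labels):
    """Separate scores of binders (label 1) and non-binders (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    return pos, neg


def roc_auc(scores, labels):
    """ROC AUC as the fraction of binder/non-binder pairs ranked correctly.

    Ties between a binder and a non-binder count as half a correct pair.
    """
    pos, neg = _split_by_label(scores, labels)
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


def tpr_at_zero_fpr(scores, labels):
    """Fraction of binders that score above every non-binder."""
    pos, neg = _split_by_label(scores, labels)
    top_negative = max(neg)
    return sum(p > top_negative for p in pos) / len(pos)


if __name__ == "__main__":
    # Hypothetical ranking scores (higher is better) and binding labels.
    scores = [9, 8, 7, 6, 5, 4, 3]
    labels = [1, 1, 0, 1, 0, 0, 0]
    print(roc_auc(scores, labels))
    print(tpr_at_zero_fpr(scores, labels))
```

The second metric corresponds to the leftmost segment of the ROC curve, which matters most when only a handful of top-ranked candidates can be tested experimentally.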

APPRAISE is generally applicable to diverse classes of engineered proteins

To determine the applicability of APPRAISE to different classes of engineered proteins, we applied the method to four classes of engineered protein binders targeting four representative targets for therapeutics.

We first applied APPRAISE to other short peptide binders (Figures 3A–3D). In the first trial, the method successfully ranked a peptide selected by phage display to bind human transferrin receptor, a well-characterized BBB receptor, over non-binding counterparts from the same selection (Figure 3A). In the second trial, we evaluated two 47 aa, rationally designed programmed death-ligand 1 (PD-L1)-binding peptides against the scaffold and length-matched AAV variable region fragments. Both designed PD-L1-binding peptides were clear winners, with the higher-affinity MOPD-1 peptide topping the list despite a high degree of sequence similarity (Figures 3C and S4A).

We next tested whether APPRAISE can be used to evaluate larger proteins; for example, computationally designed miniproteins (50–90 aa) that bind to the receptor-binding domain (RBD) of SARS-CoV-2 spike protein (Figures 3E–3G). Among the designed miniproteins, five can neutralize live SARS-CoV-2 virus in vitro with half maximal inhibitory concentration (IC50) from 20 pM to 40 nM. The APPRAISE rankings of the five neutralizing miniproteins matched well with their IC50 rankings (Spearman's ρ = 0.90, p = 0.037; Figure 3G). The predictive accuracy of APPRAISE decreased when non-neutralizing miniproteins and control AAV fragments were included (Spearman's ρ = 0.88, p = 0.001; Figure 3G); nevertheless, the top four binders still remained on the top. In contrast, the ranking given by the interface pTM (ipTM) score of AF-Multimer only achieved a Spearman's ρ of 0.67 (p = 0.035) (Figure S5C).
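For readers who wish to reproduce this kind of rank comparison, Spearman's ρ between a predicted and an experimental ranking can be computed as the Pearson correlation of the two rank vectors. The sketch below uses only the Python standard library; the function names and the example rankings are hypothetical and are not taken from the miniprotein data.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation between two score or rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    predicted = [1, 2, 3, 4, 5]     # hypothetical predicted ranking
    experimental = [1, 3, 2, 4, 5]  # hypothetical experimental ranking
    print(spearman_rho(predicted, experimental))
```

The sketch does not compute a p value; for small panels an exact permutation test is usually preferred over an asymptotic approximation.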

We also used APPRAISE to rank six nanobodies (120 aa) that were evolved experimentally with highly similar scaffolds (Figure S4) to bind to an activated conformation of β2 adrenergic receptor (β2AR), a G-protein-coupled receptor (GPCR) (Figures 3H–3J). APPRAISE correctly found the strongest evolved binder and placed the parent (the weakest binder among all candidates) at the bottom (Figure 3H). The overall predicted ranking correlated well with the ranking from experimentally determined binding readouts (Spearman's ρ = 0.89, p = 0.02; Figure 3J), surpassing the prediction given by the ipTM score of AF-Multimer (Spearman's ρ = 0.49, p = 0.329; Figure S5F). Our hypothesis for why APPRAISE is effective in predicting challenging targets is that introducing a competitive protein compels the AlphaFold network to choose a higher-probability binder between two similar options, thereby amplifying the signal. In line with our hypothesis, our evaluation of the binding energy of AlphaFold-predicted models in both the single-POI setting (Figures S6A and S6B) and the competitive-binding setting (Figures S6C and S6D) revealed that the involvement of the competitive protein indeed improved the predictive power of the modeling results. Although competitive modeling has enabled the ranking of nanobodies in this particular instance, it is important to recognize that predicting adaptive immune complexes, particularly larger ones such as immunoglobulin (Ig) G-antigen complexes, still presents a significant challenge. Further advancements in the underlying structure prediction methods will enable APPRAISE to generalize the ranking capability to these challenging targets.

To evaluate the cross-target capabilities of APPRAISE, we used the method to rank eight recently developed miniproteins binding eight different therapeutically significant target receptors. This ranking included all target receptors with a ligand-binding domain that is smaller than 250 aa in the Cao et al. study. APPRAISE accurately identified the correct binder within the top three in every instance, and, six out of eight times, the correct binder was ranked as the top one (Figures 3K and S7).

We next compared the performance of AF-Multimer-APPRAISE 1.2 to alternative methods on both the miniprotein and the nanobody datasets. AF-Multimer-APPRAISE 1.2 again yielded the most accurate predictions when compared to AF-Multimer-APPRAISE 1.0, ESMFold-APPRAISE 1.2, or interface pTM scores given by AF-Multimer (Figure S5), reflected by higher Spearman's correlation to experimental rankings. ESMFold-APPRAISE 1.2 failed completely with the miniprotein dataset (Figure S5B). Upon further inspection, we found that the unfolded SARS-CoV-2-S RBD structure in ESMFold-generated complex models can explain the failed ranking prediction.

Without any fine-tuning, AF-Multimer-APPRAISE 1.2 demonstrated consistent prediction ability for ranking all four classes of proteins, including experimentally selected and rationally designed peptides, computationally designed miniproteins, and nanobodies. Realizing the potential general applicability of the APPRAISE method, we have created a web-based notebook interface to make it readily accessible to the protein engineering community (Figure S8; https://tiny.cc/APPRAISE).

HT-APPRAISE screening can identify novel receptor-dependent capsid variants

We next adapted APPRAISE for in silico screening. The computational cost in the pairwise competition mode grows quadratically with the number of input variants, which is unsuitable for high-throughput screening. To address this scalability issue, we designed a two-stage screening strategy named high-throughput (HT)-APPRAISE (Figure 4A). The first stage aims to shrink the size of the variant library using a less accurate yet more scalable strategy.
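The quadratic growth of the pairwise mode can be made concrete with a short calculation. The sketch assumes that every unordered pair of variants is modeled once per replicate and uses 10 replicates per pair as an illustrative default; both the function name and these settings are assumptions.

```python
from math import comb


def pairwise_model_count(n_variants: int, replicates: int = 10) -> int:
    """Number of structure models needed for a full pairwise competition.

    Assumes each unordered pair of variants is modeled `replicates` times.
    """
    if n_variants < 2:
        return 0
    return comb(n_variants, 2) * replicates


if __name__ == "__main__":
    for n in (10, 22, 100, 1000):
        print(n, pairwise_model_count(n))
```

Under these assumptions a tenfold larger library needs roughly a hundredfold more models, which is why a cheaper first-stage filter is attractive.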

Figure 3. AF-Multimer-APPRAISE 1.2 accurately ranks binding propensities of different classes of engineered proteins

(A and B) APPRAISE ranking of transferrin receptor-binding peptides and non-binding control peptides. (A) Pairwise score matrix and ranking of a panel of 12-aa peptides given by APPRAISE. Bracketed numbers in the row labels are experimentally determined transferrin receptor-binding profiles of each peptide. (B) A representative AF-Multimer model result of a binding peptide (blue) competing against a non-binding peptide (red) for binding to transferrin receptor. (C and D) APPRAISE ranking of PD-L1-binding peptides and non-binding control peptides. (C) The row labels show the PD-L1-binding profile of each peptide determined either experimentally (for MOPD-1, MNPD-1, and scaffold protein) or by expectation (for AAV9 and PHP.eB). (D) A representative AF-Multimer model result of MOPD-1 (blue), a designed binding peptide, competing against a non-binding scaffold peptide (red) for binding to PD-L1. (E–G) APPRAISE ranking of SARS-CoV-2-S RBD-binding miniproteins. (E) Pairwise score matrix and ranking given by APPRAISE. Bracketed rankings in the row labels are determined based on experimentally measured IC50 of each miniprotein to neutralize live SARS-CoV-2. (F) A representative AF-Multimer model result of LCB1 (blue), a SARS-CoV-2-S RBD-binding miniprotein, competing against an influenza virus-binding miniprotein (red). (G) A scatterplot showing the correlation between APPRAISE-predicted ranking and experimentally measured IC50 ranking of all miniproteins tested. Blue points highlight binders that showed the capability of complete neutralization of the SARS-CoV-2 virus in the tested range of concentration in vitro. (H–J) APPRAISE ranking of β2 adrenergic receptor-binding nanobodies. (H) Pairwise score matrix and ranking given by APPRAISE. Bracketed numbers in the row labels are rankings of experimentally measured binding of each nanobody. (I) A representative AF-Multimer model result of Nb6B9 (blue), the strongest evolved binder to active β2AR, competing against Nb80 (red), the parent nanobody used as the starting point for the evolution. (J) A scatterplot showing the correlation between APPRAISE-predicted ranking and experimentally measured ranking by β2AR binding of all nanobodies tested. Each block in the heatmap represents the mean score measured from 10 structural models generated by AlphaFold-Multimer. For comparison, rankings given by AF-Multimer-APPRAISE 1.0, ESMFold-APPRAISE 1.2, and interface pTM of SARS-CoV-2-S RBD-binding miniproteins and β2 adrenergic receptor-binding nanobodies are shown in Figure S5. (K) A summary of APPRAISE rankings of eight miniproteins designed to bind to eight different target receptors. Figure S7 displays the score matrices utilized for rankings with individual target receptors. Tables S1 and S2 include sequences and shape parameters of all target receptors. Table S3 includes sequences of all engineered proteins.
