Identification of orphan ligand-receptor relationships using a cell-based CRISPRa enrichment screening platform.

Dirk H. Siepe¹, Lukas T. Henneberg¹, Steven C. Wilson¹, Gaelen T. Hess⁴, Michael C. Bassik⁴, Kai Zinn², K. Christopher Garcia¹,³,⁵,⁶

¹ Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, CA 94305, USA.
² Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
³ Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA.
⁴ Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA.
⁵ Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA.
⁶ Lead Contact.

bioRxiv preprint doi: https://doi.org/10.1101/2022.06.22.497261; this version posted June 25, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

ABSTRACT

Secreted proteins, which include cytokines, hormones and growth factors, are extracellular ligands that control key signaling pathways mediating cell-cell communication within and between tissues and organs. Many drugs target secreted ligands and their cell-surface receptors. Still, there are hundreds of secreted human proteins that have no identified receptors (“orphans”) and are likely to act through cell-surface receptors that have not yet been characterized. Discovery of secreted ligand-receptor interactions by high-throughput screening has been problematic, because the most commonly used high-throughput methods for protein-protein interaction (PPI) screening do not work well for extracellular interactions. Cell-based screening is a promising technology for definition of new ligand-receptor interactions, because multimerized ligands can enrich for cells expressing low-affinity cell-surface receptors, and such methods do not require purification of receptor extracellular domains. Here, we present a proteo-genomic cell-based CRISPR activation (CRISPRa) enrichment screening platform employing customized pooled cell surface receptor sgRNA libraries in combination with a magnetic bead selection-based enrichment workflow for rapid, parallel ligand-receptor deorphanization. We curated 80 potentially high-value orphan secreted proteins and ultimately screened 20 secreted ligands against two cell sgRNA libraries with targeted expression of all single-pass (TM1) or multi-pass (TM2+) receptors by CRISPRa. We identified previously unknown interactions in 12 of these screens, and validated several of them using surface plasmon resonance and/or cell binding. The newly deorphanized ligands include three receptor tyrosine phosphatase (RPTP) ligands and a chemokine-like protein that binds to killer cell inhibitory receptors (KIRs). These new interactions provide a resource for future investigations of interactions between the human secreted and membrane proteomes.

Key words: CRISPRa, pooled library, cell-surface, protein-protein interaction, screen, receptor, ligand, protein communities, secreted proteome.

INTRODUCTION

The human proteome can be envisioned as an array of nodes grouped into local communities, where each node represents one protein and each local community represents a protein complex or network (Budayeva and Kirkpatrick, 2020; Huttlin et al., 2017). These communities determine physiological function and subcellular localization. Many communities include secreted protein ligands, their cell-surface receptors, and signaling molecules that bind to the receptors. The human secretome on its own constitutes approximately 15% of all human genes and encodes more than 4000 different proteins (Uhlén et al., 2019) with a wide range of tissue expression (Figure 1B). Most of the new drugs developed in recent years target secreted proteins and their receptors, and new therapeutic targets are likely to emerge from screens to identify ligand-receptor interactions (Clark et al., 2003; Stastna and Van Eyk, 2012).

Mapping of interactions that occur at the cell surface has significantly lagged behind that of intracellular interactions, because the most widely used high-throughput protein-protein interaction (PPI) screening methods, including affinity purification/mass spectrometry (AP/MS), yeast two-hybrid screening (Y2H), and phage display, are not well suited to analysis of extracellular domain (ECD) interactions (Havugimana et al., 2012; Huttlin et al., 2015; Krogan et al., 2006; Martinez-Martin, 2017). ECD interactions are often of low affinity, with KD values in the micromolar range, and can have fast dissociation rates, rendering them difficult to detect since they may not produce stable complexes (Honig and Shapiro, 2020). As a consequence, ECD interactions are generally underrepresented in screens that rely on the formation of such complexes (Braun et al., 2009; Martinez-Martin et al., 2019; Özkan et al., 2013c; Söllner and Wright, 2009; Wojtowicz et al., 2020). In addition, many putative ECD interactions reported in AP/MS and Y2H protein interaction databases tend to be false positives. AP/MS produces false positives for cell-surface proteins due to incomplete solubilization of membranes, leading to identification of indirect interactions. Y2H examines interactions inside the cell, but most ECDs have disulfide bonds and glycosylation sites. To acquire these modifications and fold correctly, cell-surface and secreted proteins must move through the secretory pathway. Because of this, ECD interactions detected by Y2H are often false positives due to domain misfolding. Similar issues apply to phage display and to microarrays in which mRNAs are translated on a chip. Thus, while these high-throughput methods can identify interactions with the cytoplasmic domains of receptors, they usually fail to find genuine ECD interactions.

Successful high-throughput screens to detect weak ECD interactions in vitro have taken advantage of avidity effects by expressing ECDs as fusions with multimerization domains. In such binary interaction screens, one protein (the bait) is applied to a surface, and the other (the prey) is in solution. Prey binding to the bait is assessed using colorimetric or fluorescent detection. These methods include AVEXIS, apECIA, alpha-Screen and BPIA, which are carried out using ELISA plates, chips, or beads (Braun et al., 2009; Bushell et al., 2008; Li et al., 2017; Martinez-Martin, 2017; Taouji et al., 2009). However, in vitro screens have limitations. They require robotic high-throughput instrumentation and are time-consuming and expensive to carry out on a large scale, since they require synthesis of ECD coding regions and expression of individual bait and prey proteins. In addition, in vitro screens cannot usually assess binding to ECDs of receptors that span the membrane multiple times, because such ECDs are often composed of noncontiguous loops and cannot be easily expressed in a soluble form. Furthermore, in vitro binary interaction mapping technologies lack the natural spatial context of the cell membrane. They may also fail in cases where cofactors and/or post-translational modifications are required for binding.

To address these issues, several groups have developed cell-based screens for receptor-ligand interactions that take advantage of CRISPR technology (Cong et al., 2013; Jinek et al., 2013; Mali et al., 2013). CRISPR activation (CRISPRa) screens such as the one described here induce gene expression by targeting transcriptional activators to the control elements of the genes of interest using sgRNAs (Chong et al., 2018; Kampmann, 2018; Morgens et al., 2016; Tanenbaum et al., 2014a). Utilizing CRISPRa pooled sgRNA libraries eliminates the need to create expensive collections of synthetic genes, and in addition allows a forward positive screening workflow, which enables a higher dynamic range compared to loss-of-function screens (Doench, 2018). Libraries of cells, each with an sgRNA targeting one receptor, can be easily stored and screened for binding to soluble ligands.

Here we describe a CRISPRa enrichment workflow that employs customized, pooled cell surface receptor sgRNA libraries in combination with magnetic bead-based selection (MACS) to enrich for receptor-expressing cells. This approach allows cost-efficient parallel screening with multiple ligands. We created two cell libraries, comprising all single-transmembrane (TM1) and multi-pass transmembrane (TM2+) receptors, and screened them with a collection of secreted ligands. To define a set of high-priority ligands, we first curated the human secreted proteome and selected and expressed 20 at levels sufficient for screening the TM1 and TM2+ libraries. We identified new receptor candidates in more than half of these screens. These were validated using surface plasmon resonance (SPR) and/or cell binding. These studies define new receptors for several secreted ligands that function in the immune and nervous systems.

RESULTS

A CRISPR activating enrichment screening platform.

CRISPRa-mediated activation of transcription using the sunCas9 system is a precise and scalable method for inducing expression of endogenous genes across a high dynamic range (Gilbert et al., 2014a). This system uses a dead Cas9 (dCas9) variant fused to a SunTag, a multicopy epitope tag that recruits the VP64 transcriptional activator via binding to a cytoplasmic scFV-nanobody-VP64 fusion protein. sgRNAs guide this complex to the enhancer region of the gene of interest and facilitate target-specific gene activation and expression (Tanenbaum et al., 2014b).

To evaluate the performance and feasibility of CRISPRa-mediated transcriptional activation of cell surface proteins for a receptor/ligand interaction discovery platform, we first selected 10 well-characterized cell surface receptors with varying mRNA expression levels ranging from not detected to highly expressed in K562 human myeloid leukemia cells (Figure S1A) (Thul et al., 2017; Uhlén et al., 2019). We then generated a pooled lentiviral mini-library of 10 sgRNAs per enhancer (100 sgRNA elements), matched with 100 control sgRNAs derived from scrambled sequences (Gilbert et al., 2014a), and transduced K562 cells stably expressing the sunCas9 system (Figure S1A). Each library plasmid contained a single sgRNA targeting one of the 10 genes, a GFP fluorescent marker and a puromycin resistance marker. The library-transduced cells were puromycin selected for 5 days to obtain >90% GFP-positive cells. The expression levels of the 10 cell surface receptors were then evaluated by cell surface staining using APC (allophycocyanin) labelled antibodies against the respective targets (CD122 was used as a control). All 10 selected targets showed elevated cell surface expression in comparison to non-transduced K562 sunCas9 cells or a control receptor (CD122) (Figure S1B).

We then used human interleukin 2 (IL-2), which has a high-affinity receptor subunit termed CD25, to validate our screening workflow (Figure 1E) in two parallel screens. We simulated two library sizes by diluting the 10 target (100 sgRNA) K562 sunCas9 mini-library 1:20 and 1:200 with non-transduced K562 sunCas9 cells, giving final library sizes equivalent to screening 200 and 2000 targets, respectively (Figure S1C, D). Both library pools were incubated with magnetic streptavidin microbeads complexed with biotinylated IL-2, and IL-2 binding cells were isolated in a positive selection workflow by MACS, using Miltenyi LS-MACS columns (Figure S1E). After labelling, washing and elution, positively selected cells were expanded and stained with IL-2 tetramers (Figure S1D). Genomic DNA was extracted from both consecutive rounds of selection as well as the K562 sunCas9 ML library itself, followed by barcoding and deep sequencing for both libraries.

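The dilution arithmetic behind the two simulated library sizes can be sketched as follows. This is a minimal illustration, assuming that a 1:N dilution means the mini-library cells make up 1/N of the final pool and that all targets are equally represented; the function names are hypothetical and not part of the published workflow.

```python
def simulated_targets(n_targets: int, dilution_factor: int) -> int:
    """Library size (in targets) that a 1:dilution_factor dilution simulates.

    Assumption: the mini-library cells are 1/dilution_factor of the final pool.
    """
    return n_targets * dilution_factor


def target_cell_fraction(n_targets: int, dilution_factor: int) -> float:
    """Expected fraction of cells in the pool that carry sgRNAs for one target.

    Assumption: every target is equally represented in the mini-library.
    """
    return 1.0 / (n_targets * dilution_factor)


if __name__ == "__main__":
    for factor in (20, 200):
        size = simulated_targets(10, factor)
        fraction = target_cell_fraction(10, factor)
        print(f"1:{factor} dilution simulates {size} targets; "
              f"one target is ~{fraction:.4%} of cells")
```

Under these assumptions, the rarer the target-carrying cells become, the more stringent the test of whether a single positive selection round can still recover them.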
Deep sequencing data for each round of selection were analyzed and hits were identified using the robust casTLE statistical framework (Morgens et al., 2016b). Briefly, casTLE compares each set of gene-targeting guides to the negative controls, using both safe-targeting and non-targeting controls, and selects the most likely maximum effect size (casTLE-Effect). A p-value representing the significance of this maximum effect is then generated by permuting the results. We calculated casTLE metrics for each round of selection in comparison to the naïve library. In addition, we used the casTLE metrics to plot trajectories for each hit over the consecutive rounds of selection, which allows direct evaluation of sgRNA enrichment throughout the selection workflow and easy elimination of false positives. Using casTLE, both IL-2 CRISPRa screens successfully identified interleukin 2 (IL-2) receptor alpha (IL2RA; CD25) as the top hit with the highest confidence (casTLE Score), casTLE Effect and significance (p-value) (Figure S1E). A side-by-side comparison of enrichment scores for both rounds of selection from both libraries was plotted as bar graphs (Figure S1F).

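One simple way to summarize an enrichment trajectory across rounds is the sign of a least-squares slope through the per-round metric. The sketch below is only an illustration: the text does not specify how trajectories were scored beyond plotting them, so the slope criterion, the function name and the example values are all assumptions.

```python
from typing import Sequence


def trajectory_direction(scores: Sequence[float]) -> str:
    """Classify a per-round enrichment metric as 'positive', 'negative' or 'flat'.

    Assumption: direction is taken from the sign of the ordinary least-squares
    slope of the metric against the round index (0, 1, 2, ...).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two rounds to define a trajectory")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(scores))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "flat"


if __name__ == "__main__":
    # Hypothetical metric values for two candidate hits over three rounds.
    print(trajectory_direction([1.0, 3.0, 6.0]))  # rising across rounds
    print(trajectory_direction([5.0, 3.0, 1.0]))  # falling across rounds
```

A candidate that ranks highly in a single round but falls across rounds would be flagged as a likely false positive under this criterion.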
Customized, pooled CRISPRa cell surface receptor library design.

Having established the screening workflow (Figure 1E), we sought to leverage the power and efficiency of customized, pooled CRISPRa cell surface libraries to perform targeted screens with secreted ligands. We first compiled a comprehensive list of cell surface receptors by carefully curating the human membrane proteome (Figure 1A, C). We chose a targeted cell surface library approach instead of a genome-wide approach because it allowed for a smaller library size, resulting in a better signal-to-noise ratio (SNR) and avoiding unwanted transcriptional upregulation of non-membrane proteins. We utilized several databases including HUGO, UniProt, the Human Protein Atlas and bioinformatic tools (SignalP, TMHMM) to compile two cell surface target lists covering both single transmembrane (TM1) and multi-span transmembrane (TM2+) cell surface proteins (Figure 1C). For the CRISPRa-mediated transcriptional activation of cell surface proteins, we synthesized and cloned two pooled sgRNA libraries, a TM1 and a TM2+ library, each with 10 sgRNAs per target (Gilbert et al., 2014b). Both libraries include matched controls targeting genomic locations without annotated function (Figures 1C, D). K562 cells stably expressing the sunCas9 system were infected with both libraries (TM1; TM2+) at low, medium and high multiplicity of infection (MOI), then selected with puromycin until the cell population was at least 90% GFP-positive, indicating the presence of lentivirus. Cells were recovered and expanded, and representative aliquots were saved as naïve library stocks in liquid nitrogen with at least 1000x cell number coverage per sgRNA to maintain maximum library complexity. Sufficient sgRNA representation of the naïve library was confirmed by deep sequencing after selection and showed the highest coverage and diversity at low MOI, with at least 91% of reads having at least one reported alignment (R=0.97) for both libraries (Figure S1G, H). Library information including sgRNA target IDs and sequences for both libraries (TM1; TM2+) can be found in Data S3.

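The 1000x coverage requirement translates directly into a minimum cell number per library stock. The following sketch shows the arithmetic; the function name is hypothetical, and the example target count of 1000 is a placeholder rather than the actual size of the TM1 or TM2+ library (the real sizes are given in Data S3).

```python
def cells_needed(n_targets: int,
                 sgrnas_per_target: int = 10,
                 n_control_sgrnas: int = 0,
                 coverage: int = 1000) -> int:
    """Minimum number of cells to hold `coverage` cells per sgRNA.

    sgrnas_per_target=10 and coverage=1000 follow the values stated in the
    text; n_targets and n_control_sgrnas must be supplied by the user.
    """
    n_sgrnas = n_targets * sgrnas_per_target + n_control_sgrnas
    return n_sgrnas * coverage


if __name__ == "__main__":
    # Mini-library from the text: 10 targets x 10 sgRNAs plus 100 controls.
    print(cells_needed(10, n_control_sgrnas=100))
    # Placeholder library of 1000 targets (hypothetical size).
    print(cells_needed(1000))
```

Because the required cell number scales linearly with the number of sgRNAs, a targeted library keeps the stock and selection volumes smaller than a genome-wide library would at the same coverage.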
CRISPRa benchmark screen using human interleukin 2 (IL-2).

After library cloning and validation, we benchmarked the sensitivity and robustness of our screening platform with a proof-of-concept screen using human interleukin 2 (IL-2), following the protocols used for the mini-library (Figure S1). We successfully recovered CD25 (IL2RA) as the top hit after two rounds of enrichment, deep sequencing and analysis following the outlined workflow (Figure 1E). Initially, the naïve TM1 library showed no positive IL-2 binding by tetramer staining (Figure 1F). Library enrichment was monitored by IL-2 tetramer staining throughout the selection workflow. After only one round of positive selection, we observed a significant enrichment of IL-2 selected cells, from 0.1% to 13.3% IL-2 tetramer-positive cells (Figure 1F). After expanding the cells from the first round and subjecting them to a second round of CRISPRa enrichment screening, we observed a further robust increase from 13.3% to 95% IL-2 tetramer-positive cells (Figure 1F).

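The per-round enrichment implied by these tetramer staining percentages can be computed as a simple ratio. This is an illustrative calculation on the three values reported above (0.1%, 13.3%, 95%); the function name is hypothetical.

```python
def fold_enrichment(before_pct: float, after_pct: float) -> float:
    """Ratio of tetramer-positive percentages after vs. before a selection round."""
    if before_pct <= 0:
        raise ValueError("starting percentage must be positive")
    return after_pct / before_pct


if __name__ == "__main__":
    # Percent IL-2 tetramer-positive cells: naive library, round 1, round 2.
    percentages = [0.1, 13.3, 95.0]
    for round_number, (before, after) in enumerate(
            zip(percentages, percentages[1:]), start=1):
        fold = fold_enrichment(before, after)
        print(f"round {round_number}: {before}% -> {after}% ({fold:.1f}-fold)")
```

The second round yields a smaller fold change than the first simply because the population is already close to saturation; the fraction of positive cells cannot exceed 100%.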
After each consecutive round of selection, enriched cells were expanded and genomic DNA was extracted, followed by barcoding and deep sequencing. Genomic DNA from the K562 sunCas9 TM1 naïve library itself served as the baseline. Following deep sequencing, data from both rounds of consecutive IL-2 selections were analyzed and visualized using the casTLE statistical framework. CD25 was identified as the top hit with the highest confidence (casTLE Score), p-value (significance), and casTLE Effect (Figure 1G, H). Furthermore, we used the casTLE metrics from each round to plot the trajectory of CD25, which allows a direct evaluation of sgRNA enrichment throughout the selection workflow. CD25 showed a positive trajectory (Figure 1I), validating the sensitivity and robustness of our screening pipeline.

Selection and production of secreted proteins for CRISPRa screening.

We first generated a secreted proteome master list from several databases, including HUGO, UniProt, the Human Protein Atlas (Uhlén et al., 2019) and bioinformatic tools (SignalP, TMHMM), to identify potential high-priority secreted proteins for our screening workflow. After curation of the human secreted proteome (Figure 1A, B), approximately 60% of the ~1600 genes were classified as encoding enzymes (mostly proteases), enzyme inhibitors, serum proteins or components of saliva, tears, or other fluids (these include carrier proteins), structural or extracellular matrix proteins, antimicrobial proteins, complement factors, coagulation factors, lectins, or proteins of unknown function. The remaining ~40% of genes were identified as likely to encode secreted ligands acting through cell surface receptors and were further examined through literature searches. We classified products of 419 genes as ligands with known receptors that can adequately account for their biology. Finally, we identified 206 gene products either as “orphans” with no identified receptor or as ligands that are likely to have as yet unidentified receptors in addition to those that have been described. From these 206, we ultimately selected a total of 80 high-priority targets (one per gene; we did not consider isoforms generated through alternative splicing) with a broad coverage of molecular function, tissue expression, domain architecture, and disease association (DisGeNet) (Figure S2A, B). Coding sequences for these 80 secreted proteins of interest were synthesized, subcloned into an Avi-6xHIS expression plasmid, expressed in Expi293F cells, purified with Ni-NTA resin, then biotinylated in vitro and further purified by size-exclusion chromatography (SEC) (Figure S8A).

CRISPRa enrichment screens reveal new secreted ligand-receptor interactions.

We obtained sufficiently high expression levels for 20 of the 80 high-priority targets (Figure 2A, S8A) to allow screening using our enrichment workflow (Figure 1E). Names and mRNA expression patterns in normal tissue for these 20 ligands are shown in Figure 2A. Each of the 20 was used to screen the TM1 and TM2+ libraries with up to 3 consecutive rounds of selection, followed by deep sequencing and statistical analysis using the casTLE framework. Screening results of the final round of enrichment for each of the 20 secreted proteins were filtered using the following cut-offs: casTLE-Effect > 2, casTLE-Score > 2, p-value < 0.05. To predict high-confidence interaction pairs from each dataset, a custom score was computed for each potential interaction pair by combining three metrics into one ESP score: (casTLE-Effect + casTLE-Score)/p-value. To integrate data analysis and visualization, we used the combined ESP score to rank sort interaction pairs for every screen.

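The filtering and ranking step described above can be sketched as follows. The cut-offs and the ESP formula are taken from the text; the `Hit` class, the gene names and metric values, and the small floor on the p-value (added here only to avoid division by zero) are assumptions for illustration and do not reproduce the authors' analysis code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Hit:
    gene: str
    effect: float   # casTLE-Effect
    score: float    # casTLE-Score
    p_value: float  # casTLE p-value


def passes_cutoffs(hit: Hit) -> bool:
    """Cut-offs from the text: Effect > 2, Score > 2, p-value < 0.05."""
    return hit.effect > 2 and hit.score > 2 and hit.p_value < 0.05


def esp_score(hit: Hit, p_floor: float = 1e-6) -> float:
    """ESP = (casTLE-Effect + casTLE-Score) / p-value.

    p_floor is an assumption of this sketch, guarding against p-values of 0.
    """
    return (hit.effect + hit.score) / max(hit.p_value, p_floor)


def rank_hits(hits: Iterable[Hit]) -> List[Hit]:
    """Keep hits passing all cut-offs and sort by descending ESP score."""
    kept = [h for h in hits if passes_cutoffs(h)]
    return sorted(kept, key=esp_score, reverse=True)


if __name__ == "__main__":
    # Hypothetical values, not data from the screens.
    hits = [
        Hit("GENE_B", effect=3.0, score=10.0, p_value=0.01),
        Hit("GENE_A", effect=5.0, score=40.0, p_value=0.001),
        Hit("GENE_C", effect=1.5, score=50.0, p_value=0.001),  # fails Effect cut-off
    ]
    for h in rank_hits(hits):
        print(h.gene, round(esp_score(h), 1))
```

Because the p-value is in the denominator, the ESP score is dominated by significance when p-values are very small, which is one reason to inspect enrichment trajectories alongside the ranking.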
We selected a subset of CRISPRa enrichment screens with high-ranking predicted ESP scores for potential interaction pairs for further validation using orthogonal methods, including surface plasmon resonance (SPR) and cell surface staining (CSS). Cell surface staining was utilized as an orthogonal validation method to show PPIs in a cellular context, using a fluorescent tetramerization-based approach with flow cytometry for high-sensitivity detection of putative PPIs on the cell surface. In total, we tested 22 candidate PPIs between the secreted and membrane proteome by SPR and/or CSS from 12 screens with PPIs in both the TM1 and the TM2+ library (Table 1). These validation data are shown for selected PPIs in Figures 3-6 and S6-S7.

In a first-pass analysis we selected the validated hits of the CRISPRa enrichment screens and performed database searches to calculate overlaps between our screening results and the aggregate of the BioGRID, BioPlex, and STRING databases (physical interactions; membrane and secreted proteome). We observed no overlap between any of these databases and the hits reported in this screen (Figure 2B). As we previously reported for interactome screens of Drosophila and human cell surface proteins (Özkan et al., 2013a; Wojtowicz et al., 2020), high-throughput PPI analysis methods such as Y2H and AP/MS generate mostly false positive interactions for secreted and membrane proteins and are unable to identify genuine interactions found through ELISA and/or cell-based screening methods.

Oligodendrocyte-myelin glycoprotein (OMG) binds to multiple receptor tyrosine phosphatases.

Protein tyrosine phosphorylation is a fundamental regulatory step in intracellular signal transduction and is orchestrated in a coordinated fashion by activities of protein tyrosine kinases (PTKs) and phosphatases (PTPs). PTPs play essential roles in the regulation of growth, differentiation, oncogenic transformation, and other processes (Julien et al., 2010). The classical PTPs include cytoplasmic PTPs and transmembrane receptor protein tyrosine phosphatases (RPTPs), which can be classified into distinct subfamilies according to their domain architecture. Most RPTPs display features of cell-adhesion molecules (CAMs) with a domain repertoire including MAM (meprin, A-5) domains, Ig (immunoglobulin-like) domains, and FN (fibronectin) Type III repeats in their extracellular segments (Figure 3M) (Tonks, 2006). In our human in vitro interactome screen, we identified new cell surface binding partners for multiple RPTPs that are likely to mediate cell-cell and/or cell-matrix interactions (Wojtowicz et al., 2020).

In our CRISPRa screen with OMG, we observed the Type R2B subfamily member PTPRU as the top-ranking hit (Figure 3A) with a positive enrichment trajectory over all 3 rounds of selection (Figure 3B). We confirmed binding of OMG to PTPRU by SPR, with a KD of ~2 μM (Figure 3C). We also identified two members of the R2A subfamily, PTPRF and PTPRS, as well as the R4 subfamily member PTPRA, as enriched in the OMG screen (Figure 3D). Type R2A (PTPRD, PTPRF, PTPRS), R2B (PTPRK, PTPRM, PTPRT, PTPRU) and R3 (PTPRB, PTPRH, PTPRJ, PTPRO, PTPRP) are the largest RPTP subfamilies. They all have large ECDs that include FN-III repeats. R2A RPTPs also have Ig domains, and R2B RPTPs have both Ig and MAM domains (Figure 3M). PPIs often occur between phylogenetically related proteins both within and between subfamilies. We examined binding of OMG to all R2A and R2B subfamily members as well as PTPRJ (R3) by SPR. Binding in the micromolar affinity range was observed for all three R2A RPTPs (PTPRD, PTPRF, PTPRS) but only for PTPRU among R2B RPTPs (Figure 3E; S3A). Hierarchical clustering by healthy tissue expression correlations may reveal functionally related communities. We therefore examined healthy tissue mRNA expression profiles for OMG and for R2A, R2B and R3 RPTP family members from the Human Protein Atlas (Karlsson et al., 2021) and performed a multivariate clustering analysis. OMG clustered with several RPTP family members, including binding partners PTPRU and PTPRD (Figure 3F). In the nervous system, these RPTPs are expressed primarily in neurons, and could function as receptors for OMG, which is expressed in oligodendrocytes and some neurons (Figure S3B).

PTPRU binds to Osteocrin, a primate-specific brain ligand.

PTPRU was also identified as a potential hit in a screen for Osteocrin (OSTN). Although our initial ESP ranking showed PTPRJ, an RPTP member of the R3 subfamily, as the top-ranking hit for OSTN (Figure 3G), analyzing the enrichment trajectories over all 3 rounds of selection revealed that PTPRJ actually followed a negative trajectory (Figure 3H). By contrast, the R2B family member PTPRU showed a significant positive enrichment trajectory over the course of the screening workflow compared to PTPRJ (Figure 3H). We therefore analyzed binding of OSTN to a panel of R2A, R2B and R3 RPTP members and found that OSTN exclusively bound to PTPRU, with a KD of ~46 nM (Figure 3I, J, S3C).

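To put the reported affinities in perspective, equilibrium receptor occupancy for a simple 1:1 interaction is [L] / ([L] + KD). The sketch below applies this standard relationship to the KD values quoted in the text (~2 μM for OMG-PTPRU, ~46 nM for OSTN-PTPRU). It assumes monomeric ligand in excess over receptor and does not model the avidity gained from tetramerization; the 100 nM ligand concentration is an arbitrary example value.

```python
def fraction_bound(ligand_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium receptor occupancy for 1:1 binding with ligand in excess."""
    if ligand_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


if __name__ == "__main__":
    ligand = 100e-9  # 100 nM monomeric ligand (example value, an assumption)
    for pair, kd in (("OMG-PTPRU", 2e-6), ("OSTN-PTPRU", 46e-9)):
        print(f"{pair}: KD = {kd:.1e} M, occupancy = {fraction_bound(ligand, kd):.1%}")
```

At the same ligand concentration, the micromolar interaction occupies only a few percent of receptors, which illustrates why multimerized ligands are useful for capturing cells that express low-affinity receptors.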
OSTN (musclin) is a 130 aa peptide hormone that was originally identified in mouse bone and muscle. It regulates bone growth, supports physical endurance and mediates diverse cardiac benefits of physical activity (Subbotina et al., 2015). These actions could be mediated through OSTN’s binding to the natriuretic peptide clearance receptor (NPR-C) (Moffatt et al., 2007). By binding to NPR-C, OSTN decreases clearance of natriuretic peptides and thereby increases signaling through the NPR-A and NPR-B receptors. In primates, however, the OSTN gene has acquired neuron-specific regulatory elements, and primate OSTN is expressed in cortical neurons and is induced by depolarization in in vitro cultures and by sensory stimuli in vivo. OSTN restricts dendritic growth after depolarization. OSTN expression peaks during the onset of synaptogenesis in fetal development, but it continues to be expressed in neocortex in adults (Ataman et al., 2016).

A pairwise correlation of normal human tissue mRNA expression data for OSTN and the R2A, R2B and R3 RPTP subfamilies showed correlation only with PTPRU and its close relative PTPRT (Figure 3K). A hierarchical cluster analysis of human tissue mRNA expression data shows a strong correlation of OSTN and PTPRU in brain and skeletal muscle (Figure 3L; Cluster 3).

The Growth Arrest Specific 1 (GAS1) protein binds to PTPRA.

PTPRA was identified as the highest-ranking hit in the GAS1 screen, with the highest ESP score and a consistent positive enrichment over three rounds of positive selection (Figure 4A, B), followed by the lower-scoring PTPRU and PTPRJ, which showed low enrichment (Figure 4C). PTPRA is a member of the R4 RPTP subfamily (Figure 3M), whose members have short, highly glycosylated ECDs. GAS1 bound exclusively to PTPRA, with a K_D of ~1 μM (Figure 4D). We also showed that tetramerized GAS1 (GAS1:SA647) exhibits increased binding to K562 cells that overexpress PTPRA, demonstrating that GAS1 is a soluble ligand for cell-surface PTPRA (Figure 4E). No binding was observed to PTPRU or PTPRJ (Figure S4A), and multivariate clustering showed that GAS1 clusters more closely with PTPRA than with PTPRU or PTPRJ (Figure 4F).
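As a rough guide to what a K_D of ~1 μM implies, the sketch below evaluates receptor occupancy under a simple 1:1 equilibrium binding model, fraction bound = [L] / ([L] + K_D). This is an assumption-laden illustration rather than the analysis used for Figure 4D: it assumes monomeric ligand in excess over receptor, so it does not describe the avidity-enhanced binding of tetramers, and the function name `fraction_bound` and the example concentrations are hypothetical.

```python
def fraction_bound(ligand_conc_um, kd_um):
    """Equilibrium fraction of receptor occupied in a simple 1:1 binding model.

    Both arguments are in micromolar. Assumes free ligand is approximately
    equal to total ligand (ligand in excess over receptor).
    """
    if ligand_conc_um < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_um <= 0:
        raise ValueError("K_D must be positive")
    return ligand_conc_um / (ligand_conc_um + kd_um)


# Approximate K_D for GAS1 binding to PTPRA as stated in the text (~1 uM).
KD_GAS1_PTPRA_UM = 1.0

# Hypothetical ligand concentrations chosen only to span the K_D.
for conc_um in (0.1, 1.0, 10.0):
    occupancy = fraction_bound(conc_um, KD_GAS1_PTPRA_UM)
    print(f"{conc_um:5.1f} uM ligand -> {occupancy:.2f} of receptor occupied")
```

Under this model, half of the receptor is occupied when the ligand concentration equals K_D, and occupancy approaches saturation only at concentrations well above K_D.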

PTPRA is ubiquitously expressed, while GAS1 has a more restricted expression pattern (Figure 4G). GAS1 and PTPRA are both involved in RET tyrosine kinase signaling, as well as in other signaling pathways. GAS1 is related to the GFR1 family of transmembrane proteins, which are coreceptors for the RET receptor tyrosine kinase (RTK). RET-GFR1 complexes bind to glial-derived neurotrophic factor (GDNF), leading to RET autophosphorylation and activation of
downstream Akt and MAPK signaling pathways. GAS1 interacts directly with RET and recruits it to lipid rafts. GAS1 binding causes a reduction in GDNF-induced Akt phosphorylation, suggesting that it is a negative regulator of RET signaling. PTPRA also associates with RET signaling complexes and can directly dephosphorylate RET, causing inhibition of RET signaling. PTPRA has not been demonstrated to directly bind to RET, however, and a linkage between PTPRA and RET might be provided by GAS1.

We also examined data from the Cancer Genome Atlas (TCGA; http://www.cbioportal.org/) and found that in most tumor types high expression of GAS1 is correlated with negative outcomes, while PTPRA expression is associated with favorable outcomes (Figure S4B).

TAFA-2 selectively interacts with inhibitory Killer-Cell Immunoglobulin-like Receptors (KIRs).

A CRISPRa enrichment screen with TAFA-2 (FAM19A2; chemokine like family member 2) identified two inhibitory killer immunoglobulin-like receptors (KIRs), KIR3DL1 and KIR3DL3, which are selectively expressed on natural killer (NK) cells, as the highest-ranking hits by ESP scoring (Figure 5A). KIRs are a polymorphic subfamily of MHC class I receptors (Li and Mariuzza, 2014; Pende et al., 2019; Sivori et al., 2019). KIR3s have D0, D1, and D2 Ig-like domains. KIR2s have only D1 and D2 domains, except for KIR2DL5, which has a D0 and D2 but lacks D1 (Figure 5C). The structure of KIR3DL1 complexed to an HLA-B reveals that the helices and bound peptide of the HLA engage with the D1 and D2 domains of the KIR, while the D0 domain extends down toward the β-2-microglobulin subunit and engages sequences that are highly conserved among all HLA A and B alleles (Li and Mariuzza, 2014).

We observed binding of TAFA-2 to KIR3DL1 by SPR, with a K_D of ~33 μM (Figure 5B, S5A). Binding to KIR3DL3 was at the limit of detection and did not saturate (Figure S5B). To examine binding at the cell surface, fluorescent TAFA-2 tetramers (TAFA2:SA647) were incubated with NKL cells expressing either full-length KIR3DL1 or KIR2DL1. Flow cytometry analysis revealed concentration-dependent binding of TAFA-2 to cells expressing KIR3DL1, but not to those expressing KIR2DL1 (Figure 5D). To further define KIR binding specificity, we tested binding of TAFA-2 tetramers (TAFA2:SA647) to K562 cells expressing KIR3DL1, KIR3DL2, KIR3DL3, KIR2DL2 or KIR2DL5A. We observed higher concentration-dependent binding of TAFA-2 to cells expressing the KIR3s or KIR2DL5A, which all have D0 domains, compared to KIR2DL2, which
does not have a D0 domain (Figure 5E, F). These data suggest that D0 domains are required for TAFA-2 binding.

TAFA-2 is a member of a highly conserved 5-gene family (TAFA1-5) of chemokine-like peptides (neurokines) expressed in the brain. Like other chemokines, TAFAs 1, 4, and 5 bind to G protein-coupled receptors (GPCRs). TAFAs 1-4 all complex with neurexins during their passage through the ER/Golgi pathway, leading to formation of disulfide-bonded cell-surface neurexin-TAFA complexes (Khalaj et al., 2020; Sarver et al., 2021; Tom Tang et al., 2004). Although TAFA-2 has only been examined in the brain, the gene is also expressed in the immune system. There, its expression is restricted to naïve and memory regulatory T cells (T-regs), basophils, and neutrophils, with the highest expression levels observed in basophils (Figure 5G). Neurexins are not expressed in these cell types, so TAFA-2 may be secreted as a monomer or complexed to another protein. The observed interactions of TAFA-2 with D0 domains of KIRs suggest that expression of the chemokine by regulatory T cells or basophils might modulate KIR signaling in NK cells in response to binding of HLA on target cells.

The Scavenger Receptor CD36 acts as a receptor for a broad range of secreted ligands.

CD36, also known as SCARB3 or glycoprotein 4 (GPIV), is a multifunctional Type B scavenger receptor with two transmembrane domains and an extracellular domain spanning ~410 aa. CD36 is known to bind to many ligands (Silverstein and Febbraio, 2009). In our analysis of TM2+ library screens, we identified CD36 as a top hit in several screens: KRTDAP, LY6H, NRN1, NRN1L, VWC2L and SCRG1 (Figure 6A).
To examine potential binding of these secreted ligands to CD36 on the cell surface, fluorescent tetramers of LY6H, NRN1, VWC2L and SCRG1 were incubated with 293F cells or 293F cells expressing full-length CD36 (Figure 6B). We observed binding of LY6H, NRN1, VWC2L and SCRG1 to CD36 by FACS. SPRC, which showed no enrichment of CD36 in its screen, served as a control and did not bind to CD36 (Figure 6B, C). All secreted ligands whose screens enriched CD36, except for KRTDAP (that is, LY6H, NRN1, NRN1L, VWC2L and SCRG1), show a strong correlation and cluster in brain tissue, which expresses CD36 mRNA only at low levels (Figure 6D). However, CD36 protein is expressed in brain microglia, and CD36-mediated debris uptake regulates brain inflammation in neurodegenerative disease models (Dobri et al., 2021; Grajchen et al., 2020).