RESEARCH
Open Access
© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Guna et al. BMC Genomics (2023) 24:651
https://doi.org/10.1186/s12864-023-09754-y
Introduction
In higher eukaryotes, complex phenotypes are facilitated not only by genetic expansion, but by the combinatorial effects of genes working in concert [1, 2]. Evolutionarily, this complexity affords both genetic redundancy and the ability to undergo rapid cellular adaptation, which ensures phenotypic robustness upon loss or mutation of any particular gene. Indeed, most fundamental processes are buffered by components with partially overlapping function, including protein quality control (e.g. protein folding chaperones and E3 ubiquitin ligases), cellular stress response (e.g. the heat shock response and the ubiquitin-proteasome system), and protein biogenesis (e.g. targeting and insertion into the endoplasmic reticulum [ER]) [3–7]. However, this creates technical challenges for genetically interrogating biological pathways and assigning gene function in mammalian cells. For example, only ~1/4 of the ~10,000 genes expressed in a typical cell produce any detectable growth phenotype when depleted [8–11].
Alina Guna and Katharine R. Page contributed equally to this work.
*Correspondence: Rebecca M. Voorhees, voorhees@caltech.edu
1 Division of Biology and Biological Engineering, California Institute of Technology, 1200 E. California Ave, Pasadena, CA 91125, USA
2 Whitehead Institute for Biomedical Research, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
3 Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA 94158, USA
4 Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
5 Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
6 David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
7 Howard Hughes Medical Institute Freeman Hrabowski Scholar, California Institute of Technology, Pasadena, CA 91125, USA
Abstract
Mapping genetic interactions is essential for determining gene function and defining novel biological pathways. We report a simple-to-use CRISPR interference (CRISPRi)-based platform, compatible with Fluorescence Activated Cell Sorting (FACS)-based reporter screens, to query epistatic relationships at scale. This is enabled by a flexible dual-sgRNA library design that allows for the simultaneous delivery and selection of a fixed sgRNA and a second randomized guide, drawn from a genome-wide library, with a single transduction. We use this approach to identify epistatic relationships for a defined biological pathway, showing greater sensitivity and specificity than traditional growth screening approaches.
Keywords
CRISPR interference, Genetic modifier, Epistasis, Genome-wide screen, ER membrane protein complex,
Tail-anchored proteins
A dual sgRNA library design to probe genetic modifiers using genome-wide CRISPRi screens
Alina Guna1,2†, Katharine R. Page1†, Joseph M. Replogle2,3,4, Theodore K. Esantsi2,4, Maxine L. Wang1,2, Jonathan S. Weissman2,4,5,6 and Rebecca M. Voorhees1,7*
To address these challenges, genetic modifier screens have traditionally been a powerful tool for defining gene function, identifying missing components of known pathways, establishing disease mechanisms, and pinpointing new drug targets [12–20]. Forward genetic modifier screens rely on genetic ‘anchor points’ as a baseline for determining whether subsequent mutations, generally induced through random mutagenesis, result in buffering or synthetic phenotypes. In practice, this ‘anchor’ is established in a model organism or cell, often requiring extensive manipulation to generate a specific knockout in either organisms or cells, or isogenic mutant cell lines [21–23]. Apart from being technically cumbersome, classic forward approaches lack the ability to systematically assess genetic interactions on a genome-wide scale. The advent of CRISPR-based techniques has expanded this ability by allowing for (i) the generation of specific genetic perturbations in the form of knockouts or knockdowns and (ii) the performance of unbiased genome-wide forward genetic screens to identify the genetic basis of an observed phenotype.
The majority of genetic modifier screens in human cells leverage a CRISPR cutting-based approach [24–28]. However, Cas9-mediated DNA cutting is toxic to cells because it activates the DNA damage response, which is fundamentally problematic for genetic interaction analysis where multiple genomic sites are targeted [29, 30]. Additionally, cells readily adapt and compensate for loss-of-function mutations over time, diminishing observed phenotypes when isogenic knockout cell lines are required [31]. Moreover, a genetic knockout approach is often not amenable to the study of essential genes. A more acute strategy, CRISPR interference (CRISPRi), circumvents many of these issues and offers several advantages, notably the ability to create homogeneous, titratable knockdown of genes without generating double-stranded DNA breaks [32]. CRISPRi relies on a catalytically dead Cas9 (dCas9) fused to a repressor domain, which, when guided by an sgRNA targeted to a particular promoter, results in the recruitment of endogenous modulators that lead to epigenetic modifications and subsequent gene knockdown [33–36].
We therefore envision that a strategy to query epistatic relationships acutely and systematically at scale, compatible with the sensitive phenotypic read-out afforded by a fluorescent reporter, would be a powerful tool for assigning genetic function. Towards this goal, we coupled existing CRISPRi technology with a simple and flexible dual-sgRNA library design that is compatible with multi-color FACS-based reporter screens. Our library design, which acutely delivers both a genetic ‘anchor point’ guide and a second randomized guide in a single plasmid, allows us to perform genetic modifier screens for essential and non-essential genes on a genome-wide scale. As a proof of principle, we applied this approach to dissecting the complex parallel pathways that mediate tail-anchored protein insertion into the endoplasmic reticulum (ER). This approach will be broadly applicable for (i) identifying functional redundancy, (ii) assigning factors to parallel or related biological pathways, and (iii) systematically revealing genetic interactions on a genome-wide scale for a given biological process.
Results
Dual sgRNA library design and construction
We developed a strategy to construct and deliver, at scale, a library containing a fixed pre-determined guide, our genetic anchor point, together with a second randomized CRISPRi guide from a single lentiviral backbone (Fig. 1A). The basis of our second guide is the CRISPRi-v2 library, a compact, validated library of 5 sgRNAs per gene targeting protein-coding genes in the human genome [35]. Ease of use was a primary focus of the library design, which we addressed by (i) ensuring library construction relied on straightforward and inexpensive restriction enzyme cloning, (ii) developing a sequencing strategy that serves as a failsafe to ensure both guides are present, eliminating potential background, and (iii) designing the library such that the resulting data could be analyzed using an existing computational pipeline.
A necessary requirement of our library design is identification of a pre-verified sgRNA that efficiently targets and depletes the gene of interest. This guide is first introduced by standard restriction enzyme cloning into a cassette carrying a human U6 (hU6) promoter and constant region 3 protospacer (CR3), the hU6-CR3 cassette, using sgRNA DNA oligos that can be inexpensively synthesized and purchased (Fig. 1B). Using complementary restriction enzyme sites, the resulting hU6-CR3 cassette is ligated into the CRISPRi-v2 library at scale, resulting in an mU6-CR1-hU6-CR3 guide design [31, 37]. As in the single element CRISPRi-v2 library, BFP and puromycin resistance genes are constitutively expressed, acting as fluorescent and selectable markers to identify guide-containing cells.
Sequencing of the resulting library couples standard barcoded 5’ CRISPRi-v2 index primers with a new reverse primer complementary to the hU6 region, thereby only amplifying vectors containing the fixed sgRNA insert. This is important because library construction can produce a small fraction of vectors (we estimate <2%) that lack the fixed guide. Additionally, because this cloning strategy involves restriction enzyme digest of the CRISPRi-v2 library, a small number of guides that contain these cut sites are lost (~1%, see Supplementary Table 1).
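The expected loss of guides that carry a cut site can be estimated in advance by scanning protospacer sequences for the relevant recognition sequences. The following is a minimal sketch, assuming the enzymes are BamHI (GGATCC) and NotI (GCGGCCGC) as named in the Fig. 1 cloning scheme, and assuming the library is available as a mapping of guide ID to protospacer. The function names, guide IDs, and sequences are hypothetical, and the sketch does not consider sites created at the junction between a guide and the flanking vector sequence.

```python
# Recognition sequences for the two enzymes named in the Fig. 1 cloning scheme.
# Both sites are palindromic, so scanning a single strand is sufficient.
RESTRICTION_SITES = {"BamHI": "GGATCC", "NotI": "GCGGCCGC"}


def sites_in_guide(protospacer: str) -> list[str]:
    """Return the enzymes whose recognition site occurs in the protospacer."""
    seq = protospacer.upper()
    return [name for name, site in RESTRICTION_SITES.items() if site in seq]


def split_library(guides: dict[str, str]) -> tuple[dict[str, str], dict[str, list[str]]]:
    """Partition a {guide_id: protospacer} mapping into retained and lost guides."""
    retained: dict[str, str] = {}
    lost: dict[str, list[str]] = {}
    for guide_id, seq in guides.items():
        hits = sites_in_guide(seq)
        if hits:
            lost[guide_id] = hits  # guide would be cut during library digestion
        else:
            retained[guide_id] = seq
    return retained, lost


# Hypothetical toy library; IDs and sequences are invented for illustration.
toy_library = {
    "guide_A": "GACCTGAAGTTCATCTGCAC",
    "guide_B": "GTTGGATCCATTGCAGAACG",  # contains GGATCC (BamHI)
    "guide_C": "GCGGCCGCTTAACGTTAGCA",  # contains GCGGCCGC (NotI)
}
retained, lost = split_library(toy_library)
print(sorted(retained), lost)
```

Applied to a full guide table, the fraction of entries in `lost` gives a rough estimate that could be compared against the reported ~1% loss.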
Putative use of this dual sgRNA library for genetic modifier screening
To test this procedure, we first generated a library with a verified ‘non-targeting’ sequence as the fixed guide. Comparison with the standard CRISPRi-v2 library shows that we maintain similar guide coverage across the genome after accounting for expected loss of the restriction site-containing guides (Figure S1A) [35]. The resulting sgRNA library allows for the acute knockdown of two separate targets without the need for additional selection markers, which simplifies both growth screens and the more sensitive fluorescent reporter-based flow cytometry screens. This design also removes the need to first make a cell line constitutively expressing a targeting or non-targeting sgRNA, thereby ensuring both the gene of interest and the genome-wide library are knocked down for the same period of time, diminishing the possibility of adaptation. Our library design is therefore compatible with a workflow that permits querying epistatic relationships with a variety of phenotypic readouts in any cells expressing the CRISPRi machinery (Fig. 1C).
Fig. 1 Dual-guide library design and construction. (A) Schematic of the dual sgRNA vector. Expression of the randomized CRISPRi-v2 sgRNA is driven by a mU6 promoter and the fixed guide is driven by a hU6 promoter, each flanked by unique guide constant regions (CR). Downstream, the EF1a promoter drives the expression of the puromycin resistance selectable marker and BFP. (B) Cloning a dual genome-wide library comprises two steps. First, a guide of interest is inserted using standard oligo annealing and ligation into a BstXI/BlpI cut backbone. Second, both the CRISPRi-v2 library and the fixed guide are digested at complementary restriction sites (BamHI/NotI) and ligated at scale, resulting in an mU6-‘V2 guide’-hU6-‘fixed guide’ library design. To sequence the resulting library, a standard 5’ indexed primer is coupled with a reverse primer that anneals to the hU6 region upstream of the inserted fixed guide. This strategy ensures only guides containing the fixed region are amplified for sequencing. (C) A general workflow for using our library design in any cell containing the CRISPRi machinery

To test for genetic interactors at scale, one would conduct a screen using both the non-targeting library we have generated (available from Addgene, Library #197348) and a second library targeting a validated genetic ‘anchor point’ for the pathway of interest. Comparison of the results of these two screens, in the presence or absence of a characterized pathway component, will uncover factors and place them in their respective pathways. We expect three possibilities. (i) Enhanced phenotypes in the ‘anchor point’ screen suggest synthetic effects, which would be indicative of factors in a parallel pathway, or a ‘synergistic’ effect. (ii) In contrast, diminished phenotypes in the anchor point screen would suggest factors in the same pathway. (iii) Finally, factors with phenotypes independent of our genetic ‘anchor point’ likely represent orthogonal genes.
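The three expected outcomes can be expressed as a simple comparison of one gene's phenotype in the two screens. The sketch below is illustrative only: it assumes each gene has a signed phenotype score (for example a log2 enrichment) in both the non-targeting and the anchor screen, and it uses a hypothetical `threshold` to decide whether the magnitude has changed. Neither the cutoff nor the function name comes from the source, and a real analysis would also account for statistical confidence and for changes in sign.

```python
def classify_interaction(nt_phenotype: float, anchor_phenotype: float,
                         threshold: float = 0.5) -> str:
    """Label a gene by how its phenotype magnitude changes with the anchor guide.

    `threshold` is a hypothetical cutoff on the change in absolute phenotype.
    """
    change = abs(anchor_phenotype) - abs(nt_phenotype)
    if change > threshold:
        # (i) stronger phenotype when the anchor gene is also depleted
        return "enhanced: candidate parallel pathway (synthetic)"
    if change < -threshold:
        # (ii) phenotype largely disappears in the anchor background
        return "diminished: candidate same pathway"
    # (iii) phenotype does not depend on the anchor
    return "unchanged: likely independent of the anchor"


# Invented example values, one per expected category.
print(classify_interaction(-1.0, -2.5))
print(classify_interaction(-1.5, -0.1))
print(classify_interaction(1.2, 1.3))
```

Comparing magnitudes rather than signed values keeps the sketch short, at the cost of ignoring a phenotype that reverses direction between the two screens.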
Developing a reporter assay to assess tail-anchored (TA) protein insertion at the endoplasmic reticulum (ER)
As a proof of principle, we tested the utility of our dual library by interrogating genetic interactors using a biological system known to contain at least two partially redundant pathways: tail-anchored membrane protein biogenesis. Tail-anchored proteins (TAs) carry out essential functions including vesicle trafficking, organelle biogenesis, and cell-to-cell communication [38]. This family of integral membrane proteins is characterized by a single transmembrane domain (TMD) within 30–50 amino acids of the C terminus [39]. The proximity of the TMD to the stop codon necessitates that TAs be targeted and inserted into the membrane post-translationally. Though found in all cellular membranes, the majority of TAs are targeted to the ER using two parallel pathways: the Guided Entry of Tail-anchored protein (GET) and ER membrane protein complex (EMC) pathways [38, 40–42].
In mammalian cells, the central components of the GET system are the targeting factor GET3 and the ER-resident insertase composed of the heterooligomeric GET1/GET2 complex [43, 44]. The EMC pathway relies on targeting by the cytosolic chaperone calmodulin to the nine-subunit EMC insertase [42]. The dependency of a particular TA on either set of factors is determined by the hydrophobicity of its TMD, with more hydrophobic substrates relying on the GET pathway, and less hydrophobic substrates relying on the EMC [42, 45–47]. However, TAs of intermediate hydrophobicity can utilize both pathways for targeting and insertion into the ER, potentially obscuring genetic relationships [42]. We therefore reasoned that our dual-guide screening platform would be ideally suited to identify epistatic relationships between factors in these two pathways (Fig. 2A).
To assess TA biogenesis using a FACS-based approach, we adapted a fluorescent split GFP reporter system to specifically query insertion into the ER (Fig. 2B). For our reporter substrate, we chose SEC61β, an ER-localized TA that normally forms part of the heterotrimeric Sec61 translocation channel (along with Sec61α and Sec61γ) [2, 48–50]. SEC61β contains a TMD of intermediate hydrophobicity and is known to use both the EMC and GET pathways for biogenesis [42]. We constitutively expressed the first 10 β-strands of GFP (GFP1-10) in the ER lumen and appended the 11th β-strand onto the C terminus of the endogenous sequence of SEC61β (SEC61β-GFP11) [51, 52]. Successful insertion of SEC61β into the ER membrane would therefore result in complementation (GFP11 + GFP1-10) and GFP fluorescence. To generate cell lines compatible with screening, we engineered K562 cells to stably express ER GFP1-10 and the dCas9-KRAB(Kox1) CRISPRi machinery. Under an inducible promoter, we integrated the SEC61β-GFP11 reporter alongside a normalization marker (RFP) separated by a viral 2A sequence (Figure S1B). Expression of both the TA and RFP from the same open reading frame allows us to use the GFP:RFP ratio to identify factors involved in TA biogenesis while discriminating against those that have a non-specific effect on protein expression levels (i.e. transcription or translation).
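The logic of the ratiometric readout can be illustrated with a small numerical sketch. It assumes per-cell GFP and RFP intensities are available as pairs of positive numbers; the function names and all values are hypothetical and are not taken from the screen data. A perturbation that lowers overall expression scales both channels and leaves the GFP:RFP ratio unchanged, whereas an insertion defect lowers GFP alone.

```python
import math
from statistics import median


def median_log2_ratio(cells: list[tuple[float, float]]) -> float:
    """Median log2(GFP/RFP) across cells given as (gfp, rfp) intensity pairs."""
    return median(math.log2(gfp / rfp) for gfp, rfp in cells)


def ratio_shift(sample: list[tuple[float, float]],
                control: list[tuple[float, float]]) -> float:
    """Change in median log2(GFP/RFP) of a sample relative to a control population."""
    return median_log2_ratio(sample) - median_log2_ratio(control)


# Invented intensities for three populations of three cells each.
control = [(1000.0, 500.0), (2000.0, 1000.0), (1500.0, 750.0)]
# Global expression defect: GFP and RFP are both halved, ratio is preserved.
expression_defect = [(500.0, 250.0), (1000.0, 500.0), (750.0, 375.0)]
# Insertion defect: GFP is halved while RFP is unaffected.
insertion_defect = [(500.0, 500.0), (1000.0, 1000.0), (750.0, 750.0)]

print(ratio_shift(expression_defect, control))  # no shift in the ratio
print(ratio_shift(insertion_defect, control))   # ratio drops by one log2 unit
```

Working in log space makes a two-fold change in either direction symmetric, which is one reasonable choice for gating but is an assumption here rather than the authors' stated procedure.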
Interrogating TA insertion into the ER using dual sgRNA libraries
To permit screening with our dual-guide library design, we constructed a library using a previously validated EMC2 sgRNA as our ‘fixed’ guide (Figure S1C; Addgene Library #197349). EMC2 is a core, soluble subunit of the EMC, whose depletion leads to the post-translational degradation of the entire EMC via the ubiquitin-proteasome system [53, 54]. Therefore, targeting EMC2 is sufficient to disrupt the EMC pathway for TA insertion, and serves as our ‘genetic anchor’. Using programmed dual guides in our reporter cell line, we confirmed that loss of both the EMC and GET2 resulted in a synergistic effect on SEC61β insertion (Fig. 2C). The enhanced effect of loss of GET2 in an EMC knockdown background validates the conceptual premise of our dual-guide screening approach at scale.
We therefore separately used both the EMC2 and a non-targeting (NT) control library to transduce our K562 SEC61β reporter cell line, isolated cells that had perturbed GFP:RFP ratios by FACS, and identified the associated guides by deep sequencing. In parallel, for comparison, we conducted a traditional growth screen with both the NT and EMC2 libraries in uninduced K562 SEC61β-GFP11 reporter cell lines (Figure S2A, Supplementary Table 2). As expected, in the NT-FACS screen loss of GET pathway components (GET2, GET3, and GET1) and all EMC subunits led to decreased SEC61β-GFP fluorescence, consistent with their established role in TA biogenesis. However, the EMC2-FACS screen showed markedly different results indicative of the genetic relationships between the EMC and GET pathway components (Fig. 3A). First, when depleted on top of EMC2, the phenotypic effect of loss of the main GET pathway factors is enhanced when compared to the NT screen. Second, the majority of guides targeting EMC subunits no longer have significant effects on SEC61β-GFP, consistent with being in the same complex, and therefore same pathway, as EMC2 [53, 54]. The exceptions are EMC2 itself, likely because two guides targeting the same gene lead to a greater degree of knockdown, and EMC10, which has been suggested to have a separate regulatory role in TA biogenesis [55]. Conversely, in both screens we also identified several novel ER-resident factors (RNF185, TMEM259 and FAF2) whose depletion leads to increased stability of SEC61β-GFP. Presumably, these putative quality control factors are responsible for recognizing and degrading over-expressed SEC61β from the membrane, but are agnostic to which biogenesis pathway was initially used for its insertion.
Fig. 2 Querying tail-anchored (TA) protein biogenesis at the endoplasmic reticulum (ER). (A) (Left) TA proteins can be inserted into the lipid bilayer by either the EMC or GET insertases. (Right) TAs containing a moderately hydrophobic transmembrane domain such as SEC61β can use either EMC or GET1/GET2 to insert, obscuring strong effects on insertion when obstructing only one of these partially redundant pathways. Therefore, use of an EMC2 fixed guide dual library should uncover defined epistatic relationships between factors in either the GET or EMC pathways. (B) Schematic of the split GFP reporter system used to assess insertion of SEC61β into the ER. K562 cells expressing CRISPRi machinery were engineered to constitutively express GFP1-10 in the ER lumen. The 11th β-strand of GFP is fused to the C-terminus of SEC61β, allowing for conjugation and fluorescence of the full GFP upon insertion into the ER membrane. RFP is expressed as a normalization marker, separated by a viral P2A sequence. (C) Depletion of EMC and GET pathway components in the SEC61β reporter cell line. The SEC61β cell line was separately transduced with dual guides targeting EMC2 alone, GET2 alone, EMC2 and GET2, or a non-targeting control. The GFP:RFP ratio, a measure of SEC61β insertion at the ER, is plotted for each dual guide

Fig. 3 Dual-guide CRISPRi screen with SEC61β reveals genetic interactions between GET and EMC pathway components. (A) Volcano plot illustrating the phenotype for the two strongest guide RNAs versus log10 (Mann-Whitney p-values) from two independent replicates of a genome-wide screen with either non-targeting dual (NT) or EMC2-dual libraries using the SEC61β-GFP11 reporter. Individual genes are displayed in gray, core factors of the GET pathway are highlighted in green, EMC subunits are highlighted in black, while putative stabilization factors are in pink. (B) A single discriminant score was computed for each gene in the screens investigating SEC61β-GFP11 stability, representative of the average phenotype score and significance of the hit in the respective screen. This metric allows direct comparison of both NT-dual and EMC2-dual screens. (C) Comparison of genes ranked by discriminant score in NT and EMC2-dual screens

To facilitate comparison of screens for identification of genetic interactors, we calculated a discriminant score for each gene, which integrates the statistical confidence and phenotype into a single value, as previously described [56]. A similar strategy is routinely used to separate statistically significant from non-significant hits when analyzing genome-wide screens using volcano plots [35]. Genes are then ranked by their discriminant scores and the change in rank between the two screens is calculated. This allowed us to visualize the effects of a specific gene on SEC61β stability in the absence or presence of EMC2 (Fig. 3B). Comparison of the NT and EMC2 genome-wide FACS screens using the discriminant score highlighted the three broad categories of factors we anticipated: members of the GET pathway, which show a synthetic effect with EMC; members of the EMC pathway, which effectively ‘drop out’ in the absence of EMC2; and factors that operate orthogonally from both pathways and are therefore unchanged in the two conditions (Fig. 3C).
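The ranking comparison described above can be sketched as follows. The exact formula for the discriminant score is given in the cited reference rather than in this text, so the form used here, the absolute phenotype multiplied by the negative log10 of the p-value, is an assumption chosen only because it combines effect size and confidence in one number. The gene labels and all values are hypothetical.

```python
import math


def discriminant_score(phenotype: float, p_value: float) -> float:
    """Assumed score: |phenotype| * -log10(p). The published formula may differ."""
    return abs(phenotype) * -math.log10(p_value)


def rank_genes(screen: dict[str, tuple[float, float]]) -> dict[str, int]:
    """Rank genes (1 = strongest) from a {gene: (phenotype, p_value)} mapping."""
    scores = {gene: discriminant_score(ph, p) for gene, (ph, p) in screen.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def rank_change(nt_screen: dict[str, tuple[float, float]],
                anchor_screen: dict[str, tuple[float, float]]) -> dict[str, int]:
    """NT rank minus anchor rank; positive means the gene rises in the anchor screen."""
    nt_rank, anchor_rank = rank_genes(nt_screen), rank_genes(anchor_screen)
    return {g: nt_rank[g] - anchor_rank[g] for g in nt_rank if g in anchor_rank}


# Invented (phenotype, p-value) pairs for three hypothetical genes.
nt = {"gene_parallel": (-1.0, 1e-3), "gene_same": (-2.0, 1e-4), "gene_indep": (1.5, 1e-3)}
anchor = {"gene_parallel": (-3.0, 1e-5), "gene_same": (-0.2, 0.5), "gene_indep": (1.5, 1e-3)}

print(rank_change(nt, anchor))
```

In this toy example the parallel-pathway gene rises in rank, the same-pathway gene falls, and the independent gene keeps its position, mirroring the three categories described in the text.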
To confirm a subset of the observations predicted by our reporter-based screens, we conducted arrayed assays with programmed dual guides. Using our SEC61β-GFP11 reporter, we show that depletion of both EMC2 and GET3 has an enhanced effect on biogenesis compared to obstructing either pathway individually. This effect is likely specific to substrates of intermediate TMD hydrophobicity, as squalene synthase (SQS), a TA with known EMC dependency, is only affected in the absence of EMC2 (Fig. 4A) [53]. Additionally, depletion of the putative quality control components RNF185, TMEM259 or FAF2 affects the stability of SEC61β (Fig. 4B), but not SQS or the GET substrate VAMP (Figure S3A). Indeed, RNF185 and TMEM259 have been recently identified as members of a novel arm of ER-associated degradation (ERAD), while FAF2 has been previously associated with ERAD [57–59].
To illustrate the efficacy of our strategy, we compared the results of our FACS-based dual-guide library screen to a more traditional growth screening approach. Using growth as the metric, there is no increased genetic reliance on GET pathway components in the absence of EMC2 (Figure S2B, Supplementary Table 3). This is consistent with the observation that a substantial number of genes with transcriptional phenotypes have negligible growth phenotypes [37]. A significant number of hits both in the presence and absence of EMC2 are essential genes, precluding the detection of significant factors in the context of a particular biological pathway (Figure S2C). Given these results, hits identified from the growth screening approach would be particularly prone to off-pathway false positives and false negatives, necessitating substantially more follow-up to identify bona fide genetic interactors of the EMC. If we assume no previous knowledge of the relationship between the EMC and GET pathways, the growth-based approach clearly fails to identify genetic interactions that are crucial to elucidating the biological function of the EMC, thereby illustrating the efficacy and potential utility of our dual-guide screening approach.
Discussion
We have developed a flexible, straightforward strategy to rapidly assess genetic interactions genome-wide with high efficiency. Successful implementation of this approach does require sufficient prior knowledge of the pathway or candidate gene of interest both to identify the fixed guide and to design and validate an appropriate fluorescent reporter. However, the dual-guide strategy offers several practical advantages over existing genetic modifier screening strategies. Our approach eliminates the need to create and characterize a knockout line for a particular gene of interest [25, 29, 60–62]. It also allows for the simultaneous delivery and selection of both targeted and genome-wide elements, resulting in less cell line
Fig. 4 Validating effects of factors on TA biogenesis. (A) Integration of the TA proteins SEC61β-GFP11 or SQS-GFP11 into the ER was assessed in K562 cells that expressed the indicated programmed dual guides. GFP fluorescence is shown relative to a normalization marker (RFP) as determined by flow cytometry, and the results displayed as a histogram. (B) Biogenesis of SEC61β-GFP11 was assessed as in (A) with the presence of guides targeting the indicated genetic targets