Nucleic Acids Research, 2023, Vol. 51, No. 8, 4086–4099
Published online 29 March 2023
https://doi.org/10.1093/nar/gkad191
Specific targeting of plasmids with Argonaute enables genome editing
Daria Esyunina 1,2,*,†, Anastasiia Okhtienko 2,†, Anna Olina 2, Vladimir Panteleev 1,2,3, Maria Prostova 1,2, Alexei A. Aravin 4 and Andrey Kulbachinskiy 1,2,*
1 Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia
2 Institute of Molecular Genetics, National Research Center “Kurchatov Institute”, Moscow 123182, Russia
3 Moscow Institute of Physics and Technology, Dolgoprudny, Moscow region, 141700, Russia
4 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
Received August 15, 2022; Revised February 28, 2023; Editorial Decision March 01, 2023; Accepted March 23, 2023
ABSTRACT
Prokaryotic Argonautes (pAgos) are programmable nucleases involved in cell defense against invading DNA. In vitro, pAgos can bind small single-stranded guide DNAs to recognize and cleave complementary DNA. In vivo, pAgos preferentially target plasmids, phages and multicopy genetic elements. Here, we show that CbAgo nuclease from Clostridium butyricum can be used for genomic DNA engineering in bacteria. We demonstrate that CbAgo loaded with plasmid-derived guide DNAs can recognize and cleave homologous chromosomal loci, and define the minimal length of homology required for this targeting. Cleavage of plasmid DNA at an engineered site of the I-SceI meganuclease increases guide DNA loading into CbAgo and enhances processing of homologous chromosomal loci. Analysis of guide DNA loading into CbAgo also reveals off-target sites of I-SceI in the Escherichia coli genome, demonstrating that pAgos can be used for highly sensitive detection of double-stranded breaks in genomic DNA. Finally, we show that CbAgo-dependent targeting of genomic loci with plasmid-derived guide DNAs promotes homologous recombination between plasmid and chromosomal DNA, depending on the catalytic activity of CbAgo. Specific targeting of plasmids with Argonautes can be used to integrate plasmid-encoded sequences into the chromosome, thus enabling genome editing.
INTRODUCTION
Prokaryotic Argonaute proteins (pAgos) are ancestors of eukaryotic Argonautes (eAgos), which act as central components of eukaryotic RNA interference and are involved in silencing of foreign elements and gene regulation (1–6). Intriguingly, while eAgos bind small RNA guides to recognize and silence RNA targets, almost all known pAgos preferentially target DNA (6–8), except for a small group of RNA-targeting pAgos found recently (9,10). Nucleic acid cleavage by Argonautes depends on an RNaseH-like nuclease site in the PIWI domain, with the site of cleavage located between the 10th and 11th guide nucleotides (1–6), although some pAgos can also produce products with shifted cleavage sites (10,11). Analysis of pAgos from several thermophilic and mesophilic prokaryotes demonstrated that they can be programmed with small (∼18–20 nt) DNA or RNA guides to cleave single-stranded DNA with high specificity (11–20). However, their activity toward double-stranded DNA in vitro is limited due to their inability to unwind DNA strands (14,16,19). Cellular functions of pAgos and their ability to target double-stranded nucleic acids in vivo have remained largely unknown. Recently, CbAgo from C. butyricum was shown to defend a heterologous Escherichia coli host from bacteriophage infection (21), and several pAgos, including CbAgo, were found to preferentially target plasmid DNA, suggesting that they function in cell defense against invaders (11,18,19,21).
In bacterial cells, CbAgo is primarily loaded with small guide DNAs (smDNAs) of plasmid origin and can use them to recognize and cleave homologous genomic loci in the process of DNA interference (21). CbAgo also attacks naturally occurring or engineered double-strand breaks (DSBs) in cellular DNA (21). Preferential targeting of plasmids, bacteriophages and other multicopy elements by pAgos is likely explained by their intense replication, high copy numbers and high frequency of DSBs, resulting in their efficient processing by a combined action of pAgo and the cellular helicase-nuclease RecBCD, which plays the central role in DNA repair and recombination in E. coli (21).
* To whom correspondence should be addressed. Tel: +7 4991960015; Fax: +7 4991960015; Email: es_dar@inbox.ru
Correspondence may also be addressed to Andrey Kulbachinskiy. Email: avkulb@yandex.ru
† The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.
© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
RecBCD rapidly degrades foreign DNA but promotes repair of genomic DNA through homologous recombination after recognition of Chi (crossover hot-spot instigator) sites in the host genome, which switch the RecBCD activity from DSB processing to RecA loading onto the 3′-terminated DNA strands (22,23). The distribution of smDNAs bound by CbAgo at its target regions in genomic DNA is dependent on Chi sites, indicating the involvement of RecBCD in smDNA generation during DSB processing (21).
Here, we sought to elucidate the requirements for genomic DNA cleavage by CbAgo to reveal whether it can be used for genome engineering. We used CbAgo to introduce targeted DNA breaks at specific loci of the bacterial chromosome with plasmid-derived guide smDNAs, defined the homology requirements for genome cleavage by CbAgo, and showed that CbAgo can induce highly efficient integration of plasmid-encoded sequences into the homologous chromosomal locus.
MATERIALS AND METHODS
Plasmids and strains
Plasmids and strains used in this study are listed in Supplementary Information Tables S4 and S5. E. coli cultures were cultivated in LB Miller broth (2% tryptone, 0.5% yeast extract, 1% NaCl) with the addition of appropriate antibiotics (ampicillin, Amp, 200 µg/ml; spectinomycin, Sp, 50 µg/ml; chloramphenicol, Cm, 25 µg/ml), 1% D-glucose (Glc), 0.1% L-arabinose (Ara) and 200 ng/ml anhydrotetracycline, when indicated; 1.8% agar was added for plate preparation. The NEB Turbo strain was used for cloning; the MG1655 and DE160 strains were used for most experiments (Supplementary Table S5). The DE160 strain was obtained after transfer of the Z1 cassette with an inserted CbAgo gene under control of the tetP promoter from the donor strain DE157 (21) to the recipient strain DL2988 containing I-SceI under control of the araBAD promoter, using P1 transduction. The MG1655 recB-minus strain was obtained by Red-assisted recombination with PCR products, followed by removal of the antibiotic selection marker using the pKD46 and pCP20 plasmids (24).
To obtain pBAD_CbAgo_lacI plasmids containing lacI fragments of different lengths, the expression vector pBAD_CbAgo (21) was digested with SphI in the presence of Shrimp Alkaline Phosphatase (rSAP, NEB). LacI fragments were PCR-amplified from the pET28 plasmid using primers containing SphI cut sites on both ends, digested with SphI, gel-purified and cloned into the pBAD_CbAgo vector using T4 ligase. Colony PCR was used to pick up clones with insertions in the correct orientation. The cloned gene contained no promoter region. Recoding of the lacI gene was performed manually by making point substitutions without changing the protein sequence (Supplementary Figure S5). Recoded lacI was synthesized as an IDT gBlock and cloned into the pBAD_CbAgo backbone as described above. pBAD_yffP_CS_I-SceI plasmids containing the cut site (CS) of I-SceI were obtained using the Gibson assembly reaction from PCR products corresponding to the pBAD backbone without genomic DNA homology regions (araC, araBAD promoter, rrn terminators) and to the 300 bp yffP gene or a larger 1000 bp fragment of the yffN-yffO-yffP operon amplified from the genomic DNA of DE160. The I-SceI cut site was introduced 51 nt upstream of the yffP sequence in primers used for PCR amplification. A mutant variant of the site with two nucleotide substitutions, which is cleaved less efficiently (TTGGGATAACAGGGTAAA) (25), was used to avoid complete plasmid degradation. Plasmids pBAD_lacI and pBAD_dYK_CbAgo_lacI (encoding CbAgo with substitutions Y472A, K476A in the MID pocket) were obtained by PCR with overlapping primers using pBAD_CbAgo_lacI as a template; the PCR product was introduced into E. coli cells by transformation, resulting in plasmid circularization in vivo. pBAD_dDD_CbAgo_lacI was obtained by excision of the catalytically dead CbAgo gene (with substitutions D541A, D611A) from pBAD_dCbAgo (21) using NcoI and EcoRI and cloning it into pBAD_CbAgo_lacI treated with the same enzymes. The pDE351 plasmid used in recombination assays was obtained using the Gibson assembly reaction with HiFi mastermix (NEB) from 5 PCR products: (1) temperature-sensitive pSC101 ori from the pKD46 plasmid; (2) SpR gene from the pSyn6 vector (GeneArt, Thermofisher); (3) 4144 bp left homology arm including the mhpR, mhpA, mhpB, mhpC genes amplified from genomic DNA of MG1655; (4) the cat gene; (5) 5987 bp right homology arm including cynX, lacA, lacY, lacZ genes amplified from genomic DNA of MG1655.
CbAgo expression
E. coli MG1655 were transformed with plasmids from the pBAD_CbAgo series with inserted lacI fragments and incubated overnight at 37 °C on ampicillin LB agar plates. On the next day, several colonies of each strain were inoculated into 1 L of LB media supplemented with Ara for CbAgo induction and Amp and grown for 14–20 h at 18 °C until OD600 = 1. E. coli DE160 were transformed with plasmids from the pBAD_yffP_CS_I-SceI series, and the cells were grown in liquid media in the same conditions with the addition of anhydrotetracycline (200 ng/ml) for CbAgo induction and 0.1% Ara for I-SceI induction. The cells were collected by centrifugation (6000 g, 6 min at 4 °C) and the pellets were frozen at −20 °C.
To quantify the level of CbAgo expression, the CFU (colony forming unit) numbers were measured for the MG1655 and DE160 transformants at OD600 = 1, and aliquots of the same cell cultures were used for western blotting, using anti-His-tag antibodies (Sigma) and His-tagged RsAgo as a loading control. The number of protein molecules per cell was calculated using the following equation: $N = (n_{\mathrm{western}} \times N_A) / (\mathrm{DF} \times \mathrm{CFU} \times V_{\mathrm{sample}})$, where $N$ is the number of protein molecules per cell, $n_{\mathrm{western}}$ is the amount of CbAgo measured by western blotting (moles, determined from comparison with the RsAgo sample), $N_A$ is the Avogadro number, DF is the dilution factor, CFU is CFU/ml and $V_{\mathrm{sample}}$ is the volume (ml) of the sample used for western blotting.
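The equation above can be written as a small function. This is a minimal sketch; the function name and the input values are hypothetical and chosen only to illustrate the arithmetic, not taken from the measurements in this study.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_cell(n_western_mol, dilution_factor, cfu_per_ml, v_sample_ml):
    """N = (n_western * N_A) / (DF * CFU * V_sample), as defined in the text."""
    if min(dilution_factor, cfu_per_ml, v_sample_ml) <= 0:
        raise ValueError("DF, CFU and V_sample must all be positive")
    molecules = n_western_mol * AVOGADRO
    cells = dilution_factor * cfu_per_ml * v_sample_ml
    return molecules / cells


# Hypothetical inputs: 50 fmol of protein detected in 0.01 ml of an
# undiluted culture (DF = 1) containing 3e8 CFU/ml.
example = molecules_per_cell(
    n_western_mol=5e-14, dilution_factor=1.0, cfu_per_ml=3e8, v_sample_ml=0.01
)
print(round(example))
```

With these illustrative inputs the estimate is on the order of 10^4 molecules per cell.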
To test the effects of CbAgo on cell growth, E. coli strains were inoculated directly from frozen aliquots into 1 ml of fresh LB medium supplemented with Ara (0.01%) and Amp in the case of MG1655 transformants, or Ara (0.01%) and anhydrotetracycline (200 ng/ml) in the case of DE160, DE159 and DL2917 strains. The cultures were grown in 24-well plates at 300 rpm at 30 °C in a CLARIOstar microplate reader and cell density was monitored by measuring OD600 every 30 min. Two independent biological replicates with three technical replicates were performed for each strain.
Purification of CbAgo-associated smDNAs and preparation of smDNA libraries
To purify CbAgo-associated smDNAs, cell pellets were defrosted at room temperature and resuspended in 30 ml of lysis buffer (40 mM Tris–HCl pH 7.9, 150 mM NaCl) in the presence of cOmplete protease inhibitor cocktail EDTA-free (Roche). After filtration using 170 µm nylon filters, the samples were lysed using a high-pressure homogenizer (Cell Disruptor PEC) at 30 kpsi twice and then centrifuged (17 000 rpm, R21A Hitachi rotor, twice for 15 min at 4 °C). Clarified lysate was incubated with 0.3 ml of prewashed Co2+ beads (Clontech) for 140 min at +4 °C with rotation. The suspension was centrifuged for 3 min at 500 g at +4 °C, the supernatant was removed and the beads were washed twice with 25 ml of ice-cold lysis buffer for 5 min and 4 times with 1 ml of lysis buffer containing 10 mM imidazole. CbAgo was eluted 3 times with 0.33 ml of lysis buffer containing 200 mM imidazole. The elution fractions were treated with Proteinase K (100 µg, 1 h at 37 °C) and used for further smDNA purification.
Deproteinized samples were treated with 0.5 ml of phenol-chloroform-isoamyl alcohol (25:24:1) pH 7.5–8, vortexed for 10 s and centrifuged for 2 min at 21 000 g. The upper aqueous phase was treated twice with 0.5 ml of chloroform. Nucleic acids from the final aqueous phase were ethanol-precipitated in the presence of PINK coprecipitant and 30 mM sodium acetate pH 5.0 for 1 h in liquid nitrogen or overnight at −70 °C. The samples were centrifuged for 30 min, 21 000 g at +4 °C and the pellets were washed with ice-cold 70% ethanol. Air-dried pellets were dissolved in nuclease-free water (40 µl per sample). An aliquot of each sample (10%) was treated with rSAP (NEB) in 1× T4 polynucleotide kinase (PNK) buffer for 30 min at 37 °C; rSAP was inactivated for 10 min at 75 °C. Dephosphorylated samples were radiolabeled with γ-P32-ATP using T4 PNK (NEB); an oligonucleotide ladder (12–70 nt) was also radiolabeled at this point. The labeled samples were mixed with the rest of the corresponding untreated samples (20 µl), resolved by 19% PAGE with 8 M urea in 1× TBE, and visualized using a Typhoon FLA 9500 scanner (GE Healthcare). Gel slices containing ∼14–25 nt smDNAs were cut from the gel, crushed in 0.4 M NaCl and incubated overnight at 20 °C with constant shaking (1000 rpm) on a bench tube shaker. Gel slices were removed and nucleic acids were ethanol-precipitated as described above. DNA was dissolved in 20 µl of nuclease-free water. SmDNAs were ligated by a bridged-ligation approach as described previously (21), using Illumina-compatible adaptors:
5′-adaptor - 5′-GTTCAGAGTTCTACAGTCCGACGATC;
3′-linker - /5′P/TGGAATTCTCGGGTGCCAAGGAACTC/3′ddC/
bridge 1 - /5′AmMC6/CACCCGAGAATTCCANNNNNN/3′AmMO/
bridge 2 - /5′AmMC6/NNNNNNGATCGTCGGACTGTA/3′AmMO/.
The samples (20 µl each) were mixed with 8 µl of 5× Rapid Ligation buffer (ThermoFisher), 2 µl of 100 µM 5′-adaptor, 2 µl of 100 µM 3′-linker, 2 µl of 100 µM bridge 1, 2 µl of 100 µM bridge 2 and 800 units of T4 ligase (NEB) and incubated for 16 h at room temperature. Ligated DNA fragments were separated by 19% urea PAGE and eluted from the gel as described above. The libraries were PCR-amplified with RP1 and indexing primers (TruSeq) using the NEBNext Ultra II Q5 Master Mix; the number of cycles was adjusted based on preliminary analysis of PCR products in 6% native PAGE with SYBR Gold staining. Amplified libraries were separated by native 6% PAGE using visualization in blue light, extracted in 0.4 M NaCl and precipitated as described above. Small DNA libraries were sequenced using the HiSeq2500 platform (Illumina) in the rapid run mode (50-nucleotide single-end reads).
Small DNA analysis
All libraries were quality checked with FastQC (v0.11.9). Trimmomatic (v0.36) was used to remove adapters, eliminate reads shorter than 14 bp and cut reads longer than 24 bp. Reads were aligned onto the reference genome of E. coli (MG1655 Refseq: NC_000913.3; BW25113 Refseq: CP009273.1 with manually added CbAgo gene (802798–810180) for DE160-based strains) and plasmid (pBAD_CbAgo_lacI, pBAD_CbAgo_lacI_recoded or pBAD_yffP1000_CS_I-SceI) via bowtie (v1.3.1) allowing zero mismatches and unique alignment (-v 0 -m 1). Potential multi-mappers that failed to align with the -m option were realigned using options: -a --best --strata -v 0 -m 10000. Multi-mappers that were aligned to both genome and plasmid sequences (including the lacI, araC or yffP genes) were not included in further analysis of genomic DNA coverage to avoid biases in coverage resulting from the presence of multiple copies of plasmid DNA. The remaining reads were realigned with the same parameters (bowtie -a --best --strata -v 0 -m 10000). Uniquely aligned reads and selected multi-mappers were sorted and combined via samtools (v1.15). The number of reads for each multi-mapper was divided by the number of aligned sites. The E. coli genome was divided into 1 kb intervals using a custom Python script. To obtain read coverage within each interval, bedtools (v2.30.0) was used to intersect the resulting bed file with the bam file. Small DNA coverage was expressed as RPKM (reads per kilobase per million aligned reads in the library).
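The binning and normalization described here can be sketched in plain Python. This is an illustration only, not the custom script used in the study: the function name, the input format (one `(position, n_sites)` pair per reported alignment) and the toy data are assumptions. Each alignment of a multi-mapper contributes 1/n_sites, so every read counts once in total.

```python
def bin_rpkm(alignments, genome_length, bin_size=1000, library_size=None):
    """Weighted read coverage per fixed-size bin, expressed as RPKM.

    alignments   -- iterable of (position, n_sites); position is 0-based and
                    n_sites is the number of sites the read aligned to
    library_size -- total aligned reads; defaults to the weighted sum of
                    the alignments passed in (an assumption of this sketch)
    """
    n_bins = -(-genome_length // bin_size)  # ceiling division
    weighted = [0.0] * n_bins
    for position, n_sites in alignments:
        weighted[position // bin_size] += 1.0 / n_sites
    total = sum(weighted) if library_size is None else library_size
    if total <= 0:
        return [0.0] * n_bins
    rpkm = []
    for i, reads in enumerate(weighted):
        # The last bin may be shorter than bin_size.
        bin_len = min(bin_size, genome_length - i * bin_size)
        rpkm.append(reads / (bin_len / 1000.0) / (total / 1e6))
    return rpkm


# Toy data: two unique reads in bin 0 and one read mapping to two sites
# (bins 1 and 2), on a 3 kb "genome".
toy = [(10, 1), (20, 1), (1500, 2), (2500, 2)]
coverage = bin_rpkm(toy, genome_length=3000)
print(coverage)
```

In the toy example the two-site read contributes half a read to each of bins 1 and 2, so bin 0 ends up with four times their coverage.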
Reads from the plus and minus genomic strands were selected with samtools view -F/-f 16; the coverage was calculated separately for each strand as described above. To find Chi sites at a given interval of the genome, a Python script was used that searches for the Chi sequences (5′-GCTGGTGG-3′ for the plus strand and 5′-CCACCAGC-3′ for the minus strand) within given coordinates. To calculate the fraction of smDNAs generated around the target genes in the chromosome (lacI, araC or yffP), smDNAs mapped to the chromosomal region between the second (or third) closest Chi sites around the locus of interest were divided by the total number of chromosomal smDNAs in the library.
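A Chi-site search of the kind described above might look as follows. This is a hedged sketch rather than the script used in the study: the function names, the 0-based half-open coordinates and the toy sequence are assumptions made for illustration.

```python
CHI_PLUS = "GCTGGTGG"   # Chi motif read on the plus strand
CHI_MINUS = "CCACCAGC"  # its reverse complement (Chi on the minus strand)


def find_chi_sites(genome, start, end):
    """Return sorted (position, strand) pairs for Chi motifs lying fully
    within the 0-based half-open interval [start, end)."""
    hits = []
    for motif, strand in ((CHI_PLUS, "+"), (CHI_MINUS, "-")):
        pos = genome.find(motif, start)
        while pos != -1 and pos + len(motif) <= end:
            hits.append((pos, strand))
            pos = genome.find(motif, pos + 1)
    return sorted(hits)


def fraction_in_window(read_positions, left, right):
    """Fraction of reads whose position falls in [left, right), e.g. the
    region between two chosen Chi sites around a target locus."""
    if not read_positions:
        return 0.0
    inside = sum(1 for p in read_positions if left <= p < right)
    return inside / len(read_positions)


# Toy sequence with one Chi site on each strand.
toy_genome = "AAGCTGGTGGTTTCCACCAGCAA"
sites = find_chi_sites(toy_genome, 0, len(toy_genome))
print(sites)
```

The window boundaries passed to `fraction_in_window` would come from the Chi sites selected around the locus of interest; which sites to use (second or third closest) is a choice described in the text, not made by this sketch.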
Analysis of the expression of target loci during DNA interference
The same E. coli cultures that were used to obtain libraries of CbAgo-associated smDNAs were used to purify RNA by GeneJET RNA Purification Kit (ThermoFisher). RNA was treated with RNase-free DNase (Qiagen). 2 µg of RNA was used in a reverse transcription reaction with RevertAid reverse transcriptase and random hexamer oligonucleotides (ThermoFisher). The resulting cDNA was used in
quantitative PCR (qPCR) reactions with oligonucleotides
specific for the genes of interest (
lacI
,
lacZ
and
araC
)
and the housekeeping gene
gapA
(see Supplementary Fig-
ure S7 for oligonucleotide sequences), using qPCRmix-
HS SYBR premix (Evrogen) in a C1000 Touch Cycler
with CFX96 Optical Reaction Module (Bio-Rad). Oligonu-
cleotides for qPCR (synthetized by Evrogen) were selected
by their specificity and efficiency, validated by Primer-
BLAST and Multiple Primer Analyzer (ThermoFisher),
PCR product melt curve and PCR efficiency evaluation,
performed in serial dilutions experiments. Oligonucleotides
were selected in such a way that they were active only with
genomic cDNA products and could not amplify products
obtained from the plasmid template, which was addition-
ally validated. For each reaction condition, three biolog-
ical replicates were performed, each with three technical
replicates. Each plate contained ‘no template’ and ‘no re-
verse transcriptase’ controls. The threshold Ct values were
calculated automatically by CFX manager, and the rela-
tive expression of genes of interest was defined by the
Ct
method.
In vivo recombination assay
Chemically competent wild-type or recB-minus MG1655 were co-transformed with two plasmids, the editing plasmid pDE351 and variants of the expression plasmid (pBAD_CbAgo, pBAD_CbAgo_lacI, pBAD_dDD_CbAgo_lacI, pBAD_dYK_CbAgo_lacI or pBAD_lacI, Supplementary Table S4). Some double transformants were obtained via two-step transformation (first with pBAD_CbAgo and second with pDE351). Co-transformed cultures were plated on LB agar plates supplemented with Amp, Sp, Cm and Glc (to prevent premature CbAgo expression). Individual colonies were inoculated in 6 ml of LB containing Amp, Cm, Sp and Glc and grown overnight. 1 ml aliquots of cell cultures were mixed with 1 ml of 50% sterile glycerol and 100–200 µl aliquots were frozen in liquid nitrogen and stored at −70 °C to use as starting cultures in further experiments. The remaining 5 ml were used for miniprep plasmid DNA purification using Zymo Research D4020 kit. To confirm the presence of both plasmids, the samples were treated with specific restriction endonucleases Bsp1407I (linearizes pDE351) and BamHI (linearizes pBAD_CbAgo and its variants) and the products were analyzed by agarose gel electrophoresis.
For recombination experiments, the starting cultures from −70 °C were defrosted at room temperature and 5 µl were inoculated into 1 ml LB (in 25 ml tubes) supplemented with Ara, Amp and Cm, or Glc, Amp and Cm in control experiments, and grown at 30 °C for 24 h. Amp and Cm were omitted in some experiments, when indicated. Aliquots (10 µl) of cell cultures were washed with 0.3 ml LB + Glc to remove Ara and prevent further CbAgo expression, inoculated into 0.5 ml LB + Glc in a 1.5 ml safe-lock tube and incubated overnight at 43 °C to induce loss of the pDE351 plasmid. This procedure was repeated 6 times in total (∼12 h each), using 1 µl of cell cultures from the previous passage. Cells from the sixth (last) passage were used for CFU counting on LB, LB + Cm, LB + Sp and LB + Cm + Sp plates. Eight serial 10-fold dilutions were prepared in LB media for each culture in 96-well plates and then 10 µl of each dilution was placed on all plates. The plates were air-dried and incubated overnight at 37 °C, followed by colony counting. Statistical significance of the observed changes in the frequencies of antibiotic-resistant cells was calculated using the Mann–Whitney U test.
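Converting colony counts from spotted serial dilutions into CFU/ml and into a frequency of antibiotic-resistant cells can be sketched as below. The function names and the colony counts are hypothetical; only the 10-fold dilution series and the 10 µl spot volume come from the protocol above.

```python
def cfu_per_ml(colonies, dilution_step, spot_volume_ml=0.01, fold=10):
    """CFU/ml of the undiluted culture from a colony count in one spot.

    dilution_step -- number of serial `fold`-fold dilutions applied before
                     spotting (0 means the undiluted culture)
    """
    if spot_volume_ml <= 0:
        raise ValueError("spot volume must be positive")
    return colonies * fold ** dilution_step / spot_volume_ml


def resistant_fraction(cfu_selective, cfu_total):
    """Frequency of resistant cells: CFU on the selective plate divided by
    CFU on the non-selective LB plate."""
    if cfu_total <= 0:
        raise ValueError("total CFU must be positive")
    return cfu_selective / cfu_total


# Hypothetical counts: 12 colonies at the 5th dilution on LB,
# 3 colonies at the 2nd dilution on LB + Cm.
total = cfu_per_ml(12, 5)
cm_resistant = cfu_per_ml(3, 2)
print(total, cm_resistant, resistant_fraction(cm_resistant, total))
```

Frequencies obtained this way for independent cultures are the kind of values that would then be compared between conditions with a rank-based test such as Mann–Whitney U.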
Analysis of individual clones and Southern blot hybridization
For analysis of individual antibiotic-resistant clones, recombination experiments were performed in the same way with pBAD_CbAgo_lacI, pBAD_dDD_CbAgo_lacI or pBAD_lacI plasmids and the pDE351 plasmid was cured for six passages at 43 °C. Three biological replicates were performed in each case. For counting of the proportion of SpR clones, independent CmR clones were isolated from these cultures by spreading them on LB plates with Cm and Glc, followed by testing of single colonies on plates containing Sp and Glc.
For Southern blot hybridization, total bacterial cultures or individual CmR SpS clones obtained in the experiment with the pBAD_CbAgo_lacI plasmid were grown overnight at 37 °C in LB containing Cm and Glc and total DNA (chromosomal plus plasmid) was purified using the GenElute™ Bacterial Genomic DNA kit (Sigma-Aldrich). Control experiments demonstrated that plasmid DNA is fully recovered with this kit. Purified DNA (2 µg) was digested with Bsp19I and HindIII (Sibenzyme) overnight and then mixed with 5× loading buffer (8 M urea, 1% NP-40, 1 mM Tris pH 8.0, 10% NEB purple loading dye no SDS) and incubated at 98 °C for 2 min. The samples (0.5 µg of DNA) were immediately loaded onto a 1× TAE 1 M urea 1% agarose gel and separated for 4 h at 80 V. The gel was incubated in hybridization buffer (0.4 M NaOH, 1 M NaCl) for 1 h and DNA was transferred to a nylon membrane (Hybond N+) in a sandwich assembly containing the agarose gel, the membrane, napkins, and BF2 and BF3 filter paper in the hybridization buffer. The transfer was carried out under 1 kg pressure for 20 h with 2–3 changes of napkins. The membrane was washed in 2× SSC for 5 min twice until almost neutral pH, allowed to air-dry for 10 min and UV-crosslinked with 0.124 J using a Biometra BLX-254 transilluminator. The crosslinked membrane was placed in a roller bottle and incubated in Thermofisher prehybridization ultrasensitive buffer AM8670 (10 ml) at 42 °C for 1 h with rotation. A P32-labeled probe corresponding to the cat gene was prepared by PCR with primers labeled with γ-P32 ATP and PNK. The probe was denatured for 2 min at 98 °C, placed on ice for 2 min and added into the roller bottle. Hybridization was performed at 44 °C for 20 h with rotation. The membrane was washed in 0.1× SSC + 0.1% SDS for 5 min twice, dried, placed in a P32-sensitive screen and analyzed by phosphorimaging after 3 days.
Genomic DNA sequencing
Total DNA (chromosomal plus plasmid) was purified from the total bacterial culture or two individual CmR SpS clones obtained in the experiment with the pBAD_CbAgo_lacI plasmid as described above. Purified DNA (500 ng) was treated with the NEBNext® Ultra™ II FS DNA Library Prep Kit for Illumina (E7805S) and barcoded using NEBNext Multiplex Oligos for Illumina® (Dual Index Primers Set 1, E7600). DNA libraries were sequenced using the HiSeq2500 platform (Illumina) in the rapid run mode (50-nucleotide single-end reads). The resulting DNA reads were mapped to the chromosomal and plasmid sequences (the MG1655 genome containing the cat gene instead of lacI, pBAD_CbAgo_lacI and pDE351 plasmids) as described above for smDNA libraries, allowing up to 2 mismatches with the reference sequences. Multi-mappers corresponding to more than one replicon (genome, pBAD, pDE351) were removed from the analysis. The resulting assemblies were manually inspected for the presence of mismatches with the reference genomic sequence in the region of homology with the editing plasmid.
RESULTS
Genome targeting by CbAgo depends on homology between plasmid and chromosomal DNA
Previous experiments demonstrated that during its expression in E. coli, CbAgo is preferentially loaded with guide smDNAs from plasmids and can use them to introduce double-strand breaks (DSBs) in homologous genomic loci (21). Here, to determine the homology requirements for DNA interference, we analyzed the dependence of chromosomal DNA targeting by CbAgo on the length of the homology region in plasmid DNA and on the presence of mismatches in this region. We performed experiments in E. coli strains transformed with a series of expression plasmids encoding CbAgo and containing chromosomal genes that could be used for generation of guide smDNAs.
To determine the minimal length of the homology region required to induce DNA interference between plasmid and genomic DNA, we cloned fragments of the chromosomal lacI gene into pBAD-based vectors used for expression of CbAgo (full-length 1083 bp lacI or its 50, 100, 200, 300, 450, 600 bp 5′-fragments). In addition, the plasmids contained the araC gene (876 bp) also present in the chromosome. We transformed E. coli with these plasmids, induced expression of CbAgo by Ara in the presence of a selective antibiotic (Amp, to maintain the plasmid), purified the protein, and isolated and sequenced associated smDNAs (Figure 1A, Supplementary Information Figure S1).
Analysis of the level of CbAgo expression by Western blotting with parallel measurements of the number of live bacterial cells in the culture demonstrated that at least 20 000 molecules of CbAgo were present in each cell under these conditions (Supplementary Figure S1A, B). This level of expression is comparable to abundant housekeeping bacterial proteins (26,27). Nucleic acids bound to CbAgo were isolated by one-step purification of His6-tagged CbAgo using Co2+-affinity resin (Supplementary Figure S1B). Analysis of purified nucleic acids by gel-electrophoresis demonstrated that CbAgo was primarily associated with ∼15–20 nt smDNAs, in agreement with previous reports (Supplementary Figure S1C) (14,21). Accordingly, the majority of smDNA reads obtained after high-throughput sequencing were in the range of 15–20 nucleotides (Supplementary Figure S1D).
Figure 1. Targeting of plasmid and chromosomal DNA by CbAgo during DNA interference. (A) Scheme of the DNA interference assay. Plasmid encoding CbAgo, araC and fragments of the lacI gene of various lengths is used by CbAgo as a source of smDNAs to attack homologous genomic regions, followed by their processing and loading of new smDNAs into CbAgo. (B) Distribution of CbAgo-associated smDNAs over the E. coli chromosome in the case of the plasmid containing full-length lacI. The amounts of smDNAs are shown in reads per kilobase of genomic DNA per million aligned reads in the library (RPKM); reads from the plus and minus genomic strands are shown in green and pink, respectively. Positions of the ori, terA and terC sites, araC, lacI genes and ribosomal RNA operons (R) are indicated.
Mapping of smDNAs to the plasmid and genome sequences (excluding the regions of homology) showed that CbAgo has a preference for smDNAs of plasmid origin, as previously reported (21). After accounting for the plasmid copy number, ∼3.5–5-fold enrichment of plasmid-derived smDNAs is observed in all smDNA libraries (Supplementary Table S1). SmDNAs are produced from both plasmid strands and are evenly distributed along the whole plasmid sequence (Supplementary Figure S2). This potentially allows targeting of specific chromosomal regions by CbAgo loaded with guide smDNAs generated from homologous plasmid-encoded sequences.
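One plausible way to express enrichment "after accounting for the plasmid copy number" is to compare smDNA density per base pair of plasmid DNA present in the cell with smDNA density per base pair of chromosome. The sketch below assumes this normalization; the exact calculation behind Supplementary Table S1 is not given in this section, and the function name and all input values are hypothetical.

```python
def plasmid_enrichment(plasmid_reads, chromosome_reads,
                       plasmid_len, chromosome_len, copy_number):
    """Ratio of smDNA density on plasmid DNA to that on chromosomal DNA.

    Assumed normalization: plasmid reads are divided by the total plasmid
    DNA per cell (length x copy number), chromosomal reads by one
    chromosome length.
    """
    if min(plasmid_len, chromosome_len, copy_number) <= 0:
        raise ValueError("lengths and copy number must be positive")
    if chromosome_reads <= 0:
        raise ValueError("chromosomal read count must be positive")
    plasmid_density = plasmid_reads / (plasmid_len * copy_number)
    chromosome_density = chromosome_reads / chromosome_len
    return plasmid_density / chromosome_density


# Hypothetical library: 2000 plasmid reads and 46 000 chromosomal reads,
# a 5 kb plasmid at 10 copies per cell and a 4.6 Mb chromosome.
ratio = plasmid_enrichment(2000, 46000, 5000, 4_600_000, 10)
print(ratio)
```

A ratio above 1 under this normalization would mean that plasmid DNA contributes more smDNAs per base pair than the chromosome does.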
Analysis of the chromosomal distribution of smDNAs revealed their enrichment at the sites of replication termination, terA and terC (Figure 1B, Supplementary Figure S3). Similarly, SeAgo and TtAgo were shown to target the ter region in Synechococcus elongatus and Thermus thermophilus (17,28). Our previous analysis of CbAgo demonstrated that biogenesis of smDNAs in the ter region depends on processing of double-strand DNA ends formed in this region during replication by the helicase-nuclease RecBCD (21). In addition, smaller peaks of smDNAs were observed around the seven rRNA operons in the E. coli genome (Figure 1B, Supplementary Figure S3), likely as a result of targeting of multicopy sequences by CbAgo described previously (21).
Figure 2. Homology requirements for genomic DNA cleavage by CbAgo. (A) Cooperation between CbAgo and RecBCD in processing of chromosomal DSBs during DNA interference (21). Guide-loaded CbAgo makes a DSB in the target locus, which is further processed by RecBCD, CbAgo and possibly other nucleases until the closest Chi sites, resulting in biogenesis of additional smDNAs loaded into CbAgo. (B) Analysis of CbAgo-associated smDNAs corresponding to the target genes in E. coli strains (MG1655) containing plasmids with fragments of the lacI gene. Strand-specific distribution of smDNAs around the chromosomal araC (left) and lacI (right) genes (indicated with black dashes) is shown for ‘no lacI’, 0.45 kb lacI and full-length lacI libraries. Reads from the plus and minus genomic strands are shown in green and pink, respectively. The distribution of Chi sites around the target genes is shown at the bottom (rightward sites, green; leftward sites, pink). The closest Chi sites surrounding the target genes in the proper orientation are indicated with dotted lines (forward for the plus strand and reverse for the minus strand). (C) Enlarged views of the lacI locus for the same smDNA libraries (see Figure S4B for all libraries). (D) The fraction of smDNAs corresponding to the araC (carmine) and lacI (turquoise) loci depending on the length of the plasmid lacI fragment. The number of smDNA reads was calculated for regions between four innermost Chi sites around araC or lacI (indicated with gray rectangles in panel B) and divided by the total number of smDNAs mapped to the E. coli chromosome.
The highest smDNA peaks outside of the
ter
region were
observed around the chromosomal copies of the
lacI
and
araC
genes, in the case of the expression plasmid that con-
tained full-length
lacI
and
araC
(Figures 1B and 2B, and Supplementary Figure S3). Note that the target genes themselves were excluded from smDNA mapping because of
ambiguity in attributing smDNA reads to their plasmid
or chromosomal copies; this resulted in an apparent de-
crease in the number of reads inside these genes. The ge-
nomic smDNA peaks around
lacI
and
araC
were much
wider than the target genes, extending for dozens or even hundreds of kilobases on one or both sides (Figure
2
B, Supple-
mentary Figure S4). The distribution of smDNAs around
the target loci was asymmetric, with most smDNAs generated from the 3′-terminated DNA strands facing the target
gene (i.e. top ‘green’ strand from the left of the gene and
bottom ‘pink’ strand from the right of the gene). Further-
more, sharp drops in the amounts of smDNAs were ob-
served on each side of the peaks at several (one to three)
closest Chi sites (the 5′-GCTGGTGG-3′ motifs) oriented
toward the target gene (Figure
2
B, C and Supplementary
Figure S4). This is a molecular signature of DNA pro-
cessing by RecBCD, which unwinds DNA ends formed at
DSBs and performs asymmetric processing of the two DNA
strands, either itself or in cooperation with other nucleases,
until the recognition of Chi sites (Figure
2
A) (
22
,
23
). As
we have previously shown, the catalytic activity of CbAgo
is required for targeting of homologous chromosomal re-
gions (
21
), suggesting that CbAgo performs initial cleavage
of the target genes,
lacI
and
araC
, using plasmid-derived
guides corresponding to both DNA strands and thus pro-
ducing DSBs in these regions (Figure
2
A). The observed
asymmetry in the distribution of smDNAs between the two
DNA strands and its dependence on Chi sites suggests that
Figure 3.
Engineered DSB in plasmid enhances DNA interference with the chromosome. (
A
) Scheme of the DNA interference assay. The experiments
were performed in
E. coli
strains (DE160, BW27784 background) expressing I-SceI and CbAgo from the chromosome and containing plasmids with a
single genome homology region, the
yffP
locus (300 or 1000 bp), with or without an adjacent I-SceI site. (
B
) Analysis of chromosomal distribution of
CbAgo-associated smDNAs in the case of a plasmid containing the I-SceI site and a 1000 bp
yffP
homology region. Positions of the
yffP
locus (2.56
Mb),
terA
and
terC
sites (T), I-SceI (S), and CbAgo (A) genes are shown. Positions of smDNA peaks resulting from genomic DNA cleavage by I-SceI are
indicated with lilac arrowheads; orientation of I-SceI-like motifs is shown with black arrowheads below the plot (see Fig. S6C for genomic coordinates).
(
C
) Strand-specific distribution of smDNAs around the
yffP
gene (indicated with a black dash) for
E. coli
strains containing plasmids without the I-SceI
site with 300 bp
yffP
(top), with the I-SceI site with 300 bp
yffP
(middle), and with the I-SceI site with 1000 bp
yffP
(bottom). Reads from the plus and
minus genomic strands are shown in green and pink, respectively. The distribution of Chi sites (χ) around the target genes is shown at the bottom (forward
sites, green; reverse sites, pink). The closest Chi sites surrounding the target genes in the proper orientation (forward for the plus strand and reverse for
the minus strand) are indicated with dotted lines. The fraction of smDNA reads produced from the
yffP
locus was calculated by dividing the number of
smDNAs mapped between four innermost Chi sites around
yffP
(indicated with gray rectangles) by the total number of smDNAs mapped to the
E. coli
chromosome. See Fig. S6B for enlarged views of the
yffP
locus.
further processing of chromosomal DNA in the target re-
gions is performed by RecBCD, possibly in cooperation
with CbAgo and other nucleases (Figures 1A and 2A).
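The Chi-site annotation used in these plots can be illustrated with a minimal Python sketch. It only locates the 5′-GCTGGTGG-3′ motif quoted above on both strands of a sequence; the function names and the toy sequence are hypothetical and are not taken from the authors' analysis pipeline. Deciding which orientation counts as facing the target gene is left to the caller.

```python
CHI = "GCTGGTGG"  # Chi motif, 5'->3', as quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_chi_sites(genome, motif=CHI):
    """List (start, strand) for each motif occurrence.

    start is 0-based on the plus strand; strand is "+" when the plus
    strand reads 5'-motif-3' and "-" when the minus strand does.
    """
    sites = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = genome.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = genome.find(pattern, start + 1)  # allow overlapping hits
    return sorted(sites)


# Toy sequence (hypothetical): one Chi motif on each strand.
toy = "AAA" + CHI + "TTTT" + reverse_complement(CHI) + "AAA"
chi_sites = find_chi_sites(toy)
print(chi_sites)
```

On a real genome the same scan would be run over the chromosome sequence, and the sites nearest to a target gene on each side would then be selected by strand.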
The size of the
araC
peak was fairly constant in all ana-
lyzed strains (Figure
2
B, D and Supplementary Figure S4).
In contrast, the
lacI
peak was absent in the case of plasmids
with
lacI
fragments
<
300 bp or lacking
lacI
, was barely visi-
ble with a 300 bp
lacI
fragment, and gradually increased
in the case of plasmids with 450 bp, 600 bp and full-length
lacI
(Figure
2
B–D and Supplementary Figure S4). This indicates that a homology region of >300 bp is required to induce
genomic DNA cleavage during DNA interference, and
that the efficiency of DNA interference directly correlates
with the fragment length.
To test how genomic DNA targeting by CbAgo depends
on homology between the plasmid and chromosomal genes,
we recoded the plasmid
lacI
gene by introducing single-
nucleotide substitutions along its whole sequence (one sub-
stitution per 5.5 nucleotides on average, 198 substitutions
in total) (Supplementary Figure S5). No smDNA peak was
observed around genomic
lacI
in the
E. coli
strain
containing this plasmid, indicating that a high level of homology
between the plasmid and target genes is required to induce
DNA interference (Figure
2
B, D, Supplementary Fig-
ure S4).
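To make the recoding density concrete, the following Python sketch compares two aligned sequences of equal length and reports the number of substitutions, the fraction of identical positions and the longest uninterrupted identical stretch. The function name and the toy sequences are hypothetical; the actual recoded lacI sequence is given in Supplementary Figure S5 and is not reproduced here. The last lines are simple arithmetic derived from the figures quoted above (198 substitutions, one per 5.5 nucleotides on average), not values reported by the authors.

```python
def identity_stats(original, recoded):
    """Compare two equal-length sequences position by position.

    Returns (substitutions, fraction_identical, longest_identical_run).
    """
    if len(original) != len(recoded):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must not be empty")
    substitutions = 0
    longest_run = 0
    current_run = 0
    for a, b in zip(original, recoded):
        if a == b:
            current_run += 1
            longest_run = max(longest_run, current_run)
        else:
            substitutions += 1
            current_run = 0
    identity = 1 - substitutions / len(original)
    return substitutions, identity, longest_run


# Toy example (hypothetical sequences, two substitutions).
subs, identity, longest = identity_stats("ACGTACGTACGT", "ACGAACGTACCT")
print(subs, round(identity, 3), longest)

# Arithmetic implied by the numbers in the text.
implied_length = 198 * 5.5          # about 1089 nt covered by the recoding
expected_identity = 1 - 1 / 5.5     # about 0.82 identical positions on average
print(implied_length, round(expected_identity, 3))
```

With roughly one substitution every 5 to 6 nucleotides, uninterrupted identical stretches are expected to be short on average, which is consistent with the loss of the smDNA peak described above.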
Engineered double-strand breaks stimulate plasmid DNA targeting by CbAgo
Processing of DSBs by RecBCD generates smDNAs bound
by CbAgo and may therefore enhance DNA interference
(
21
). To test whether plasmid targeting by CbAgo can be
stimulated by DSB formation, we introduced the recogni-
tion site of the I-SceI meganuclease in pBAD-based plas-
mids containing either 300 bp or 1000 bp fragments of
the
E. coli yffP
operon, without any other regions of ho-
mology to the chromosome (Figure
3
A). To avoid com-
plete digestion of plasmid DNA and its loss from the bac-
terial population, we used a mutant version of the I-SceI
site with a decreased cleavage efficiency and maintained
strains in the presence of the selective antibiotic (see Mate-
rials and Methods). CbAgo and I-SceI were expressed from
the chromosome (strain DE160, Supplementary Table S5)
to prevent changes in protein expression because of plasmid
degradation. The level of CbAgo expression in this strain
was several-fold lower than in the case of plasmid-encoded
CbAgo in the strains used in previous experiments, as confirmed
by Western blotting (Supplementary Figure S1A).
We isolated smDNAs bound to CbAgo in
E. coli
strains
containing these plasmids and analyzed their genomic dis-
tribution.
Genomic smDNA profiles obtained in these experiments
were highly similar for the different plasmid variants,
except for the
yffP
gene area (Figure
3
B, C and Supple-
mentary Figure S6A). In the case of the plasmid with 300 bp
yffP
in the absence of the I-SceI cut site, only a small peak
of smDNAs was visible around
yffP
in chromosomal DNA
(Figure
3
C, top), in agreement with the results obtained for
the 300 bp fragment of
lacI
(see above). In the presence of
the I-SceI site, the size of the peak was notably increased,
indicating that a DSB in plasmid DNA enhances DNA interference (Figure
3
C, middle). Furthermore, the peak was
dramatically expanded in the presence of the I-SceI site and
a 1000 bp fragment of the
yffP
operon, with its size being
comparable with the peaks at
ter
sites (Figure 3B and C, bottom). The pattern of smDNA distribution at the
yffP
lo-
cus was similar to the peaks around the
lacI
and
araC
genes
discussed above, with asymmetric processing of the two
DNA strands at the opposite sides of
yffP
, which was simi-
larly dependent on Chi sites (Figure
3
C and Supplementary
Figure S6B). The fraction of smDNA reads mapped to the
yffP
locus between several closest Chi sites in these three
strains was 0.0105, 0.0229 and 0.0997, respectively (Fig-
ure
3
C). The increased cleavage of chromosomal DNA in
the presence of the plasmid I-SceI site correlated with in-
creased loading of plasmid-derived smDNAs into CbAgo
(16.7- and 16.9-fold enrichment for the plasmids containing
the I-SceI site and 300 or 1000 bp fragments of
yffP
,
in comparison with 6.9-fold enrichment for the plasmid
lacking the I-SceI site) (Supplementary Table S1). These
results confirm that chromosomally encoded CbAgo can
induce DNA interference between plasmid and chromoso-
mal DNA (
21
), and show that engineered DSBs in plasmid
DNA stimulate generation of smDNAs bound by CbAgo,
thus increasing cleavage of the homologous chromosomal
locus.
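The fraction of reads assigned to a locus, as defined in the legend of Figure 3, can be sketched as follows. This is an illustrative Python fragment, not the authors' pipeline: the function name, the half-open window convention and the toy read positions are assumptions. The relative increases at the end are ratios computed here from the three fractions quoted above; they are not reported as such in the text.

```python
def fraction_in_window(read_positions, window_start, window_end):
    """Fraction of mapped reads whose position lies in [window_start, window_end)."""
    if not read_positions:
        return 0.0
    inside = sum(window_start <= pos < window_end for pos in read_positions)
    return inside / len(read_positions)


# Toy check with hypothetical read positions.
toy_fraction = fraction_in_window([5, 15, 25, 35], 10, 30)
print(toy_fraction)

# Fractions quoted in the text for the yffP locus.
reported = {
    "300 bp yffP, no I-SceI site": 0.0105,
    "300 bp yffP, I-SceI site": 0.0229,
    "1000 bp yffP, I-SceI site": 0.0997,
}
baseline = reported["300 bp yffP, no I-SceI site"]
fold_change = {name: round(value / baseline, 1) for name, value in reported.items()}
print(fold_change)
```

Under these numbers, adding the I-SceI site roughly doubles the yffP fraction for the 300 bp fragment, and combining it with the 1000 bp fragment raises it roughly ninefold to tenfold relative to the plasmid without the site.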
Analysis of CbAgo-associated smDNAs reveals off-target activity of I-SceI
Analysis of the chromosomal distribution of smDNAs as-
sociated with CbAgo in
E. coli
strains expressing I-SceI re-
vealed several smaller peaks of smDNAs in addition to the
major peaks at the
yffP
locus and
ter
sites (indicated with
lilac arrowheads in Figure
3
B). Similarly to gene-specific
smDNA peaks discussed in the previous sections, these
peaks had asymmetric distribution of smDNAs between the
two genomic strands, with more smDNAs produced from
the ‘top’ (green) strand at the left sides of the peaks and
from the ‘bottom’ (pink) strand at the right sides of the
peaks (Figure
4
B). Furthermore, the outer borders of the
peaks were defined by Chi sites oriented toward the cen-
ter of each peak (Figure
4
B), suggesting that these smD-
NAs were generated during RecBCD-dependent processing
of DSBs formed within these peaks. Indeed, careful analysis
of genomic DNA sequences in these sites revealed an I-SceI-
like motif in the middle of each peak, which contained four
substitutions in comparison with the wild-type sequence
(Figure
4
A). This motif was found within 10 copies of the
IS5 element spread in the
E. coli
genome (8 in the direct
orientation and 2 in the reverse orientation, indicated with
black arrowheads in Figure
3
B).
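A search for such degenerate motifs can be sketched in Python as a scan for windows that differ from a reference site by at most a given number of substitutions on either strand. Several points are assumptions made for illustration: the reference sequence below is the commonly cited canonical 18 bp I-SceI site, which is not quoted in this section; the positions of the four substitutions in the toy variant are arbitrary and do not correspond to the IS5 motif shown in Figure 4A; and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_like_motifs(genome, motif, max_mismatches):
    """Return (start, strand, mismatches) for windows close to the motif.

    start is 0-based on the plus strand; both orientations are checked.
    """
    patterns = (("+", motif), ("-", reverse_complement(motif)))
    hits = []
    for start in range(len(genome) - len(motif) + 1):
        window = genome[start:start + len(motif)]
        for strand, pattern in patterns:
            distance = hamming(window, pattern)
            if distance <= max_mismatches:
                hits.append((start, strand, distance))
    return hits


# Assumed canonical I-SceI site (not taken from this section).
I_SCEI_SITE = "TAGGGATAACAGGGTAAT"

# Toy variant with four arbitrary substitutions, embedded in filler sequence.
variant = list(I_SCEI_SITE)
for position, base in ((0, "C"), (5, "G"), (10, "T"), (15, "C")):
    variant[position] = base
toy_genome = "CCCCC" + "".join(variant) + "CCCCC"

hits = find_like_motifs(toy_genome, I_SCEI_SITE, max_mismatches=4)
print(hits)
```

A brute-force scan of this kind is linear in genome length for a fixed motif and is adequate for a bacterial chromosome, although a dedicated aligner would be preferable for routine use.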
Analysis of smDNA coverage around the I-SceI-like mo-
tifs shows that smDNAs are preferentially generated from
3′-terminated DNA strands, starting exactly from the sites
of I-SceI cleavage (Figure
4
B). This pattern of smDNA
distribution suggests that, despite the presence of several
substitutions in the recognition sequence, these sites can
be recognized and cleaved by the I-SceI meganuclease, fol-
lowed by their processing by RecBCD and other nucle-
ases. The efficiency of this cleavage and the abundance of
smDNAs at the I-SceI-like motifs are much lower than in
the case of
E. coli
strains with a wild-type chromosomal I-
SceI site or its mutant variant with just two substitutions
studied previously (
21
); however, this cleavage can still be detected by
CbAgo.
CbAgo does not affect transcription of target genes and cell
growth
Since CbAgo loaded with plasmid-derived guides targets
genomic DNA, it could be expected that it might affect
transcription of the chromosomal genes and delay cell
growth, due to the appearance of DSBs and ongoing DNA
repair. To check whether loading of CbAgo with gene-
specific smDNAs could affect expression of the target genes,
we measured their mRNA levels by quantitative reverse
transcription-PCR using genome-specific primers (Supple-
mentary Figure S7A). We compared
E. coli
strains contain-
ing pBAD plasmids with and without the CbAgo and
lacI
genes, grown under identical conditions in the presence of
the CbAgo inducer (arabinose). It was found that the simultaneous
presence of CbAgo and
lacI
in the plasmid did not
significantly change the mRNA levels for the
lacI, lacZ
and
araC
genes, measured in the same conditions that were used
for smDNA analysis (Supplementary Figure S7B). As ex-
pected, experiments with a catalytically inactive mutant of
CbAgo (dDD CbAgo) with alanine substitutions of two
catalytic aspartates also revealed no changes in the mRNA
levels of the target genes. Thus, targeting of the genomic
lacI
and
araC
loci by CbAgo does not affect the expression lev-
els of the target chromosomal regions, at least at this stage
of cell growth.
We further tested whether expression of CbAgo could
affect cell growth. It was found that
E. coli
MG1655
strains containing an empty pBAD plasmid, or pBAD with
CbAgo, or pBAD with CbAgo and
lacI
genes had similar
growth kinetics (Supplementary Figure S8A). This suggests
that CbAgo-induced DNA interference between the homol-
ogous plasmid and chromosomal genes (
lacI
and
araC
) does
not significantly affect the viability of
E. coli
, at least under
the conditions of our experiments. Similar results were obtained
previously with BL21-based strains with plasmid expression
of CbAgo (
21
). Similarly, the
E. coli
strain DE160
with chromosomal expression of CbAgo and I-SceI, used
in experiments with I-SceI, had normal growth kinetics
(Supplementary Figure S8B). This indicates that cleavage
of multiple I-SceI-like motifs in the chromosome is not
toxic, likely because they are cleaved with low efficiency
and can be successfully repaired. In contrast, cell growth
was severely affected when I-SceI was expressed in an
E.
coli
strain containing a wild-type I-SceI site in the chromosome
(strain DL2917). This effect was exacerbated when
I-SceI was expressed together with CbAgo (strain DE159)
(Supplementary Figure S8B), likely because CbAgo in-
creases chromosomal DNA processing at the sites of DSBs
(
21
). Together, these results suggest that either CbAgo-
induced DSBs are rapidly repaired without affecting tran-
scription and cell division, or that formation and process-
ing of DSBs may occur at an earlier stage of cell growth
(prior to CbAgo purification), or that DSBs may be formed
in only a fraction of all cells in the bacterial culture (see
Discussion).