Structural Basis for Eukaryotic Transcription-Coupled DNA
Repair Initiation
Jun Xu (1,†), Indrajit Lahiri (2,†), Wei Wang (1), Adam Wier (2), Michael A. Cianfrocco (2),
Jenny Chong (1), Alissa A. Hare (4), Peter B. Dervan (4), Frank DiMaio (5),
Andres E. Leschziner (2,3,*), and Dong Wang (1,2,*)
(1) Division of Pharmaceutical Sciences, Skaggs School of Pharmacy & Pharmaceutical Sciences,
University of California San Diego, La Jolla, CA 92093
(2) Department of Cellular & Molecular Medicine, School of Medicine, University of California San
Diego, La Jolla, CA 92093
(3) Section of Molecular Biology, Division of Biological Sciences, University of California San Diego,
La Jolla, CA 92093
(4) Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena,
CA 91125
(5) Department of Biochemistry, University of Washington, Seattle, WA 98195
Abstract
Eukaryotic transcription-coupled repair (TCR), or transcription-coupled nucleotide excision repair
(TC-NER), is an important and well-conserved sub-pathway of nucleotide excision repair (NER)
that preferentially removes DNA lesions from the template strand that block RNA polymerase II
(Pol II) translocation [1,2]. Cockayne syndrome group B protein in humans (CSB, or ERCC6), or its
yeast orthologs (Rad26 in Saccharomyces cerevisiae and Rhp26 in Schizosaccharomyces pombe),
is among the first proteins to be recruited to the lesion-arrested Pol II during initiation of
eukaryotic TCR [1,3–10]. Mutations in CSB are associated with Cockayne syndrome, an
autosomal-recessive neurologic disorder characterized by progeroid features, growth failure, and
photosensitivity [1]. The molecular mechanism of eukaryotic TCR initiation remains elusive, with
several long-standing questions unanswered: How do cells distinguish DNA lesion-arrested Pol II
from other forms of arrested Pol II? How does CSB interact with the arrested Pol II complex?
What is the role of CSB in TCR initiation? The lack of structures of CSB or the Pol II-CSB
complex has hindered our ability to answer those questions. Here we report the first structure of
the S. cerevisiae Pol II-Rad26 complex solved by cryo-electron microscopy (cryo-EM). The structure
reveals that Rad26 binds to the DNA upstream of Pol II, where it dramatically alters the path of
the DNA. Our structural and functional data suggest that the conserved Swi2/Snf2-family core
ATPase domain promotes forward movement of Pol II and elucidate key roles for Rad26/CSB in both
TCR and transcription elongation.
Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research,
subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
*Correspondence and requests for materials should be addressed to D.W. (dongwang@ucsd.edu) or A.E.L. (aleschziner@ucsd.edu).
†These authors contributed equally to this work.
Author Contributions
J.X. prepared the proteins with help from W.W. and J.C. and performed the biochemical analyses. A.A.H. and P.B.D. provided the Py-Im
chemical agent. I.L. collected the EM data with help from A.W. I.L. performed data processing and refinement with help from
M.A.C. I.L. and F.D. generated the atomic models with homology models generated by J.X., W.W. and D.W. D.W. and A.E.L. wrote
the manuscript with help from all lab members. D.W. and A.E.L. directed and supervised the research.
The authors declare no competing financial interests.
HHS Public Access. Author manuscript; available in PMC 2018 May 22.
Published in final edited form as: Nature. 2017 November 30; 551(7682): 653–657. doi:10.1038/nature24658.
Arrest of Pol II at bulky, helix-distorting lesions is proposed to be the initiating signal for
TCR [1,2,9,11–13]. However, a number of non-lesion obstacles can also lead to Pol II arrest [13,14].
How can cells distinguish among different forms of arrested Pol II and only commit those
that encountered genuine DNA lesions to TCR? Given that CSB is among the first proteins
to be recruited to arrested Pol II (Fig. 1a), we tested the ability of its S. cerevisiae ortholog
(Rad26) (Fig. 1b and Extended Data Fig. 1) to discriminate among Pol II stalls/arrests in
three representative scenarios: a non-damaged DNA containing an intrinsic pausing/arrest
sequence (A-tract) [15], a non-covalent DNA binder (a pyrrole-imidazole (Py-Im) polyamide) [14],
or a bulky covalent DNA lesion (a cyclobutane pyrimidine dimer, CPD), a genuine TCR
substrate [16,17] (Fig. 1c-e). Pol II alone stalled at all three translocation barriers. Rad26
helped Pol II bypass the A-tract and Py-Im translocation barriers, but not the CPD lesion,
consistent with a previous observation with human CSB [16]. An ATPase-dead Rad26 mutant
(K328R) failed to rescue Pol II. In contrast to Rad26, the transcription factor TFIIS failed to
rescue Pol II arrested by either Py-Im or a CPD lesion.
Using single-particle cryo-EM, we determined a 5.8 Å structure of an S. cerevisiae Pol II EC-Rad26
complex (EC, elongation complex; “Pol II-Rad26” hereafter), as well as a 6.4 Å structure of a
Pol II EC from the same sample, and built pseudo-atomic models using Rosetta (see Methods, Fig. 2
and Extended Data Figs. 2 and 3). The structure of Pol II-Rad26 revealed that Rad26 binds to the
DNA upstream of Pol II EC (Figs. 2c, 2e, 3a, and 3c) and sits between Pol II’s clamp (Rpb2
side) and stalk (Rpb4/7) regions [18], in agreement with our DNase I footprinting assay (Fig.
3d). Most Pol II subunits adopt similar conformations in both complexes, with some local
changes at the interfaces between Pol II and Rad26 (Extended Data Fig. 3b-d).
We observed several unique structural features in the Pol II-Rad26 complex. Most dramatic
is the ~80-degree bending of the upstream DNA duplex region in the presence of Rad26
(Fig. 3b), which has not been reported for any structure involving Pol II [18,19]. In order to
establish that the bending is a consequence of Rad26 binding and not of lesion-induced
bending subsequently stabilized by Rad26, we solved the structure of a Pol II EC with a
CPD lesion in the downstream DNA (see Methods) (Extended Data Fig. 4). In this structure,
lacking Rad26, the upstream DNA is not visible in the cryo-EM map, as was the case for the
Pol II EC with undamaged scaffold, indicating that Rad26 is responsible for bending the
upstream DNA. The Pol II-Rad26 map showed continuous density for a full transcription
bubble (Fig. 3a), in contrast to Pol II EC, where a significant portion of the non-template
strand (NTS) is disordered (Figs. 2c,d and 3b), as previously reported for other Pol II EC
structures [19,20]. This suggests that binding of Rad26 to Pol II EC may restrain the bubble’s
flexibility. The dramatic alteration of the potential interaction landscape of Pol II by Rad26
may facilitate the recruitment of downstream repair factors that assemble at the DNA lesion
site [1].
The two RecA-like lobes of Rad26, which fit well into the density (Fig. 2c,e), bind the
upstream DNA duplex region and upstream fork of the transcription bubble (Fig. 3c) and
adopt an active closed conformation (Figs. 2e and 3c) similar to that of Rad54 and the
recently published Snf2-nucleosome complex [21,22]. We did not detect density for the N- and
C-terminal regions of Rad26 or the Rad26-specific loop insertions (Extended Data Fig. 1),
indicating that these regions are either disordered or adopt multiple conformations and were
averaged out during reconstruction.
Another unique feature of our structure is the insertion of a Rad26 HD2-1 “wedge” between
the DNA strands at the fork of an extended upstream transcription bubble (Fig. 3c,e). This
region is highly conserved among CSB family members, but not in other members of the
SWI/SNF superfamily (Extended Data Fig. 5). The affinity of Rad26 for Pol II EC increases
when the upstream fork of the transcription bubble is AT-rich (Extended Data Fig. 6a-d),
suggesting that weaker base pairing facilitates the interaction of Rad26 with Pol II EC and
bubble opening. A prediction from these data would be that extending the upstream fork of
the transcription bubble should increase the affinity of Rad26 for Pol II EC. Indeed,
mismatch-driven upstream bubble opening significantly increased the binding of Rad26 to
Pol II EC (Fig. 3f).
Rad26 also binds directly to Pol II (Fig. 3g), consistent with its residual binding affinity for a
Pol II EC containing a mini-scaffold with no protruding dsDNA (Extended Data Fig. 6e-h).
The major interaction interface between Rad26 and Pol II involves lobe 2 of Rad26 and the
Rpb2 subunit in Pol II (Fig. 3g). We also observed density in our cryo-EM map extending
between lobe 1 of Rad26 and both the coiled-coil domain in Rpb1 and the Rpb4/7 stalk.
Although we could not build a full model within those densities due to their lower
resolution, these Pol II regions are common docking sites for several transcription factors,
including the initiation factor TFIIE and the elongation factor Spt4/5 [18,23]. Superposition of a
eukaryotic Pol II elongation complex containing Spt4 and Spt5 [23] on our Pol II-Rad26
structure revealed significant steric clashes between Spt4/Spt5 and both Rad26 and the
upstream DNA (Extended Data Fig. 7) [23]. This suggests an important functional interplay
between Rad26 and other transcription factors during transcription and TCR through direct
competition for binding to Pol II. This explains the early observation that Rad26 antagonizes
the repression of TCR by Spt5 and Spt4 [24,25]. Similarly, the overlap between the binding
sites of Rad26 and TFIIE accounts for the observation that Rad26 (CSB) is only required for
efficient TCR during elongation [1], but not initiation, when its binding site would be occupied
by TFIIE.
Rad26 belongs to the SWI2/SNF2 family of ATP-dependent 3′ to 5′ single-stranded DNA
translocases. To understand which DNA strand Rad26 tracks in the Pol II-Rad26 complex,
we superimposed the structure of the yeast Snf2 protein from a recent cryo-EM structure of
the Snf2-nucleosome complex (PDB ID 5X0Y) [22] on the Rad26 ATPase domain in our
structure (Fig. 4a,b and Extended Data Fig. 8). Although our alignment was driven
exclusively by the protein moieties, both the proteins and the regions of DNA to which they
bind superimpose well (Fig. 4b and Extended Data Fig. 8). Snf2 tracks a single strand of the
nucleosomal DNA in a 3′ to 5′ direction (Fig. 4b). Given its fixed position relative to the
histone octamer, Snf2 effectively “pulls” on the DNA. Analogously, Rad26 would pull the
template strand away from Pol II (Fig. 4c), promoting Pol II forward translocation and
resulting in its bypassing certain translocation barriers (Fig. 1c,d) [16,26]. The direction of
Rad26 translocation is also consistent with the reported strand annealing activity for CSB [27].
We tested this model directly by probing whether Rad26 could resolve Pol II backtracking in
an ATP-dependent manner (Fig. 5a,b). We used TFIIS-stimulated RNA cleavage by Pol II to
detect the presence of backtracking induced with a combination of a pausing sequence and
nucleotide removal. We observed a significant reduction in RNA cleavage products only in
the presence of wild-type Rad26 and dATP (Fig. 5c,d). Together, our data suggest that
Rad26 can promote the forward motion of Pol II in an ATP-dependent manner.
Several observations from our structure support the conservation of the Pol II-Rad26/CSB
complex from yeast to humans. Given the conservation in the core ATPase domain of Rad26
(Extended Data Fig. 1) [3,4], we expect the structure of CSB’s core domain to be very similar.
The structural conservation between mammalian and yeast Pol II EC has been established [28].
Finally, most of the Rad26-DNA and Rad26-Pol II interaction interfaces we identified are
highly conserved between yeast and humans.
Although the differences between prokaryotic and eukaryotic TCR are well-documented [1],
our Pol II-Rad26 structure revealed some similarities between them: 1) Rad26 and the
prokaryotic transcription-repair coupling factor Mfd bind upstream of RNA polymerase and
facilitate its forward translocation to rescue transcriptional arrest. 2) Both Rad26 and Mfd
interact with the structurally conserved second-largest subunit of the polymerase (Rpb2 for
Pol II and beta subunit for RNAP), and bind essentially the same side of RNA polymerase [29].
Our data suggest a unified mechanistic model for Rad26/CSB’s role in both TCR and
transcription (Fig. 5e). Rad26/CSB recognizes a stalled Pol II and can reduce its dwell time
by preventing backtracking, promoting Pol II forward translocation on non-damaged
templates, and increasing the chances of transcriptional bypass through less bulky DNA
lesions [16,26], all of which stimulate transcription elongation (Fig. 5e) [16]. However, Rad26 fails
to promote efficient transcriptional bypass of bulky DNA lesions that lead to strong blockage
of translocation (such as CPD lesions) [17] (Fig. 1e). A comparison of our Pol II EC-Rad26
structure and a CPD-arrested Pol II EC showed a striking structural similarity in the core
10-subunit Pol II region and the active sites (Fig. 4d-f), suggesting that a CPD lesion would
likely have no effect on the interactions between Pol II and Rad26/CSB. In agreement with
this, the binding affinities between Rad26 and a Pol II EC carrying either a non-damaged or
a CPD lesion-containing scaffold are indistinguishable (Extended Data Fig. 6i,j). These
observations suggest that only the interaction between Rad26/CSB and a Pol II persistently
arrested at a bulky lesion would lead to initiation of TCR (Fig. 5e).
The structure of the Pol II-Rad26 complex also provides insights into the roles of CSB in
DNA lesion recognition and verification in eukaryotic TCR initiation. TCR and global
genome nucleotide excision repair (GG-NER) have been suggested to have similar tripartite
lesion recognition and verification steps, with the first DNA lesion recognition step in TCR
being mediated by blockage of Pol II transcription instead of XPC (Extended Data Fig. 9) [30].
However, the role of CSB in this lesion recognition step is not clear. We propose that CSB
(Rad26) plays a central role in the first DNA lesion recognition step (Fig. 5e) and presents
new protein interaction interfaces that could facilitate loading downstream repair factors [1],
such as UVSSA, CSA, XPG, and TFIIH in humans (Extended Data Fig. 9). TFIIH would
use its XPB and XPD helicases to induce backtracking in Pol II and verify the presence of a
DNA lesion on the template strand (step 2). The final step, XPA-dependent lesion
verification, is expected to be the same for both GG-NER and TCR (step 3) (Extended Data
Fig. 9) [30]. A full understanding of the mechanistic details of this model awaits further
investigation.
Experimental methods
Protein expression and purification
The coding sequence of Saccharomyces cerevisiae Rad26 was cloned from Saccharomyces
genomic DNA into a pGEX6p-1 based vector (GE Healthcare, USA). An N-terminal hexa-histidine
tag and E. coli trigger factor protein were added to the construct to facilitate
protein expression and purification. A PreScission protease recognition sequence was inserted
between the trigger factor and Rad26. Rad26 mutations were generated by PCR using the
full-length Rad26 sequence as template. All Rad26 constructs were confirmed by DNA
sequencing.
Recombinant Rad26 proteins were expressed in Escherichia coli strain Rosetta 2(DE3)
(Novagen, USA). Cells were transformed and grown in LB at 37 °C to an OD600 of 0.6, and
expression was induced by 0.1 mM IPTG for 16 h at 20 °C. The cells were lysed in buffer A
(20 mM Tris-HCl (pH 7.5), 500 mM NaCl, 5% glycerol, 1 mM 2-mercaptoethanol). After
centrifugation, the supernatant lysate was applied to a HisTrap HP column (GE Healthcare,
USA) equilibrated in buffer B (buffer A plus 10 mM imidazole). The column was washed
with 20 column volumes of buffer B and eluted with buffer A containing 200 mM imidazole.
The eluate was then applied to a Hi-Trap Heparin column (GE Healthcare, USA),
equilibrated in buffer C (20 mM Tris-HCl (pH 7.5), 400 mM NaCl, 5% glycerol, 1 mM
2-mercaptoethanol) and eluted in buffer A with a linear gradient of 400 to 1000 mM NaCl.
The eluate was then applied to a Hi-Trap SP HP column (GE Healthcare, USA), equilibrated
in buffer D (20 mM Tris-HCl (pH 7.5), 250 mM NaCl, 5% glycerol, 1 mM
2-mercaptoethanol) and eluted in buffer A with a linear gradient of 250 to 1000 mM NaCl. To
further improve the purity, Rad26-containing fractions from the Hi-Trap SP HP column were
applied to a Superdex 200 10/300 GL size exclusion column (GE Healthcare, USA),
equilibrated in buffer E (20 mM Tris-HCl (pH 7.5), 500 mM NaCl, 5% glycerol, 1 mM
DTT). The Rad26-containing fractions were pooled, concentrated to 2 mg/ml and stored at
−80 °C.
S. cerevisiae 12-subunit Pol II was purified essentially as previously described [31,32]. Briefly,
Pol II (with a recombinant protein A tag in the Rpb3 subunit) was purified by affinity
chromatography using an IgG column (GE Healthcare, USA), followed by further
purification using Hi-Trap Heparin and Mono Q anion exchange chromatography columns
(GE Healthcare, USA).
In vitro transcription assay
Pol II elongation complexes were assembled essentially as previously described [14,33].
Radioactively labeled 10-mer RNA was annealed to the template strand (TS) DNA by
heating at 95 °C for 2 minutes followed by slow cooling to room temperature (23 °C). 10
pmol of Pol II was incubated with 4 pmol of RNA-DNA hybrid for 10 minutes at room
temperature (23 °C), and then 2 minutes at 37 °C. To this, 10 pmol of biotin-labeled
non-template strand (NTS) DNA was added and incubated for 5 minutes at 37 °C followed by
incubation for 20 minutes at room temperature (23 °C). The assembled elongation complex
was incubated with 20 μl of Streptavidin magnetic beads (NEB, USA) for 30 minutes at
room temperature (23 °C) and subsequently washed with elongation buffer, EB (20 mM
Tris-HCl (pH 7.5), 5 mM MgCl2, 40 mM KCl, 5 mM DTT) followed by EB with 0.3 M
NaCl, EB with 1 M NaCl, EB with 0.3 M NaCl and finally EB. The efficiency of EC
assembly and bead association was estimated to be 20–50% based on the radioactivity of
bead-associated RNA. For transcription assays with the Py-Im polyamide roadblock, 0.8 μM of
Py-Im polyamides was incubated with the bead-associated elongation complexes in EB for 3
hours at room temperature. The beads were re-suspended in EB and used for transcription
assays.
All in vitro transcription reactions were initiated by adding an rNTP mixture to a final
concentration of 1 mM each. An additional 3 mM dATP was also included to support Rad26 ATPase
activity. After 5 minutes, Rad26 or TFIIS (final concentration of 100 nM) was added to the reaction
mix and kept at 30 °C. For transcription assays from the CPD lesion-containing template, Rad26
or TFIIS was added at the same time as the NTP mixture to visualize the early pausing sites
(1/3-3 min). After adding Rad26 or TFIIS, reactions were allowed to continue for the desired
time and then quenched (after 1, 3, 10, 30, or 60 min) by adding an equal volume of
quench-loading buffer (90% formamide, 50 mM EDTA, 0.05% xylene cyanol and 0.05%
bromophenol blue). Samples were boiled for 10 min at 95 °C in quench-loading buffer, and
the product RNA transcripts were analyzed by denaturing PAGE (6 M urea). The gel was
visualized by phosphorimaging and quantified using Image Lab software (BioRad, USA).
For the experiments using TFIIS as a probe to investigate whether Rad26 translocation helps
resolve induced backtracking in Pol II (Figure 5), 1 mM rNTPs was first added to Pol II
elongation complex with an A-tract template to start transcription. After 20 min of
transcription extension, the rNTPs were removed by washing the resin three times to
generate backtracked Pol II. 200 nM Rad26 and 3 mM dATP were then added to the reaction
mix, which was incubated at 30°C for 5 min. Finally, 100 nM TFIIS was added and the
reaction incubated at 30°C for 1, 2, 5, or 10 min before being quenched by adding an equal
volume of quench-loading buffer. Samples were boiled for 10 min at 95 °C in quench-
loading buffer, and the product RNA transcripts were analyzed by denaturing PAGE (6 M
urea). The gel was visualized by phosphorimaging and quantified using Image Lab software
(BioRad, USA).
DNA and RNA oligonucleotides and scaffolds used in this study
See Supplementary Tables 1 and 2.
Electrophoretic mobility shift assay (EMSA)
To examine the formation of Pol II-Rad26 complexes, an aliquot of 20 nM transcription
scaffold with radiolabeled RNA was incubated with 50 nM Pol II in the binding buffer (20
mM Tris, pH 7.5, 5 mM MgCl2, 5 mM DTT, 40 mM KCl, 50 mM NaCl, 5% glycerol, 0.1
mg/ml BSA) at 23 °C for 10 min to form the elongation complex. Rad26 was added at
specified concentrations and the reactions were incubated for an additional 30 minutes at
23 °C. The reactions were then run on a 4.5% native PAGE gel in TBE buffer (pH 8.0) with 2
mM MgCl2 for 2.5 h at 4 °C. Labeled Pol II EC and Pol II-Rad26 complexes were
visualized by phosphorimaging and quantified using Image Lab software (BioRad, USA).
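The quantification step can be illustrated with a minimal sketch. It assumes that binding is summarized as the fraction of labeled EC shifted into the Pol II-Rad26 band; the text does not spell out the formula used, the function name is hypothetical, and the band intensities below are made-up placeholders rather than data from this study.

```python
def fraction_bound(free_ec, shifted_complex):
    """Fraction of labeled EC found in the Rad26-shifted band.

    Both arguments are background-subtracted band intensities (arbitrary units).
    """
    total = free_ec + shifted_complex
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return shifted_complex / total


# Hypothetical intensities for a Rad26 titration (illustrative only):
# Rad26 concentration in nM -> (free EC band, shifted Pol II-Rad26 band)
lanes = {0: (1000.0, 0.0), 50: (600.0, 400.0), 150: (250.0, 750.0)}

for rad26_nM, (free, shifted) in lanes.items():
    print(f"{rad26_nM:>4} nM Rad26: fraction bound = {fraction_bound(free, shifted):.2f}")
```

Plotting the fraction bound against Rad26 concentration is one common way to compare apparent affinities between scaffolds.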
DNase I footprinting
An aliquot of 20 nM Pol II EC (with 5′-32P-labeled template DNA strand) was incubated
with 0-150 nM Rad26 in the binding buffer (see above) at 23 °C for 30 min. Then DNase I
(NEB, USA) was added to a final concentration of 0.04 units/ml and the digestion was
carried out for 1 minute (50 seconds if Rad26 was absent) at 23 °C. The reactions were
stopped by addition of 10 μl quench-loading buffer. DNA products were heat-denatured at
95 °C for 5 min and separated on a 7% denaturing PAGE gel. Labeled DNA products were
visualized by phosphorimaging and quantified using Image Lab software (BioRad, USA).
Preparation of the Pol II-Rad26 complex for electron microscopy
PAGE-purified RNA oligonucleotides were purchased from Dharmacon; template and
non-template DNA oligonucleotides were obtained from IDT. HPLC-purified CPD
lesion-containing template was purchased from TriLink. The RNA, template DNA (non-damaged
or CPD lesion-containing) and non-template DNA were annealed to form the scaffold as
described above. To form the Pol II EC, Pol II and the scaffold were incubated in elongation
buffer (20 mM Tris (pH 7.5), 40 mM KCl and 5 mM dithiothreitol (DTT)). To form the Pol
II-Rad26 complex, Rad26 was added to Pol II EC and incubated for 30 min at 23 °C. The
final buffer composition was 20 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 10 mM DTT, 40 mM
KCl, 100 mM NaCl and 2% glycerol and the final concentrations of the different
components were 330 nM Pol II, 600 nM Rad26, 300 nM template DNA, 330 nM
non-template DNA and 360 nM RNA. The same procedure was used to assemble Pol II EC
containing a site-specific CPD lesion (Pol II EC (CPD)), with the following
modification: to increase the randomness of Pol II EC (CPD) particle orientations on the
EM grid, the Pol II EC (CPD) complex (1.2 μM) was crosslinked with 1 mM BS3 (Thermo)
for 30 min at 23 °C, then quenched with 50 mM ammonium bicarbonate. The excess BS3
was then removed by overnight dialysis. The final concentration of the Pol II EC (CPD)
complex was 900 nM.
The sequences used for EC preparation are as follows: template DNA,
5′-CGCTCTGCTCCTTCTCCCATCCTCTCGATGGCTATGAGATCAACTAG-3′;
CPD lesion-containing template DNA,
5′-CGCTCTGCTCCTTCTCCXXTCCTCTCGATGGCTATGAGATCAACTAG-3′ (XX = CPD lesion);
non-template DNA,
5′-CTAGTTGATCTCATATTTCATTCCTACTCAGGAGAAGGAGCAGAGCG-3′;
RNA, 5′-AUCGAGAGGA-3′.
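As a consistency check on the scaffold design, the following sketch uses the sequences listed above to locate the non-complementary stretch between the two DNA strands and the site where the RNA pairs with the template. The helper names are illustrative and not from the original work, and the check assumes simple Watson-Crick pairing with equal-length, fully aligned strands.

```python
# Sequences copied from the text above, written 5'->3'.
TEMPLATE = "CGCTCTGCTCCTTCTCCCATCCTCTCGATGGCTATGAGATCAACTAG"
NON_TEMPLATE = "CTAGTTGATCTCATATTTCATTCCTACTCAGGAGAAGGAGCAGAGCG"
RNA = "AUCGAGAGGA"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def mismatched_positions(non_template, template):
    """0-based non-template positions that cannot pair with the template."""
    partner = reverse_complement(template)
    return [i for i, (a, b) in enumerate(zip(non_template, partner)) if a != b]


def hybrid_start(rna, template):
    """0-based template index (5'->3') paired with the RNA 3' end; -1 if absent."""
    footprint = reverse_complement(rna.replace("U", "T"))
    return template.find(footprint)


bubble = mismatched_positions(NON_TEMPLATE, TEMPLATE)
start = hybrid_start(RNA, TEMPLATE)
print("non-template mismatches:", bubble[0], "to", bubble[-1])
print("RNA-DNA hybrid on template:", start, "to", start + len(RNA) - 1)
```

With these sequences the two DNA strands are complementary except for a contiguous 15-nt stretch (non-template positions 15 to 29, 0-based), and the 10-mer RNA pairs with template positions 19 to 28. The XX dinucleotide of the CPD template occupies template positions 17 and 18, directly 5′ of the RNA footprint on the template, that is, the next bases to be transcribed.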
Electron microscopy
A 4 μl aliquot of the sample was applied to glow-discharged UltraAuFoil R 1.2/1.3
holey gold grids (Quantifoil Micro Tools GmbH, Germany) coated with a thin carbon layer
for Pol II EC-Rad26 and Quantifoil 1.2/1.3 holey carbon grids for Pol II EC (CPD). The
grids were blotted and plunge-frozen in liquid ethane using a Vitrobot (FEI, USA).
Automated data collection was performed using Leginon [34] on an FEI Talos Arctica (FEI,
USA) operated at 200 kV, equipped with a Gatan K2 Summit direct detector (Gatan Inc.,
USA). For the Pol II EC-Rad26 sample, 8,026 movies were recorded in ‘super-resolution
mode’ at a dose rate of 11.2 electrons/pixel/s with a total exposure time of 6 s sub-divided
into 200 ms frames, for a total of 30 frames. The images were recorded at a nominal
magnification of 36,000X resulting in an object-level pixel size of 1.2 Å/pixel
(0.6 Å/super-resolution pixel). The defocus range of the data was −0.5 μm to −5 μm. The Pol II EC (CPD)
sample adopted a strongly preferred orientation on the grid. To increase the number of
orientations, two datasets were collected, with and without bis(sulfosuccinimidyl)suberate
(BS3, ThermoFisher Scientific) [23,35,36]. A total of 3,690 micrographs were recorded using
the above-mentioned parameters except that the total exposure was 7.5 s instead of 6 s.
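The exposure parameters above imply a total electron dose that is not stated explicitly. The sketch below is a back-of-the-envelope conversion, assuming the quoted dose rate refers to the physical 1.2 Å pixel rather than the 0.6 Å super-resolution pixel; the derived values are estimates, not numbers reported by the authors.

```python
def dose_per_A2_per_s(dose_rate_e_per_px_s, pixel_size_A):
    """Convert a detector dose rate (e-/pixel/s) to a specimen dose rate (e-/A^2/s)."""
    return dose_rate_e_per_px_s / (pixel_size_A ** 2)


DOSE_RATE = 11.2     # electrons/pixel/s (from the text)
PIXEL_SIZE = 1.2     # Angstrom per physical pixel (assumed reference pixel)
EXPOSURE_MS = 6000   # total exposure in ms (6 s)
FRAME_MS = 200       # exposure per frame in ms

# Integer milliseconds avoid floating-point surprises in the frame count.
n_frames = EXPOSURE_MS // FRAME_MS
rate = dose_per_A2_per_s(DOSE_RATE, PIXEL_SIZE)
total_dose = rate * EXPOSURE_MS / 1000.0
dose_per_frame = total_dose / n_frames

print(f"frames: {n_frames}")
print(f"dose rate: {rate:.2f} e-/A^2/s")
print(f"total dose: {total_dose:.1f} e-/A^2")
print(f"dose per frame: {dose_per_frame:.2f} e-/A^2")
```

Under that assumption the 6 s exposure corresponds to roughly 47 e-/Å² spread over 30 frames, which matches the frame count given in the text.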
Image processing
The movie frames were aligned using a pre-release version of MotionCor2 [37] and the
dose-weighted frame alignment option. The aligned micrographs were then manually inspected
and unsuitable micrographs (having defects like broken carbon, thick ice, etc.) were
discarded. CTF estimation was performed on the non-dose-weighted aligned micrographs
using CTFFIND3 [38] and micrographs having a 0.5 confidence resolution for the CTF fit
worse than 8 Å (as determined in Appion) were excluded from further processing. For an
initial reconstruction, DoG Picker [39] was used to select particles from the dose-weighted
micrographs in a template-independent manner. Relion 1.4 [40,41] was used for the initial
round of 3D reconstruction. One more round of particle picking was performed using
FindEM [42] with 2D averages from the initial processing serving as templates and the particle
picks from this final round were used for further processing. CTF estimation and particle
picking were performed within the framework of Appion [43].
All subsequent processing was done using a pre-release version of Relion 2 [44] installed on
Amazon Web Services (AMI ID: ami-9caa71fc) [45]. Two-dimensional classification was
performed to identify “bad” particles. Only those particles that contributed to “good” 2D
class averages were used for further processing. Following 2D classification, an initial 3D
classification was performed using a Pol II EC model (PDB ID: 1Y77) [46] as reference and
only those particles corresponding to classes showing a strong additional density when
compared to EC were selected. The 2D and initial 3D classifications were carried out using
4X binned data (4.8 Å/pixel). For the rest of the steps unbinned images (1.2 Å/pixel) were
used. The detailed 3D classification and refinement scheme is shown in Extended Data Fig.
2d. The resolutions of the cryo-electron microscopy (cryo-EM) maps were estimated from
Fourier shell correlation (FSC) curves calculated using the gold-standard procedure and the resolutions are reported
according to the 0.143 cutoff criterion [47,48]. FSC curves were corrected for the convolution
effects of a soft mask applied to the half maps by high-resolution phase randomization [49].
The density maps were corrected for the modulation transfer function (MTF) of the detector
and were sharpened with an automatically estimated negative B factor as implemented in the
“post-process” routine of Relion.
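To make the 0.143 criterion concrete, here is a minimal sketch of how a resolution value can be read off an FSC curve. It is not the Relion implementation: it assumes the curve is given as shell frequencies with one FSC value each, uses linear interpolation between shells, and runs on a made-up curve rather than data from this study.

```python
def resolution_at_cutoff(freqs, fsc, cutoff=0.143):
    """Return the resolution (Angstrom) where the FSC first drops below `cutoff`.

    freqs: spatial frequencies in 1/Angstrom, strictly increasing.
    fsc:   FSC value for each frequency shell (assumed to start above the cutoff).
    Linear interpolation is used between the two shells that bracket the cutoff.
    Returns None if the curve never falls below the cutoff.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= cutoff > fsc[i]:
            t = (fsc[i - 1] - cutoff) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None


# Hypothetical FSC curve (illustrative only, not from the paper).
freqs = [0.05, 0.10, 0.15, 0.20, 0.25]
fsc = [0.99, 0.90, 0.50, 0.10, 0.02]

res = resolution_at_cutoff(freqs, fsc)
print(f"resolution at FSC = 0.143: {res:.2f} A")
```

The reported resolution is the reciprocal of the spatial frequency at the crossing point, so a curve that stays above the cutoff out to higher frequencies yields a smaller (better) number.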
In our initial EM map (Extended Data Fig. 2d, 4.5 Å Pol II EC-Rad26 complex structure)
the density for Rad26 was more fragmented compared to the rest of the complex. In order to
identify a subset of particles with strong density for Rad26, the signal for the rest of the EC
was computationally removed and then focused classification was performed using a mask
for Rad26 generated from the initial map [50]. For further processing only those classes that
showed a clear upstream DNA density and the complete density for the Rad26 ATPase
domain were selected. 19,231 particles (4% of the total particles retained after 2D
classification and 12% of the particles assigned to Pol II EC-Rad26 after the initial 3D
classification) were used for the final cryo-EM map of Pol II EC-Rad26 and the map
reached an overall resolution of 5.8 Å. Local resolution of the map was estimated using the
“local-resolution” routine in Relion 2 and was used to locally filter the maps. A cryo-EM
map of a Pol II EC was also reconstructed from the same dataset using the in silico sorting
scheme outlined in Extended Data Fig. 2d. The Pol II EC was reconstructed from 24,300
particles and the map reached an overall resolution of 6.4 Å.
For the Pol II EC (CPD) datasets, CTFFIND4 [51] was used for CTF estimation and 2D
averages of the Pol II EC-Rad26 complex were used as templates for particle picking using
FindEM. A total of 936,200 particles were selected from the two datasets and after several
rounds of 2D classifications 144,085 particles were retained for further processing. These
particles were refined against the Pol II EC-Rad26 structure using the heterogeneous
refinement regime in cryoSPARC [52]. 51,119 particles contributed to the final structure.
Analysis of the angular distribution of the particles revealed a strong orientation bias
(Extended Data Fig. 4e) preventing further classification or refinement of the structure.
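The particle counts quoted for the Pol II EC (CPD) datasets can be turned into retention fractions with a few lines of arithmetic. The sketch below uses only the three counts given in the text; the percentages are derived here and are not reported by the authors.

```python
def retention(kept, total):
    """Fraction of `total` particles that remain in `kept`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return kept / total


# Particle counts for the Pol II EC (CPD) datasets, from the text.
picked = 936_200     # particles selected by template picking
after_2d = 144_085   # retained after 2D classification
final = 51_119       # contributing to the final structure

print(f"kept after 2D classification: {retention(after_2d, picked):.1%}")
print(f"final / after 2D:             {retention(final, after_2d):.1%}")
print(f"final / picked:               {retention(final, picked):.1%}")
```

Roughly 15% of picked particles survive 2D classification and about 5% reach the final map.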
Model building
Rosetta [53–55] was used to build atomic models into the cryo-EM maps of Pol II EC-Rad26
and Pol II EC. Reference models for Rad26 and different subunits of Pol II were selected
based on homology detection using hidden Markov models as implemented in HHpred [56].
(Note: the model for yeast Snf2 protein from a recent cryo-EM structure of the
Snf2-nucleosome complex (PDB ID 5X0Y) [22], which we compare with the Rad26 structure in
Figure 4 and Extended Data Figure 8, was not included in our model building.) Reference
models were aligned to the EM map using Chimera. Twenty top-scoring homologous
models were used as input in RosettaCM, which then rebuilt missing regions guided by
density, and refined the resulting structures using the Rosetta force field augmented with
fit-to-density energy [54]. The starting model for the transcription scaffold was generated from
previous structures of the transcribing yeast (PDB ID: 5C4X) [57] and mammalian (PDB ID:
5FLM) Pol II [28], and was refined using PHENIX real space refinement [58] with nucleic
acid-specific LibG restraints. Initially the Pol II EC-Rad26 map was divided into three regions:
Rad26, Pol II core and Rpb4/7 stalk and the atomic models for each of these regions were
refined separately. In each case several output models were generated (2,000 for Rad26 and
320 each for Pol II core and Rpb4/7 stalk). For each region, the conformation with the best
Rosetta energy (including fit-to-density energy) was used for subsequent steps. Regions
poorly matching the density following RosettaCM were manually deleted from the
templates, and rebuilt de novo.
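The per-region selection step described above (keeping, for each region, the output model
with the best Rosetta energy) can be illustrated with a minimal Python sketch. It assumes
that each output model has already been reduced to a name and a total energy that includes
the fit-to-density term, and that lower energies are better, as is the Rosetta convention.
The `ScoredModel` type, the `pick_best` helper, and all model names and energy values are
hypothetical and are not taken from this study.

```python
from typing import NamedTuple, Sequence


class ScoredModel(NamedTuple):
    """One output model and its total energy (hypothetical record layout)."""
    name: str
    total_energy: float  # assumed to include the fit-to-density term; lower is better


def pick_best(models: Sequence[ScoredModel]) -> ScoredModel:
    """Return the model with the lowest (best) total energy."""
    if not models:
        raise ValueError("no models to choose from")
    return min(models, key=lambda m: m.total_energy)


if __name__ == "__main__":
    # Made-up energies for three hypothetical output models of one region.
    outputs = [
        ScoredModel("region_model_0001", -1520.4),
        ScoredModel("region_model_0002", -1533.9),
        ScoredModel("region_model_0003", -1527.1),
    ]
    print(pick_best(outputs).name)
```

The same helper would be applied once per region (Rad26, Pol II core, Rpb4/7 stalk) before
the combined refinement.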
Once a converged solution was arrived at for all three regions (Rad26, Pol II core and
Rpb4/7 stalk), a combined atomic model was refined against the complete EM map of Pol II
EC-Rad26. This was carried out using RosettaCM, where 200 models were generated. A
final refinement step was performed against one of the half-maps from the 4.5 Å structure of
Pol II EC-Rad26 (the “training half-map”) using Rosetta’s Relax protocol⁵³,⁵⁵ to optimize
the positions and geometry of the amino acid side chains. Model geometry was verified
using MolProbity⁵⁹ (Extended Data Fig. 3g). To estimate over-fitting, FSC_work (FSC curve
between the refined model and training half-map) and FSC_free (FSC curve between the
refined model and the other half-map, the “test half-map”) were calculated. All figures and
difference maps were generated using UCSF Chimera⁶⁰ and the maps were segmented using
Segger⁶¹ as implemented in UCSF Chimera.
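As an illustration of how the FSC_work and FSC_free curves can be compared once they have
been computed, the following Python sketch finds where each curve first drops below a
threshold and reports the largest gap between the two curves. It is only a sketch under
stated assumptions: both curves are sampled on the same resolution shells, the crossing is
located by linear interpolation, and the function names and all numerical values are
hypothetical rather than taken from this study. A large gap between FSC_work and FSC_free
would suggest over-fitting to the training half-map.

```python
from typing import Optional, Sequence


def first_crossing(freqs: Sequence[float], fsc: Sequence[float],
                   threshold: float) -> Optional[float]:
    """Spatial frequency (1/Å) at which the curve first drops below `threshold`.

    Linear interpolation is used between the two bracketing shells. Returns None
    if the curve starts below the threshold or never crosses it.
    """
    if len(freqs) != len(fsc):
        raise ValueError("freqs and fsc must have the same length")
    if not fsc or fsc[0] < threshold:
        return None
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            return freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
    return None


def max_gap(fsc_work: Sequence[float], fsc_free: Sequence[float]) -> float:
    """Largest amount by which FSC_work exceeds FSC_free in any shell."""
    if len(fsc_work) != len(fsc_free):
        raise ValueError("curves must be sampled on the same shells")
    return max(w - f for w, f in zip(fsc_work, fsc_free))


if __name__ == "__main__":
    # Hypothetical curves sampled on five shells (spatial frequency in 1/Å).
    freqs = [0.05, 0.10, 0.15, 0.20, 0.25]
    work = [0.99, 0.95, 0.80, 0.45, 0.10]
    free = [0.99, 0.94, 0.78, 0.40, 0.08]
    for label, curve in (("work", work), ("free", free)):
        f = first_crossing(freqs, curve, 0.5)
        print(label, "crosses FSC = 0.5 at", None if f is None else round(1.0 / f, 2), "Å")
    print("largest work-free gap:", round(max_gap(work, free), 3))
```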
Data availability
Cryo-EM maps have been deposited in the EM Data Bank, with the following accession
numbers: higher-resolution (4.5 Å) Pol II EC-Rad26 (EMDB: EMD-8736); final map for Pol
II EC-Rad26 (EMDB: EMD-8735); locally filtered version of EMD-8735 (EMDB:
EMD-7038); Pol II EC (EMDB: EMD-8737); and Pol II EC (CPD) (EMDB: EMD-8885).
The atomic models have been deposited in the Protein Data Bank, with the following
accession numbers: Pol II EC-Rad26 complex (PDB ID: 5VVR); and Pol II EC complex
(PDB ID: 5VVS).
Extended Data
Extended Data Figure 1. Sequence alignment of the ATPase core domains of CSB family
members
Protein sequences from the CSB ATPase core region from S. cerevisiae, S. pombe,
A. thaliana, D. rerio, M. musculus and H. sapiens were aligned using Clustal Omega.
Residues are numbered based on the sequence of the S.c. CSB ortholog (Rad26). Conserved residues
are highlighted in red and helicase-specific motifs are boxed in black and labeled with
roman numerals. The flexible disordered loop regions that were not built into the cryo-EM
density are indicated, as are the SWI2/SNF2-specific domains HD1 and HD2.
Extended Data Figure 2. Cryo-EM reconstructions of the Pol II EC-Rad26 and Pol II EC
complexes
a, Representative micrograph of Pol II EC-Rad26 complexes. The scale bar represents 100
nm. b, Power spectrum of the micrograph in a showing Thon rings out to 3.4 Å. c,
Representative 2D class averages of the Pol II EC-Rad26 complex. d, Schematic
representation of the strategy used to sort out the data sets into Pol II EC and Pol II
EC-Rad26 complex structures. Unless otherwise noted, 3D classification was performed without
image alignment. Colored, segmented maps indicate those classes whose particles were used
for further processing. The color scheme used in the segmented maps is as follows: grey: Pol
II, orange: Rad26, green: transcription scaffold. Black lines follow the classification scheme
used to extract homogeneous Pol II EC-Rad26 particles; blue lines follow the classification
scheme used to extract homogeneous Pol II EC particles. The refined maps for the
higher-resolution Pol II EC-Rad26 complex (with fragmented Rad26 density), final Pol II
EC-Rad26 complex and Pol II EC are highlighted with green, black and blue boxes, respectively.
The indicated resolution corresponds to the 0.143 FSC based on gold standard FSC curves.
The number of particles contributing to each selected structure is indicated. The
percentages shown are relative to the total number of particles selected after 2D
classification. e, f, g, Front and back views of locally-filtered maps colored by local
resolution. h, Euler angle distribution of particle images for the maps shown in e-g. i,
Fourier Shell Correlation (FSC) plots for the higher-resolution Pol II EC-Rad26 complex
(with fragmented Rad26 density), final Pol II EC-Rad26 complex and Pol II EC maps with
the resolution at 0.143 FSC indicated. j, Representative near-atomic resolution regions in Pol
II from the locally-filtered higher-resolution (4.5 Å) Pol II EC-Rad26 map. The density is
shown in transparent grey with the atomic model for Pol II EC-Rad26 complex fitted in the
map. The β-sheet corresponds to residues 346-356, 440-446, and 486-493 in Rpb1, and
1104-1107 in Rpb2. The portion of the bridge helix shown here corresponds to residues
810-829 in Rpb1.
Extended Data Figure 3. Three-dimensional classification of Pol II EC-Rad26 complex data and
Rosetta model validation
a, Table summarizing the main statistics from data collection, refinement and model
validation. b, c, Root mean square deviation (RMSD) of the protein backbones among the
top five conformations (based on Rosetta energy) of Pol II EC-Rad26 complex (b) and Pol II
EC (c) generated by RosettaCM. In both cases the best Rosetta energy model is shown as a
worm model, with thickness and color representing the backbone RMSD. The transcription
scaffolds were not included in the RMSD calculation and were omitted for clarity. d,
Backbone RMSD between the atomic models of Pol II EC-Rad26 complex and Pol II EC
shown on the atomic model of Pol II EC-Rad26 complex using the same representation used
in b and c. The models were globally aligned to each other in Chimera (UCSF) and only
those parts of the model for which RMSD calculation could be performed are shown. e, f,
FSC curves between the atomic model and cryo-EM maps for Pol II EC-Rad26 complex (e)
and Pol II EC (f). In e, FSC_work and FSC_free were calculated using half maps from the
higher-resolution Pol II EC-Rad26 complex structure. The 0.5 FSC line is shown. g,
MolProbity statistics for the Pol II EC-Rad26 complex and Pol II EC models. RSCC: Real
Space Correlation Coefficient, as implemented in EMRinger⁶². The RSCC value shown in
parentheses for Pol II EC-Rad26 complex is for the higher-resolution (4.5 Å) map with
fragmented Rad26 density. h, i, Three different views of the Pol II EC-Rad26 map with
models docked in (h), and close-up views of the Pol II-Rad26 interface (i).
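The backbone RMSD comparisons in panels b-d can be illustrated with a minimal Python
sketch. It assumes the two models have already been superimposed (the legend states that
global alignment was done in Chimera), that each model is reduced to one backbone
coordinate per residue, and that only residues present in both models are compared. The
function names, residue numbers and coordinates are hypothetical; the per-residue
deviations are the kind of values that could be mapped onto worm thickness and color.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def per_residue_deviation(model_a: Dict[int, Coord],
                          model_b: Dict[int, Coord]) -> Dict[int, float]:
    """Distance (Å) between matching residues of two pre-aligned models.

    Residues missing from either model are skipped.
    """
    shared = sorted(set(model_a) & set(model_b))
    return {res: math.dist(model_a[res], model_b[res]) for res in shared}


def rmsd(deviations: Dict[int, float]) -> float:
    """Root mean square of the per-residue deviations."""
    if not deviations:
        raise ValueError("no shared residues to compare")
    return math.sqrt(sum(d * d for d in deviations.values()) / len(deviations))


if __name__ == "__main__":
    # Hypothetical backbone coordinates keyed by residue number.
    model_a = {10: (0.0, 0.0, 0.0), 11: (3.8, 0.0, 0.0), 12: (7.6, 0.0, 0.0)}
    model_b = {10: (0.0, 0.5, 0.0), 11: (3.8, 1.0, 0.0)}  # residue 12 not built
    dev = per_residue_deviation(model_a, model_b)
    print(dev, round(rmsd(dev), 3))
```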
Extended Data Figure 4. Cryo-EM reconstruction of a Pol II EC containing a CPD lesion
a, Representative micrograph of Pol II EC (CPD). b, Power spectrum of the micrograph in
(a). c, Representative 2D class averages of the Pol II EC (CPD) complex. d, Fourier Shell
Correlation (FSC) plot for the final Pol II EC (CPD) map with the resolution at 0.5 FSC
indicated. e, Euler angle distribution of particle images. f, Table summarizing data collection
statistics. g-k, Strategy for generating difference map between Pol II EC-Rad26 and Pol II
EC (CPD). We took the model for the Pol II EC-Rad26 complex (g), removed Rad26 (h),
and converted the resulting model into a cryo-EM-like density (i). From this, we subtracted
the Pol II EC (CPD) map (j) to obtain the difference map (k). l, Two views of the Pol II EC
(CPD) map. m, Model of the Pol II EC complex after removal of Rad26 (h) docked into the
Pol II EC (CPD) map. n, Same as in (m) with the difference map superimposed.
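The subtraction step of the difference-map strategy in panels g-k can be sketched in Python
as follows. This is an illustration only, under assumptions that the legend does not state:
the model-derived density and the experimental map are taken to be sampled on the same
grid and already aligned, each is flattened to a list of voxel values, and both are
normalized to zero mean and unit variance before subtraction so that they are on a
comparable scale. The function names and voxel values are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def normalize(voxels: Sequence[float]) -> List[float]:
    """Scale voxel values to zero mean and unit variance (assumed pre-processing)."""
    mu = fmean(voxels)
    sigma = pstdev(voxels)
    if sigma == 0:
        raise ValueError("map has no density variation")
    return [(v - mu) / sigma for v in voxels]


def difference_map(model_density: Sequence[float],
                   experimental_map: Sequence[float]) -> List[float]:
    """Voxel-wise (model-derived density) minus (experimental map)."""
    if len(model_density) != len(experimental_map):
        raise ValueError("maps must be sampled on the same grid")
    a = normalize(model_density)
    b = normalize(experimental_map)
    return [x - y for x, y in zip(a, b)]


if __name__ == "__main__":
    # Tiny hypothetical maps with four voxels each.
    simulated = [0.0, 1.0, 0.0, 1.0]
    experimental = [0.0, 0.0, 1.0, 1.0]
    print(difference_map(simulated, experimental))
```

Positive voxels mark density present in the model-derived map but weaker in the
experimental map, and negative voxels mark the reverse.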
Extended Data Figure 5. Alignment of the HD2-1 region of CSB and non-CSB members of the
SWI/SNF superfamily of ATPases
The HD2-1 region corresponds to the “wedge” motif in the Pol II EC-Rad26 structure (see
Fig. 3e). See Extended Data Figure 1 for the location of the HD2-1 region within the full
ATPase domain. Residues are colored (according to physicochemical properties) when
conserved in at least half of the sequences.
Extended Data Figure 6. EMSA assays reveal that the strength of base pairing at the upstream
fork of the transcription bubble, not CPD lesions at the downstream fork, affects the
interaction of Rad26 with Pol II EC
a, The sequence of the scaffold used in this study. The nucleotides labeled as XXX and
YYY were varied in these experiments to control the strength of the base pairing at the
upstream fork of the transcription bubble. b, Electrophoretic mobility shift assay (EMSA)
between Rad26 and Pol II EC with an AT-rich sequence at the upstream fork of the DNA
bubble. c, EMSA between Rad26 and Pol II EC with a GC-rich sequence at the upstream
fork of the DNA bubble. d, Quantitation of the assays shown in b, c. Data shown as mean
and standard deviation (n = 3). P-values: not shown = not significant; * = <0.05; ** = <0.01;
*** = <0.001; **** = <0.0001. Precise p-values shown in Extended Data Table 1. e,
Modeled structure of Pol II in complex with the mini-scaffold. Rad26, from the Pol II
EC-Rad26 complex structure, was included as a semi-transparent ribbon diagram to indicate the
lack of interaction between it and the mini-scaffold. Mini-scaffolds that eliminate the
upstream DNA to which Rad26 binds were used to form elongation complexes (mini-ECs)
with Pol II, and the interaction between these mini-ECs and Rad26 was tested using EMSA.
f, DNA/RNA scaffolds used in this experiment. In order to rule out the possibility that
Rad26 may bind to dsDNA in a non-specific manner, a scaffold with only RNA and TS
(Scaffold 2) was also tested. g, h, EMSA with Scaffold 1 (g) and Scaffold 2 (h) showing
formation of a Pol II mini-EC-Rad26 complex. The experiment was repeated independently
twice with similar results. i, Scaffolds with or without a CPD lesion (see Methods for
details) were used to form elongation complexes with Pol II, and the interaction between
them and Rad26 was tested using EMSA. j, Quantitation of data in (i). Data shown as mean
and standard deviation (n = 3). All biochemical experiments were repeated independently 3
times with similar results, except 2 times for g and h. For gel source data, see
Supplementary Fig. 1.
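The reporting convention used in panels d and j (mean and standard deviation of three
replicates, with asterisks encoding p-value thresholds) can be sketched in Python as
follows. The asterisk thresholds are those given in the legend; everything else is an
assumption for illustration: the sample (n - 1) standard deviation is used because the
legend does not say which form was applied, the statistical test that produced the
p-values is not modeled, and the function names and replicate values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

# Thresholds from the legend, checked from most to least stringent.
THRESHOLDS = [(0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")]


def summarize(replicates: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of replicate measurements."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates for a standard deviation")
    return mean(replicates), stdev(replicates)


def p_to_stars(p: float) -> str:
    """Map a p-value to the asterisk label; an empty string means not significant."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    for cutoff, stars in THRESHOLDS:
        if p < cutoff:
            return stars
    return ""


if __name__ == "__main__":
    # Hypothetical fraction-bound values from three independent EMSA replicates.
    m, sd = summarize([0.42, 0.47, 0.45])
    print(round(m, 3), round(sd, 3), repr(p_to_stars(0.004)))
```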
Extended Data Figure 7. Overlap between the binding sites of Rad26 and Spt4/Spt5 on Pol II
a, c, Structure of the Pol II EC-Rad26 complex with Rad26 and the DNA/RNA scaffold
shown in surface representation. b, d, Structure of Pol II EC bound to Spt4/Spt5 and TFIIS
(PDB ID: 5XON) with Spt4 and Spt5 shown in surface representation. The different
domains of Spt5 are indicated. e, Rad26 and the DNA/RNA scaffold from (a) are
superimposed on Spt4/Spt5 from (b). f, Rotated view of (e). g, Rad26 and the DNA/RNA
scaffold from (c) are superimposed on Spt4/Spt5 from (d). h, Rotated view of (g). The
bicolor arrows indicate clashes between Rad26 or the DNA/RNA scaffold and Spt4/Spt5.
Extended Data Figure 8. Alignment between Snf2 and Rad26
a, This panel is identical to Fig. 4b and is included here as a reference. b, Superposition
between Rad26, bound to the transcription scaffold, and Snf2 from the cryo-EM structure of
the Snf2-nucleosome complex (PDB ID: 5X0Y), with the nucleosome included in the
image. This is the same alignment shown in Fig. 4a-c and panel a, and was driven
exclusively by Snf2 and Rad26. This view is rotated by 180° about the vertical axis relative
to (a). The dashed box marks the portion of the structure equivalent to that shown in (a). The
back gyre of the nucleosome was faded out for clarity. c, Same view as in (b) with Snf2 and
Rad26 removed to illustrate the superposition of the Rad26-bound portion of the
transcription scaffold and the nucleosomal DNA. d-g, Alignment of Rad26 and Snf2. The
superimposed structures are shown in two orientations (d, f), with (d) corresponding to the
direction indicated by the symbol in (a). A worm model is used to represent the similarity
between the two structures (e, g), with thickness and color indicating the backbone RMSD.
The thin wire corresponds to regions in the Rad26 model that are not present in Snf2.