of 14
Intelligent Resolution: Integrating Cryo-EM with AI-driven
Multi-resolution Simulations to Observe the SARS-CoV-2
Replication-Transcription Machinery in Action
Anda Trifan
1
,
2
, Defne Gorgun
1
,
2
, Zongyi Li
3
, Alexander Brace
1
,
4
, Maxim Zvyagin
1
, Heng Ma
1
,
Austin Clyde
1
,
4
, David Clark
5
, Michael Salim
1
, David J. Hardy
2
, Tom Burnley
6
, Lei Huang
7
, John
McCalpin
7
, Murali Emani
1
, Hyenseung Yoo
1
, Junqi Yin
8
, Aristeidis Tsaris
8
, Vishal Subbiah
9
,
Tanveer Raza
9
, Jessica Liu
9
, Noah Trebesch
2
, Geo
￿
rey Wells
10
, Venkatesh Mysore
5
, Thomas
Gibbs
5
, James Phillips
1
, S. Chakra Chennubhotla
11
, Ian Foster
1
,
4
, Rick Stevens
1
,
4
, Anima
Anandkumar
3
,
5
, Venkatram Vishwanath
1
, John E. Stone
2
, Emad Tajkhorshid
2
, Sarah A.
Harris
12
, Arvind Ramanathan
1
1
Argonne National Laboratory,
2
University of Illinois Urbana-Champaign,
3
California Institute of Technology,
4
University of Chicago,
5
NVIDIA,
6
Science and Technology Facilities Council,
7
Texas Advanced Computing Center,
8
Oak Ridge National Laboratory,
9
Cerebras Inc.,
10
University College of London,
11
University of Pittsburgh,
12
University of Leeds
Joint
￿
rst authors,
Contact authors: s.a.harris@leeds.ac.uk, anima@caltech.edu, venkat@anl.gov, johns@ks.uiuc.edu,
emad@illinois.edu, ramanathana@anl.gov
ABSTRACT
The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2)
replication transcription complex (RTC) is a multi-domain protein
responsible for replicating and transcribing the viral mRNA inside
a human cell. Attacking RTC function with pharmaceutical com-
pounds is a pathway to treating COVID-19. Conventional tools,
e.g., cryo-electron microscopy and all-atom molecular dynamics
(AAMD), do not provide su
￿
ciently high resolution or timescale
to capture important dynamics of this molecular machine. Con-
sequently, we develop an innovative work
￿
ow that bridges the
gap between these resolutions, using mesoscale
￿
uctuating
￿
nite
element analysis (FFEA) continuum simulations and a hierarchy
of AI-methods that continually learn and infer features for main-
taining consistency between AAMD and FFEA simulations. We
leverage a multi-site distributed work
￿
ow manager to orchestrate
AI, FFEA, and AAMD jobs, providing optimal resource utilization
across HPC centers. Our study provides unprecedented access to
study the SARS-CoV-2 RTC machinery, while providing general
capability for AI-enabled multi-resolution simulations at scale.
KEYWORDS
multi-resolution simulations, SARS-CoV-2, COVID19, HPC, AI
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for pro
￿
t or commercial advantage and that copies bear this notice and the full citation
on the
￿
rst page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior speci
￿
c permission and/or a
fee. Request permissions from permissions@acm.org.
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
©
2020 Association for Computing Machinery.
ACM ISBN ISBN...$15.00
https://doi.org/
￿
nalDOI
ACM Reference Format:
Anda Trifan
1
,
2
, Defne Gorgun
1
,
2
, Zongyi Li
3
, Alexander Brace
1
,
4
, Maxim
Zvyagin
1
, Heng Ma
1
, Austin Clyde
1
,
4
, David Clark
5
, Michael Salim
1
, David
J. Hardy
2
, Tom Burnley
6
, Lei Huang
7
, John McCalpin
7
, Murali Emani
1
,
Hyenseung Yoo
1
, Junqi Yin
8
, Aristeidis Tsaris
8
, Vishal Subbiah
9
, Tanveer
Raza
9
, Jessica Liu
9
, Noah Trebesch
2
, Geo
￿
rey Wells
10
, Venkatesh Mysore
5
,
Thomas Gibbs
5
, James Phillips
1
, S. Chakra Chennubhotla
11
, Ian Foster
1
,
4
,
Rick Stevens
1
,
4
, Anima Anandkumar
3
,
5
, Venkatram Vishwanath
1
, John
E. Stone
2
, Emad Tajkhorshid
2
, Sarah A. Harris
12
, Arvind Ramanathan
1
.
2020. Intelligent Resolution: Integrating Cryo-EM with AI-driven Multi-
resolution Simulations to Observe the SARS-CoV-2 Replication-Transcription
Machinery in Action. In
Supercomputing ’21: International Conference for
High Performance Computing, Networking, Storage, and Analysis.
ACM, New
York, NY, USA, 14 pages. https://doi.org/
￿
nalDOI
1 JUSTIFICATION
We developed an AI-enabled multi-resolution simulation frame-
work for studying complex biomolecular machines by directly inte-
grating experimental data. Our framework sets high-water marks
for AI-driven multi-resolution simulations and achieving high uti-
lization of resources across diverse supercomputing platforms at
multiple sites.
2 PERFORMANCE ATTRIBUTES
Performance Attribute
Our Submission
Category of achievement
Scalability, Time-to-solution
Type of method used
Explicit, Deep Learning
Results reported on the basis of Whole application including I/O
Precision reported
Mixed Precision
System scale
Measured on full system
Measurement mechanism
Hardware performance counters,
Application timers,
Performance Modeling
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
Trifan et al.
Ensemble Continuum
Simulations (FFEA)
All-atom Ensemble MD Simulations
(NAMD)
Cryo-EM maps
Hierarchical AI Methods for
computational steering
Initial best
guess
Global
conformation
al
fl
uctuations
Atomistic (re
fi
ned)
Cryo-EM models
Atomistically
correct sub-
domain
orientations
Interface
potentials
Novel conformational
states
Localized
fl
uctuations
1
2
3
4
5
6
Protein Data
Bank
Individual protein
“parts”
Figure 1: An integrative biology framework for re
￿
ning
low resolution cryo-EM structures with multi-resolution
simulations. (1) Representing the cryo-EM density map as
a continuum visco-elastic solid. (2) Finite element analy-
sis simulations are then used to generate new conforma-
tions. AI techniques identify interesting events in the land-
scape (global conformational changes), while (3) simultane-
ously constraining them with all-atom simulations derived
protein-protein interface potentials. (4) AI methods are also
used to learn local conformational changes across the molec-
ular machine, such that they can be used to (5) re
￿
ne do-
main orientations in the entire biomolecular complex. (6)
The output represents a set of atomistically re
￿
ned ensem-
ble of structures that captures the conformational
￿
uctua-
tions embodied in the cryo-EM data.
3 OVERVIEW OF THE PROBLEM
The novel coronavirus 2019 (COVID-19) pandemic has led to a
massive acceleration in the pace of development in experimental
structural biology. Multiple research teams have invested their
best resources towards the common goal of characterizing viral
components and their biological functions, thereby providing a
sign-post for the future direction of the
￿
eld (Alam and Higgins,
2020, Bárcena et al., 2021, Barrantes, 2021, Kim and Jung, 2021).
Structural biology data, which provides the basis for rational design
of all new medicines against human and animal disease, is now
inherently multi-modal and multi-scale, requiring novel integrative
approaches (AlQuraishi, 2019, Arantes et al., 2020, Jumper et al.,
2021, Minkyung et al., 2021, Muratov et al., 2021, Padhi et al., 2021,
Tunyasuvunakool et al., 2021, Zimmerman et al., 2020).
While the mechanism of human host cell entry and infection
by SARS-CoV-2 via the spike glycoprotein is now relatively well
characterized (Barros et al., 2020, Shang et al., 2020, Sztain et al.,
2021, Zhang et al., 2021), how the SARS-CoV-2 replicates inside the
host cell is still unclear (Romano et al., 2020). The viral-RNA replica-
tion mechanism is complex, involving RNA synthesis, proofreading,
and capping and is mainly carried out by the mini-replication tran-
scription complex plus error correction machinery (mRTC+ECM
from here on referred to as RTC), having to survive against the
human immune response. Cryo-EM techniques and computational
methods have been immensely helpful in elucidating the overall
structural organization of the RTC (Chen et al., 2020, Perry et al.,
2021, Yan et al., 2020, 2021a), but the high intrinsic
￿
exibility, size
and complexity of the nsp arrangement entails that the overall res-
olution of the data is inherently poor. Consequently, the structure
re
￿
nement work
￿
ows discard 30-40% of collected images from the
existing RTC complex datasets (Chen et al., 2020, Yan et al., 2020,
2021a), leaving signi
￿
cant gaps in our understanding.
Although several studies have focused on disrupting the func-
tion of the individual non-structural proteins (nsps) with small
molecules, key insights into the overall structural organization,
dynamics and function of the RTC machinery are more di
￿
cult to
obtain. This is crucial, because the ability to target protein-protein
interactions between subunits of the RTC complex o
￿
ers far more
possibilities for drug development. Moreover, the arrangement of
individual protein components of the RTC+ECM protein is itself
dynamic during the viral life-cycle. The computational capability to
model these interactions would provide further insight of relevance
to drug development, but is currently impossible without novel
multi-scale models and the work
￿
ows that connect them.
The primary challenge of experimental imaging is elucidating
diverse structural dynamics. This stems from the
averaging
process
of the imaging data: cryo-EM, in particular, and other experimental
techniques capture only the most sampled conformational states
as static, snapshot-like representations, but the intermediates or
transitional states are less represented (Lyumkis, 2019, Merk et al.,
2016). The details of motions within
￿
exible domains can be en-
riched using complimentary tools such as molecular dynamics (MD)
simulations and Bayesian inference techniques (Bowerman et al.,
2017, Bratholm et al., 2015, Cavalli et al., 2007, Grishaev and Llinás,
2005, Scheres, 2012); however, the timescales accessible to these
atomistic simulations can be a limiting factor. In addition, advances
in 4D imaging modalities (Earnest et al., 2017, Engel et al., 2015,
Mahamid et al., 2016, Villa and Lasker, 2014) and the volume of data
generated from such experimental datasets can be overwhelming.
Therefore, in this paper we address the urgent, yet unmet need to
develop scalable computational tools that can aid the improvement
of resolution within cryo-EM datasets through multi-resolution
simulations. In an e
￿
ort to bridge the gap between experimental
and purely all-atom molecular dynamics (AAMD), we leverage
a complementary mesoscale method of representing biophysical
systems, treating biomolecules as visco-elastic continuum solids
using
￿
uctuating
￿
nite element analysis (FFEA) (Oliver et al., 2013)
technique. These continuum-scale lower resolution FFEA simula-
tions provide a generative model for the cryo-EM data. However,
implementing an approach that directly models electron density
information from cryo-EM data requires a radically di
￿
erent way
to model conformational ensembles, one that moves away from
atomistic-resolution towards a
continuum
-representation, where
by the intrinsic resolution of the data can be captured with nodes
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Intelligent Resolution: Integrating Cryo-EM with AI-driven Multi-resolution Simulations to Observe the SARS-CoV-2 Replication-Transcription Machinery in Action
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
90°
Van der Waals interaction energies
Figure 2: Hybrid structure of the FFEA mesh superimposed with the all-atom representation. The all-atom structure of the
RTC dimer is shown as a cartoon (blue) and the FFEA tetrahedral mesh structure determined from the experimental cryo-EM
map is shown as a wireframe. The top inset represents a 90
rotation of the RTC dimer capturing the range of protein-protein
interfaces in the machine. A close-up view of the mesh at the interface between the RdRp (nsp12) and nsp13 reveals the surface
interface potentials inferred from AAMD simulations. Each mesh point is painted with the surface interface potentials, with
darker shades of red indicating higher and lighter shades indicating lower interaction energy in the protein-protein interface.
and meshes, allowing experimentally obtained cryo-EM data to
be used directly in simulations, at time and length-scale orders of
magnitude increased over traditional all-atom methodologies.
We present a radically innovative work
￿
ow, providing the op-
portunity to accelerate conformational sampling (both AAMD and
FFEA) with AI, thus holding much promise in providing mechanis-
tic insights into the dynamics of large biomolecular complexes (Fig.
1). We leverage FFEA and AAMD simulations to iteratively
￿
ll in
the gaps of cryo-EM data by re
￿
ning the pairwise atomic/molecular
interactions while being automatically constrained by the degrees
of freedom embodied in the cryo-EM data. This iterative coarse-
graining approach maintains the
￿
delity between experimentally
observed densities with the detailed potentials from rigorous physics-
based AAMD simulations to learn informative
priors
and e
￿
ectively
guide the conformational search and can result in better
￿
ts to
the experimental observations. We demonstrate our iterative ap-
proach in modeling the complex conformational transitions within
the SARS-CoV-2 RTC to observe how the various nsps coordinate
the proofreading process of the viral-RNA. These conformational
changes can aid in understanding and guiding the design of novel
therapeutics such as molnupiravir, which aim to target speci
￿
c
nsps. (Agostini et al., 2019, Sheahan et al., 2020)
4 CURRENT STATE OF THE ART
4.1 Parallel molecular dynamics
NAMD (Phillips et al., 2005, 2020) has been one of the most utilized
parallel molecular dynamics engines for over two decades, being
cited once every
70 minutes
1
. NAMD uses adaptive, asynchro-
nous, message-driven execution based on Charm++ (Kalé et al., 2019,
Kalé and Zheng, 2013), while e
￿
ciently utilizing GPUs (Phillips
et al., 2008). It contains advanced features giving scientists an exten-
sive set of tools to observe and/or bias their systems using state-of-
the-art simulation methodologies such as collective variables (Fiorin
et al., 2013) and molecular dynamics
￿
exible
￿
tting (MDFF) (Tra-
buco et al., 2009, Vant et al., 2020) modules.
VMD and its psfgen plugin were used to build and analyze the
molecular complexes studied herein. Simulation preparation, visu-
alization, and conventional analysis approaches were performed
using VMD, with extensive use of GPU-accelerated remote visu-
alization resources (Humphrey et al., 1996a, Stone et al., 2013a,b,
2016). VMD incorporates features for processing cryo-EM density
maps to prepare and analyze
md
￿
hybrid
￿
tting simulations as
required in this project (Stone et al., 2014). VMD incorporates a
custom GPU-accelerated ray tracing engine that exploits hardware
accelerated construction of ray tracing BVH acceleration structures,
BVH traversal, and ray-triangle intersections (Sener et al., 2021).
4.2 FFEA: continuum molecular simulations
Recent advances in cryo-EM, and -ET aided scientists to generate
structural and dynamic information of large biomolecules in 3D
volumetric data (Kühlbrandt, 2014). Expanding on this volumetric
shape, FFEA uses a 3D tetrahedral
￿
nite element mesh providing
dynamic insight of the structure alone and in response to interac-
tions with other molecules (Solernou et al., 2018). FFEA is a new
1
https://www.ks.uiuc.edu/Highlights/?section=2021&highlight=2021-09
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
Trifan et al.
nsp13-1A
nsp14A
nsp10A
nsp14B
nsp10B
nsp9A
nsp12A
nsp8-2A
nsp13-2A
nsp8-1A
A
B
Initial Structure
Final Structure
RNA unwinding
Figure 3: The atomistic representation of a functional RTC monomer (A) with the RNA (red), extended to the active site of the
other nsp14-10 unit of the opposite monomer (nsp14B - orange), and the exit site on the helicase of the same monomer (nsp13-
2A). The mechanism of RNA unwinding using non-equilibrium molecular dynamics simulations starting from the initial
structure as captured in 7EGQ (B-left), to the
￿
nal structure (B-right). The RNA backbone is restrained with an
md
￿
grid
,
represented in a grey mesh, while using
colvars
to pull the strands to the active sites on the protein. This is an unprecedented
simulation of the mechanism of RNA unwinding which can give signi
￿
cant insight into the RTC function.
physics algorithm for simulation of mesoscale biological structures
obtained from cryo-EM and cryo-ET (Solernou et al., 2018). FFEA
treats biomolecules as continuum visco-elastic solids subject to
thermal noise, chosen so as to satisfy the
￿
uctuation-dissipation
theorem (Oliver et al., 2013). To describe interacting proteins, FFEA
includes short-range van der Waals attractive forces and steric re-
pulsion. Therefore, FFEA uses a unique “top-down” rather than the
more conventional “bottom-up” coarse-graining strategy, by intro-
ducing nanoscale thermal
￿
uctuations into macroscopic continuum
mechanics equations.
FFEA also includes functionality to exert external forces on pro-
teins, to connect proteins together with harmonic springs and to
represent conformational changes between distinct protein confor-
mational states, for example between the pre- and post-powerstroke
states of molecular motors (Richardson et al., 2020). FFEA has been
used to successfully model diverse biological systems, including the
rotary ATPase motor (Richardson et al., 2014), axonemal (Richard-
son et al., 2020) and cytoplasmic (Hanson et al., 2021) dynein motors,
and protein antibodies subjected to external forces (van der Heijden
et al., 2020).
4.3 Multiscale biological simulations
Integrating data across diverse spatial, temporal, and functional
scales has played an important role in understanding the role of
molecular interactions in various diseases (Pak and Voth, 2018,
Tozzini, 2010, Walpole et al., 2013, Zhou, 2014). In order to bridge
information across multiple scales, a number of coarse-graining
approaches have been widely employed. These methods have been
used to study individual proteins from the SARS-CoV-2 proteome (De San-
cho et al., 2020, Garay et al., 2021), as well as more complex systems
such as the entire SARS-CoV-2 virion (Yu et al., 2021). We note that
compared to the state-of-the-art approaches, we choose a contin-
uum representation of molecules, one that is more conducible for
accessing much larger length- and time-scales using FFEA. While
￿
nite element methods are used to model biological systems (e.g.,
growth models, bone growth, etc.) (Dong and Skalak, 1992, Kenn-
away and Coen, 2019, Panagiotopoulou et al., 2017), we note that
our collective approach enables tight integration between all-atom
and continuum representations for large biomolecular complexes
such as the SARS-CoV-2 RTC.
While physics-based approaches for coarse-graining are usually
employed to overcome limitations of the space- and time-scales
accessed by traditional molecular simulations, machine learning
techniques are now being regularly employed to adaptive coarse-
grained simulations from all-atom simulations (Durumeric and
Voth, 2019, Husic et al., 2020, Noé, 2020).
4.4 AI enabled adaptive MD simulations
MD simulation provides atomistic details of molecular interactions
and ensued dynamics. It’s however computationally expensive, and
the sampling e
￿
ciency can be severely hindered, when trapped by
local minima in the free energy landscape. Our group has developed
AI-enabled adaptive MD simulations, namely DeepDriveMD (Lee
et al., 2019), which records MD propagation in latent space via
convolutional variational autoencoder (CVAE) (Bhowmik et al.,
2018), and drives adaptive sampling in the high dimensional con-
formational landscape. The AI inference constantly observes the
simulation runs, prunes stagnant ones that are trapped in local
energy minima, and spawns new ones from less sampled conforma-
tions. We have shown that this approach can accelerate sampling
of rare events (for example, in the SARS-CoV-2 Spike opening sim-
ulations (Casalino et al., 2021), protein folding (Lee et al., 2019))
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Intelligent Resolution: Integrating Cryo-EM with AI-driven Multi-resolution Simulations to Observe the SARS-CoV-2 Replication-Transcription Machinery in Action
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
by at least an order of magnitude. When integrated with special-
ized AI-hardware to accelerate the learning, it can provide nearly 4
orders of magnitude speedup (Brace et al., 2021).
4.5 Work
￿
ow infrastructure
Distributed science work
￿
ows must invariably handle data and
control
￿
ow across networks spanning administrative domains.
Challenging barriers to work
￿
ow management frameworks’ adop-
tion in multi-user HPC environments lie in deploying distributed
client/server infrastructures and establishing connectivity among
remote systems. For instance, Fireworks (Jain et al., 2015) and the
RADICAL-Pilot / Ensemble Toolkit (EnTK) (Balasubramanian et al.,
2016, Merzky et al., 2018) are three widely-used WMFs at super-
computing facilities that expose Python APIs to de
￿
ne and submit
directed acyclic graphs (DAGs) of stateful tasks to a database. These
WMFs possess various implementations of a common
pilot job
de-
sign, whereby tasks are e
￿
ciently executed on HPC resources by
a process that synchronizes state with the work
￿
ow database. Be-
cause the database is directly written by user or pilot job clients,
users of these WMF typically deploy and manage their own data-
base servers such as MongoDB servers.
2
These methods are non-
portable and depend on factors such as whether the compute facility
mandates multi-factor authentication (MFA).
In the context of REST interfaces to HPC resources, e
￿
orts such
as the
Superfacility
concept (Enders et al., 2020) envisions a future
of automated experimental and observational science work
￿
ows
linked to national computing resources through Web APIs. Imple-
mentations such as the NERSC Superfacility API
3
and the Swiss
National Supercomputing Centre’s FirecREST API (Cruz and Mar-
tinasso, 2019) expose methods to submit and monitor batch jobs,
move data between remote
￿
lesystems, and check system availabil-
ity. However, these facility services alone do not address work
￿
ow
management or high-throughput execution; instead, they provide
web-interfaced abstractions of the underlying facility, analogous to
modern cloud storage and infrastructure services.
funcX (Chard et al., 2020) is an HPC-oriented instantiation of the
function-as-a-service (FaaS) paradigm, where users invoke Python
functions in remote containers via an API web service. funcX end-
points run on the login nodes of target HPC resources where Globus
Auth is used for authentication and user-endpoint association. The
funcX model is tightly focused on Python functions and therefore
advocates decomposing work
￿
ows into functional building blocks.
Each function executes in a containerized worker process running
on one of the HPC compute nodes. This fails to support a general-
ized model of applications such as simulations with executables that
may or may not leverage containers, per-task remote data depen-
dencies, programmable error- and timeout-handling, and
￿
exible
per-task resource requirements (e.g. tasks may occupy a single core
or specify a multi-node, distributed memory job with some number
of GPU accelerators per MPI rank).
Balsam (Salim et al., 2021) provides a uni
￿
ed API for describ-
ing distributed work
￿
ows, and the user site agents manage the
full lifecycle of data and control
￿
ow. The Balsam service-oriented
architecture shifts administrative burdens away from individual
2
https://www.mongodb.com/
3
https://api.nersc.gov/api/v1.2/
researchers by routing all user, user agent, and pilot job client in-
teractions through a hosted, multi-tenant web service. As Balsam
execution sites communicate with the central service only as HTTP
clients, deployment involves a simple user-space
pip
package in-
stallation on any platform with a modern Python and outbound
internet access. Due to the ubiquity of HTTP data transport, Balsam
works “out of the box” on supercomputers spanning DOE and NSF
facilities against a cloud-hosted Balsam service.
5 INNOVATIONS REALIZED
Our innovations can be brie
￿
y summarized as follows. (1) We imple-
ment an automated work
￿
ow to seamlessly transition from AAMD
to FFEA via an AI-based approach. By optimizing the performance
of AAMD simulations (Sec. 5.1) we achieve optimal performance
on modern multi-GPU-based computing systems. (2) By clustering
the conformations from AAMD simulations using unsupervised
AI methods (Sec. 5.3), we learn the parameters for constraining
the FFEA simulations based on the implicit
￿
uctuations between
protein-protein interfaces inferred from AAMD simulations. We de-
velop a novel AI-approach, namely Graph neural operators (GNO)
(Sec. 5.3.3) to summarize time-dependent conformational changes
observed from ensemble AAMD simulations. (3) To overcome the
bottleneck of training deep learning models on large volumes of
data generated by AAMD simulations, we scaled their performance
on emerging AI-hardware such as the Cerebras CS-2 accelerator as
well as on the Summit supercomputer.
4
(4), Finally, to our knowl-
edge, this is the
￿
rst attempt at executing a single coordinated
work
￿
ow across multiple supercomputing facilities. We achieved
this through the Balsam multisite work
￿
ow manager (Sec. 5.4). The
performance gains enabled by these innovations provide a gen-
eralized, multi-scale computational toolkit for exploring dynamic
biomolecular machines.
5.1 NAMD all-atom MD simulations
5.1.1 GPU-resident NAMD 3 for GPU-accelerated MD simulations.
To achieve previously unrealized performance levels for MD simu-
lations on GPU-accelerated HPC platforms, we have developed a
major new “GPU-resident” computing approach, implemented in
NAMD 3. The GPU-resident computing mode in NAMD 3 employs
GPU acceleration for the most common molecular dynamics force
and energy calculations together with time integration and rigid
bond constraints, allowing sequences of timesteps to be simulated
entirely on the GPU without any signi
￿
cant host–GPU data trans-
fers, thereby exploiting a signi
￿
cantly higher fraction of theoretical
peak GPU performance (Phillips et al., 2020). Most recently, we have
further extended the GPU-resident capabilities of NAMD 3 to enable
strong-scaling of a simulation across multiple GPUs, with partic-
ular emphasis on hardware platforms that incorporate NVLink
connectivity among peer GPUs. NVLink facilitates direct load/store
memory access among peered GPUs with low latency and high
bandwidth data exchange performance levels far beyond what is
currently possible with a conventional HPC network interconnect,
or even host-internal PCIe transfers among GPUs.
The multi-GPU simulation algorithm generally mirrors the tech-
niques used in the conventional distributed-memory builds of NAMD,
4
https://www.olcf.ornl.gov/summit/
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
Trifan et al.
Post-RNA unwinding
RMSF (Å)
0.9
19.0
FFEA mesh
7EGQ
Pre-RNA unwinding
Figure 4: Root-mean squared
￿
uctuations (RMSF) of the SARS-CoV-2 RTC provide insights are helpful in determining the in-
trinsic concerted motions in the RTC that are implicitly encoded in the experimental data. The structure of the RTC is painted
using the RMSF determined from the ensemble AAMD simulations labeled pre-RNA-unwinding and post-RNA-unwinding
states, the cryo-EM structure (PDB id: 7EGQ; where we use the experimental B-factors to approximate the RMSF) and FFEA
simulations. While the FFEA simulations show larger
￿
uctuations at the interface of nsp10-14 as it interacts with the two nsp13
subunits, both the AAMD simulations capture larger
￿
uctuations in the two nsp13 subunits across each monomer. While this
embodies slightly di
￿
erent rearrangements between the FFEA and AAMD simulations, the relative
￿
uctuations across these
subunits are high within the cryo-EM data indicating the dynamic nature of this assembly.
Table 1: Summary of AAMD simulations.
Simulation
Non-equillibrium Sampling Methods
GFLOPs/step
Total sampling (
`
s)
Equilibration (pre-unwinding)
None
24.9
2.107
RNA unwinding
Extrabonds, Distance colvar, MDFF grid
25.9
0.004
RNA post-unwinding
Atom restraint, MDFF
11.04
1.904
RNA post-unwinding
None
11.04
3.916
except that it is designed to leverage the completely uni
￿
ed vir-
tual memory address space of the GPUs to exchange data directly
through
￿
ne-grained memory load/store operations. NVLink sup-
port for direct
￿
ne-grained memory loads/stores in concert with
atomic increment and atomic compare-swap operations permit e
￿
-
cient implementation of performance-critical multi-GPU reductions
and synchronization of GPUs within algorithm phases. Platforms
providing GPUs with a fully connected topology permit strong
scaling of molecular systems
1M atoms on up to 8 GPUs with
acceptable e
￿
ciency, limited primarily by poor scalability of the
PME full electrostatics algorithm.
While running a simulation per GPU avoids these scaling chal-
lenges, multi-GPU scaling enables simulation over longer timescales.
For systems of
1M atoms, the timescale is typically a limiting
factor to study conformational changes that a protein undergoes,
essential in understanding how a complex functions. Performance
considerations dictate that any advanced sampling method imple-
mented for GPU-resident must avoid frequent data transfers to and
from the host. Many of these features are contributed by external
methodological contributors and have not yet been implemented
to run on GPUs. As such, NAMD 2 and CPU-based resources still
play an important role in the work
￿
ow.
5.1.2 VMD full-time interactive ray tracing.
To facilitate creation of
high-
￿
delity visualizations with high interactivity, VMD has been
extended with a new real-time ray tracing (RTRT) rendering mode
that uses full-time interactive progressive re
￿
nement ray tracing
for rendering of the molecular scenes instead of OpenGL rasteriza-
tion and includes ambient occlusion lighting, shadows, high-quality
transparency, and depth of
￿
eld focal blur. The new RTRT rendering
mode provides substantially more visually informative visualiza-
tions of complex molecular scenes as compared with conventional
OpenGL rasterization, by bringing o
￿
ine-quality rendering fea-
tures to the interactive VMD display window for the
￿
rst time.
When the RTRT rendering mode is active, advanced visualization
features that were formerly only usable for o
￿
ine rendering, or
for static structures, can now be applied to simulation trajectories
while maintaining full interactivity.
5.1.3 RTC System Preparation for all-atom simulations with NAMD.
The SARS-CoV-2 RTC is a multi-subunit structure composed of sev-
eral nsps, including the RNA-dependent RNA polymerase (RdRp,
nsp12), Zinc-bound helicase (HEL, nsp13), the RNA-capping en-
zymes such as the nsp14 (N7-methyltransferase/MTase) and nsp16
(O2-MTase), the proofreading enzyme, namely nsp14, and uridylate-
speci
￿
c endoribonuclease activity (NendoU, nsp15)(Fig. 3A) (Chen
et al., 2020, Perry et al., 2021, Yan et al., 2020, 2021a). Together they
participate in the process of reproducing the viral genome (Wu
et al., 2020). This complex plays an important role in the viral life
cycle, and is therefore an important drug target (Gao et al., 2020,
Wang et al., 2020). Revealing the atomistic details of this complex
is essential for understanding how the viral RNA is processed.
We prepared the structure of the Cryo-EM dimeric form of the
complex 7EGQ.pdb (Yan et al., 2021b). Cations have functional sig-
ni
￿
cance for the RTC, therefore the linking HIS and CYS residues
were mutated to HSE/HSD and CYM respectively, to have the cor-
rect ionization state to coordinate the present 3 Mg
2
+
ions and 26
Zn
2
+
ions. A complementary CYT base was added to the end of two
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Intelligent Resolution: Integrating Cryo-EM with AI-driven Multi-resolution Simulations to Observe the SARS-CoV-2 Replication-Transcription Machinery in Action
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
incomplete RNA strands. Missing loop residues ARG ALA ARG
corresponding to residue numbers 314-316 in the nsp13 subunits
were modeled using
￿￿￿￿￿￿￿￿
tool in OpenMM (Eastman et al.,
2017). The protein subunits were capped with an NTER and CTER
patches. The
￿￿￿￿￿￿
package from VMD (Visual Molecular Dy-
namics) (Humphrey et al., 1996b) was used to assemble the system,
which was then solvated using the
SOLVATE
plugin and neutral-
ized with K
+
and Cl
ions for a concentration of 150 mM KCl using
AUTOIONIZE
. The resulting systems consist of
1.1 M atoms and
box size of
168
220
312 Å
3
. The systems were energy mini-
mized using steepest descent method for 500 steps, followed by a
3 ns equilibration protocol restraining all but the water molecules
using a constant k = 3 kcal mol
1
Å
2
. The system was further
equilibrated restraining only the protein backbone alpha carbon
atoms and nucleic backbone phosphate atoms with a constant of k
= 0.05 kcal mol
1
Å
2
with NAMD2.14 (Phillips et al., 2005) NVT
ensemble using CHARMM36 force
￿
elds for proteins and nucleic
acids (Vanommeslaeghe et al., 2010).
5.1.4 Grid-steered Molecular Dynamics Simulation of RNA Unwind-
ing.
One of the crucial functions of the RTC complex is to proofread
the RNA, which is a process that requires a major conformational
change of the protein complex and unzipping of the RNA. To mimic
this operation, we performed extensive preliminary studies using an
isolated RNA double-helical structure from the aforementioned pre-
pared structure in order to optimize the combination of grid-steered
molecular dynamics (GMD) (Wells et al., 2007) and collective vari-
ables (colvars) (Fiorin et al., 2013) and their respective optimal force
constants to mimic the corkscrew motion of the RNA unwinding. To
speed up these empirical studies, instead of the all-atom representa-
tion, the protein was represented by a molecular dynamics
￿
exible
￿
tting
md
￿
grid map exerting a repulsive
gridForce
(Trabuco et al.,
2009). A variety of
colvars
were employed to drive the extension
of the RNA structure, including
distanceZ
to pull on the strands to
the nsp14-10 active site and the exit point on the nsp13 helicase, as
proposed in previous studies (Yan et al., 2021b). We applied a rota-
tional force by using the
orientationAngle
colvar, while also pulling
on the center of mass using
distance
colvar. The
￿
nal conformation
used for the simulations are a hybrid of the initial double-helical
structure of the RNA strands, with the extended single strands
which have reached their target points within the complex (Fig. 3B).
This structure was obtained by building an attractive 3 dimensional
electron density map manually constructed from the backbone of
the initial RNA position with the
￿
nal extended strands obtained
from the non-equilibrium simulations.
The all-atom simulations to mimic this process were started from
the equilibrated structure described in section 5.1.3. We performed
an initial system minimization for 2000 steps and then simulated
the system for 0.05 ns to allow the RNA to settle into the grid. After
this initial equilibration, the process of unwinding was simulated
by pulling on the strands through this grid, which acts as a tunnel,
using the
distance colvars
, with a harmonic force. A soft restraining
force was applied on the backbone of the nucleic acids. To avoid
distortion of the double-helical shape of the RNA during pulling,
the hydrogen bonds formed by the bases were enforced by virtual
springs through the NAMD extraBonds feature, with a force of 20
kcal/mol/Å
2
.
5.2 FFEA simulations
5.2.1 Meshing and construction of the RTC dimer for FFEA.
Two
representative meshes were constructed which correspond to the
experimental data available; the cryo-EM density map from EMD-
31138, a dimer form of the biologically active RTC, and the atomistic
structure 7EGQ.pdb, which was used in the above NAMD atom-
istic molecular dynamics simulations. In the latter, the monomer
RTC complex was formed from two separate meshes, one from
the cryo-EM density map EMD-23008, corresponding to 7KRO.pdb,
and the other from 7EGQ.pdb for the nsp14-10 complex atomistic
information. These structures were used to generate 4 four inter-
acting meshes; two in each monomer (Fig. 2). Meshes were con-
structed by
￿
rst removing the dust and smoothing the mask using
ChimeraX (Pettersen et al., 2021) . The non-manifold elements
within the surface were corrected with "MeshLab" (Cignoni et al.,
2008), and course-grained with "MeshMixer" (Sommer et al., 2017).
The 3D meshes were then generated by "Gmsh" (Geuzaine and
Remacle, 2009) and made compatible for FFEA simulations with
NetGen (Schöberl, 1997).
Following dust-removal and smoothing in ChimeraX, density for
the single-stranded RNA was clearly visible in EMD-23008. How-
ever, this extremely thin structure is likely to cause instabilities in
FFEA due to the small volumetric elements required to represent it.
Therefore, the surface elements corresponding to RNA density were
deleted, and the resulting surface repaired by
￿
lling with additional
triangular elements in MeshMixer. The physical presence of the
single-stranded RNA was then represented in the FFEA simulations
using a harmonic spring.
5.2.2 Visualizing FFEA trajectories using VMD.
In order to create
informative visualizations that contain multiple simulation modal-
ities VMD was extended with new plugins to enable reading key
FFEA simulation data. The new plugins enable VMD to read FFEA
simulation and modeling data including tetrahedral mesh nodes,
edges, faces, and topology information, as well as information about
surface and interior faces and their properties, springs, and time-
varying mesh node coordinates. The plugin extensions permit VMD
to create visualizations showing the all-atom and FFEA simulations
aligned and superimposed, with physical properties associated with
the atoms, and FFEA nodes and faces available for use in assigning
surface colors for graphical representations of the simulations.
5.3 AI-enabled multi-resolution simulations
We developed a hierarchy of AI methods that would facilitate
embedding the high-dimensional (FFEA and AAMD) simulations
into a low-dimensional manifold capturing both local (e.g., protein
chain/subunit level) and global (e.g., mRTC monomer or dimer level)
￿
uctuations. This required us to carefully consider the balance of
simulation workloads and AI tools. We discuss the AI methods
￿
rst,
and then address the workload balance aspects in Sec. 5.4.
5.3.1 Capturing global conformational transitions with anharmonic
conformational analysis enabled autoencoders (ANCA-AE).
We have
previously developed a hybrid machine learning approach that com-
bines linear dimensionality reduction methods (Ramanathan et al.,
2011) with a non-linear autoencoder. By minimizing the higher
order (linear) correlations in the atomistic
￿
uctuations and then
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint
Supercomputing ’21, November 14–19, 2021, St. Louis, MO
Trifan et al.
A
B
C
RMSD (Å)
RMSD (Å)
Figure 5: AI methods learn to summarize the time-dependent conformational changes in the FFEA and AAMD simulations.
(A) The latent dimensional representation of the FFEA simulations summarized using the ANCA-AE method shows a clear
separation between low-RMSD (deeper blue shades) and high-RMSD (yellow) states. (B) t-SNE visualization of the latent man-
ifold learned using the graph neural operator (GNO) model on the entire RTC, painted by the system’s overall RMSD value.
This model was trained on the entire 7egq system, comprised of 6650 total residues, with a window size of 10 frames. To o
￿
set
the computational cost in the vast size of system, the model was trained on a small subset of the trajectory containing 2000
frames in order to investigate the method’s viability. The GNO has articulated the time-dependent conformational changes at
the shorter timescales. (C) depicts the conformational changes between a low-RMSD and high-RMSD state learned from the
GNO, capturing the relative motions of the nsp13 domains across the dimer interface.
examining non-linear correlations, we can obtain a succinct de-
scription of how global conformational changes are embodied in a
simulation. This method, titled ANCA-AE (Clyde et al., 2021), can
e
￿
ciently handle large dimensions (e.g., in the case of the mRTC,
the 6,650 C
U
-atoms lead to approximately 20,000 dimensions) and
can e
￿
ciently run on CPUs. We have shown that ANCA-AE can
identify conformational states that share structural and energetic
similarities, while characterizing transitional points in the high
dimensional landscape. As shown in Fig. 5, we observed (for both
FFEA and AAMD, although we only show results from FFEA sim-
ulations) that ANCA-AE extracts biophysically meaningful latent
coordinates. The ANCA-AE method reveals conformational transi-
tions from the FFEA trajectories, capturing the
￿
uctuations in the
interfaces between the nsp10-nsp14 complex and the rest of the
RTC subcomponents.
5.3.2 Describing localized conformational transitions with convolu-
tional variational autoencoders (CVAE).
We also leveraged a deep
learning algorithm, namely CVAE (Bhowmik et al., 2018) to de-
scribe localized conformational changes. The CVAE provides a
compact representation of the high dimensional conformational
landscape and can also capture intermediate/metastable states from
long timescale MD simulations (Bhowmik et al., 2018). It can also be
combined with outlier detection algorithms such as Density-Based
Spatial Clustering of Applications with Noise (DBSCAN) (Ester
et al., 1996) and the Local outlier factor (LOF) (Breunig et al., 2000)
method to identify time points corresponding to rare transition
events in the simulation trajectories. We note that both ANCA-AE
and CVAE can utilize outlier detection algorithms for this pur-
pose. Since we have previously discussed the CVAE model, we only
present scaling results for this model.
However, CVAE is quadratic in time and space complexity (Casalino
et al., 2021) and can be prohibitive to train for the entire mRTC. This
makes the CVAE ideal to examine protein sub-unit
￿
uctuations,
which can be signi
￿
cantly smaller and more tractable for training.
To optimize its performance, we implemented a distributed data
parallel version of CVAE on the Summit supercomputer as well as
a version within a single Cerebras CS-2 deep learning accelerator.
By keeping all compute and memory resources on-silicon, the CS-2
provides, in a single system, orders of magnitude more compute
power, on-chip memory, memory bandwidth, and communication
bandwidth than traditional small-scale processors. This makes it
particularly well-suited for accelerating ML models like the CVAE.
5.3.3 Graph neural operator (GNO) networks for characterizing
time-dependent conformational changes from AAMD simulations.
Additionally, we also implemented the graph neural operator (GNO)
as a fast surrogate model to simulate and predict the molecule
movements from the AAMD simulation. The applications of deep
learning for MD trajectory analysis, prediction and generation have
recently exploded (Hoseini et al., 2021). The GNO model (Li et al.,
2020) can solve partial di
￿
erential equations (PDEs) by generaliz-
ing
￿
nite-dimensional neural networks to the in
￿
nite-dimensional
operator-learning setting. In this work, we apply the GNO for the
￿
rst time to capture the time dependent conformational changes
governed by the system of equations in a traditional molecular
dynamics simulation. As a
￿
rst step, we show that we can rea-
sonably predict protein backbone conformations up to 5 ps (250
MD simulation timesteps of 2 fs) into the future based on a 50 ps
trajectory. This allows us to extract the time-dependent conforma-
tional changes of the molecule and analyze the connection between
AAMD and FFEA simulation. This approach is capable of achieving
an even longer prediction horizon of 500 ps on smaller test systems,
and will be described in a separate manuscript. As shown in Fig. 5B,
the ability to predict the time-dependent conformational changes in
the entire RTC at short time-scales provides a means to accelerate
sampling. The structure of the latent space following the RMSD
transitions, provides insights into the large conformational changes
captured in 5C.
.
CC-BY-NC-ND 4.0 International license
available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint
this version posted October 12, 2021.
;
https://doi.org/10.1101/2021.10.09.463779
doi:
bioRxiv preprint