Articles
https://doi.org/10.1038/s41564-021-01039-y
¹Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA. ²Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA. ³Department of Earth Sciences, University of Southern California, Los Angeles, CA, USA. ⁴Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA. ✉e-mail: wu.fa.bai@gmail.com; vorphan@gps.caltech.edu
To chronicle the emergence of evolutionary innovation is a long-standing pursuit in biology. Owing to the scant record of reliable microscale fossils, resolving evolutionary history at the cellular scale relies primarily on molecular comparisons across present-day life, provided that phylogenetic relatives can be well delineated. Culture-independent metagenomics has substantially expanded our access to the Earth’s diverse biomes¹, including lineages carrying genetic imprints of critical evolutionary events through deep time. The Heimdallarchaeota, previously referred to as the ancient archaea group (AAG)², are one such group and the closest known relative of eukaryotes as suggested by phylogenomics³⁻⁵. Heimdallarchaeotes and their related lineages, collectively called the Asgard archaea, contain a sizeable repertoire of eukaryotic signature proteins (ESPs)³,⁶,⁷. However, the genetic make-up of Heimdallarchaeotes has so far only been inferred from a few metagenome-assembled genomes (MAGs), which are fragmented and suffer from uncertainty in their completeness and accuracy³,⁷⁻¹². Mobile (genetic) elements, including transposons, viruses and plasmids, which are known to play dominant roles in evolution¹³, are frequently misassembled, omitted or misassigned during MAG assembly and binning¹⁴. These drawbacks propagate into uncertainties in the resolution of archaeal lineages related to eukaryotes and can obscure the drivers of evolutionary crosstalk and divergence between eukaryotes and their prokaryotic relatives.
Results
Circular Heimdallarchaeota genomes. Recovering contiguous genomes from environmental samples is notoriously challenging due to their enormous biodiversity and strain-level heterogeneity, while most known lineages have been hard to isolate due to their unresolved metabolism and/or poorly understood partner-dependent growth. We overcame these limitations by combining cultivation methods with molecular community profiling to progressively dissect environmental microbial enrichment cultures where a clonal expansion of our species of interest was accompanied by a reduction in diversity (Extended Data Fig. 1 and Methods). Using anaerobic cultivation methods, we enriched a member of the Heimdallarchaeota AAG clade from a barite-rich rock retrieved in 2017 from the Auka hydrothermal vent field (23° 57′ N, 108° 51′ W) located in the southern Pescadero Basin near the southern tip of the Gulf of California at a water depth of 3,674 m (ref. 15). While initially below detection, this rock-associated AAG phylotype emerged at 1–4% of the 16S ribosomal RNA gene relative abundance in 3 lactate-supplemented, anaerobic enrichment cultures incubated at 40 °C after 7 months (Extended Data Fig. 1, Supplementary Tables 1–3 and Supplementary Note 1). In an independent set of enrichments inoculated with sediments collected from the Auka site in 2018 (23° 53′ N, 108° 48′ W), alkane-supplemented anaerobic incubations at 37 °C additionally yielded a second AAG phylotype that increased in 16S rRNA gene relative abundance from 0.03 to 4–7% after 9 months (Supplementary Tables 4 and 5 and Supplementary Note 1).
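As a rough worked example of the enrichment reported for the second AAG phylotype, the fold change in 16S rRNA gene relative abundance can be computed from the values stated above (0.03% initially, 4–7% after 9 months). This is only an illustrative sketch: the function name `fold_enrichment` is hypothetical, and a change in relative abundance does not by itself measure absolute growth, because it also depends on how the rest of the community changes.

```python
def fold_enrichment(initial_pct: float, final_pct: float) -> float:
    """Return the fold change in relative abundance (final / initial).

    Both arguments are percentages of the 16S rRNA gene pool.
    """
    if initial_pct <= 0:
        # A phylotype that starts below detection has no defined fold change.
        raise ValueError("initial abundance must be positive")
    return final_pct / initial_pct


# Values for the sediment-derived phylotype, taken from the text.
low = fold_enrichment(0.03, 4.0)
high = fold_enrichment(0.03, 7.0)
print(f"Fold enrichment after 9 months: about {low:.0f}x to {high:.0f}x")
```

The same calculation cannot be applied to the rock-associated phylotype, which was initially below detection, so the function rejects a non-positive starting value instead of returning a misleading number.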
De novo assembly¹⁶⁻¹⁸ of Nanopore long-read and Illumina paired-end sequencing of genomic DNA recovered from these enrichments (Supplementary Table 6) resulted in complete circularized genomes of the two AAG species from the barite and sediment enrichment cultures, with genome sizes of 3.32 and 3.08 million base pairs (Mbp), respectively. The two circular AAG genomes showed 82% alignment fraction, 88% average nucleotide identity (ANI), 90% amino acid identity (AAI) and 97.9% 16S rRNA identity (Supplementary Table 7), which demarcate a clear species boundary¹⁹ within the same genus²⁰. Thus, we propose the species names Candidatus Heimdallarchaeum endolithica PR6 (endo- (Greek), within; lithos (Greek), rock) and Candidatus Heimdallarchaeum aukensis PM71 (Auka, the local vent field) denoting their environmental origins (Fig. 1a).

Unique mobile elements and scalable gene flow at the prokaryote–eukaryote boundary revealed by circularized Asgard archaea genomes

Fabai Wu¹,²✉, Daan R. Speth¹,², Alon Philosof¹, Antoine Crémière¹, Aditi Narayanan², Roman A. Barco³, Stephanie A. Connon¹, Jan P. Amend³,⁴, Igor A. Antoshechkin² and Victoria J. Orphan¹,²✉

Eukaryotic genomes are known to have garnered innovations from both archaeal and bacterial domains but the sequence of events that led to the complex gene repertoire of eukaryotes is largely unresolved. Here, through the enrichment of hydrothermal vent microorganisms, we recovered two circularized genomes of Heimdallarchaeum species that belong to an Asgard archaea clade phylogenetically closest to eukaryotes. These genomes reveal diverse mobile elements, including an integrative viral genome that bidirectionally replicates in a circular form and aloposons, transposons that encode the 5,000 amino acid-sized proteins Otus and Ephialtes. Heimdallarchaeal mobile elements have garnered various genes from bacteria and bacteriophages, likely playing a role in shuffling functions across domains. The numbers of archaea- and bacteria-related genes follow strikingly different scaling laws in Asgard archaea, exhibiting a genome size-dependent ratio and a functional division resembling the bacteria- and archaea-derived gene repertoire across eukaryotes. Bacterial gene import has thus likely been a continuous process unaltered by eukaryogenesis and scaled up through genome expansion. Our data further highlight the importance of viewing eukaryogenesis in a pan-Asgard context, which led to the proposal of a conceptual framework, that is, the Heimdall nucleation–decentralized innovation–hierarchical import model that accounts for the emergence of eukaryotic complexity.

Nature Microbiology | VOL 7 | February 2022 | 200–212 | www.nature.com/naturemicrobiology
Taxonomy and metabolism. The taxonomy of Asgard archaea is yet to reach consensus. The initial Heimdallarchaeota³, despite remaining monophyletic in all phylogenomic analyses, has been proposed either to be split into four phyla (Heimdall-, Gerd-, Kari- and Hodarchaeota)⁷ or to be grouped under a single order named the Heimdallarchaeia²¹. In this study, we collectively refer to them as ‘the Heimdall group’. Phylogenomic analyses based on 76 concatenated ribosomal proteins show that the Heimdallarchaeum spp. constitute a deeper-branching clade related to the previously described MAG AB_125 (ref. 3), well placed under ‘Heimdall’ in all proposed classification strategies (Fig. 1b and Extended Data Fig. 2). We also identified a fragmented MAG, B53_G16 (ref. 22; 299 contigs, 1.67 Mbp, approximately 50% complete), from the Guaymas Basin, formerly assigned to the Pacearchaeota, which we now designate as a strain of Ca. H. endolithica, with an average ANI of 97.5% compared with our PR6 strain.
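The two ANI comparisons reported in the text (88% between the PR6 and PM71 genomes, and 97.5% between PR6 and MAG B53_G16) can be put side by side in a small sketch. The 95% cutoff used here is an assumption: it is a widely used rule of thumb for prokaryotic species demarcation, not a value quoted from the article or from its refs. 19 and 20, and the names `SPECIES_ANI_CUTOFF` and `same_species` are hypothetical.

```python
# Assumed rule-of-thumb species cutoff; not taken from the article itself.
SPECIES_ANI_CUTOFF = 95.0


def same_species(ani_pct: float, cutoff: float = SPECIES_ANI_CUTOFF) -> bool:
    """Return True if an ANI value (in percent) meets the species cutoff."""
    return ani_pct >= cutoff


# ANI values as reported in the text.
pairs = {
    "PR6 vs PM71": 88.0,
    "PR6 vs B53_G16": 97.5,
}

for name, ani in pairs.items():
    verdict = "same species" if same_species(ani) else "different species"
    print(f"{name}: ANI {ani}% -> {verdict}")
```

Under this assumed cutoff the sketch reproduces the assignments made in the text: PR6 and PM71 fall into separate species, while B53_G16 groups with PR6 as a strain of the same species. ANI alone says nothing about genus membership, which the text supports with the AAI and 16S rRNA identity values.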
Ca. Heimdallarchaeum spp. are predicted to garner energy by anaerobically oxidizing organic substrates via processes involving a partial tricarboxylic acid (TCA) cycle and, given the absence of discernible terminal electron-accepting pathways, dissipating electrons via H₂ production (Extended Data Fig. 3a). They each encode one membrane-bound hydrogenase (MBH) complex and two cytosolic sulfhydrogenase complexes (SHYI and SHYII) (Fig. 1c). Hydrogen has been hypothesized to act as a syntrophic intermediate bridging archaea and bacteria before the engulfment of the mitochondrial ancestor by an (Asgard) archaeal ancestor of eukaryotes⁴,²³⁻²⁵. Indeed, in the recent description of Ca. Prometheoarchaeum syntrophicum, MBH associated with unusual membrane extensions was hypothesized to facilitate cell–cell contact and hydrogen exchange with syntrophic partner bacteria²³. Following from this concept, we postulate that cytosolic hydrogen generation by SHY, as found in the Ca. Heimdallarchaeum spp., could confer a selective advantage for a hydrogen-dependent endosymbiotic strategy (Fig. 1c).
Eukaryotic signatures. One of the many challenges of resolving the relationship between archaea and eukaryotes is the curation of representative, high-quality genomes across lineages at their interface. To this end, we verified the complete marker gene
[Fig. 1 graphic, panels a–d. Panel a: enrichment workflow (hydrothermal vent barite, crushed; anaerobic culture enrichment; mixed transfer; planktonic transfer). Panel b: phylogeny of the Heimdall group, including Heimdallarchaeota AB_125, Gerdarchaeota, LC-2/Kariarchaeota, LC-3/Hodarchaeota, Ca. H. endolithica PR6 and Ca. H. aukensis PM71. Panel c: SHYI, SHYII and MBH operons and their hypothetical roles in syntrophy and endosymbiosis. Panel d: Asgard archaea phylogeny with Eukaryota and the TACK superphylum, followed by columns for size (Mbp), no. contigs, clade marker and Asgard marker percentage coverage and percentage redundancy, TCA cycle enzymes, eukaryotic signature proteins (VPS4, VPS22, VPS25/SNF8, actin families, profilin, gelsolin, OST3/OST6, STT3, ribophorin, LC7/dynein, small GTPases) and ester-linked lipid synthesis (GlpQ, PlsY, PlsC, GlpK); gene copy number colour-coded as 0, 1, 2, 3+.]
Fig. 1 | Complete genomes of Ca. Heimdallarchaeum spp. provide insights for eukaryogenesis. a, Illustration depicting the enrichment procedure of a microbial community associated with a barite-rich rock no. NA091-45r retrieved from the southern Pescadero Basin Auka hydrothermal vent field at a water depth of 3,700 m. Successive transfers of rock and media (mixed) retained the Ca. H. endolithica while lactate-supplemented enrichment media alone (planktonic) did not. A similar strategy was used to enrich for Ca. H. aukensis from the nearby sediment, substituting alkanes for lactate. b, Maximum-likelihood phylogeny of 57 Heimdall group Asgard archaea based on 76 concatenated archaeal marker genes. The two circular genomes of Ca. Heimdallarchaeum spp. are highlighted in purple. AB_125 in bold is a MAG initially described that represents the clade. c, A schematic illustration depicting cytoplasmic SHY and MBH operons encoded by Ca. Heimdallarchaeum spp. (top) and their hypothetical roles in hydrogen-based syntrophy during eukaryogenesis (bottom). For SHY operons, the four required subunits are followed by a maturation protease. For MBH operon, the electron transport genes are in blue and the maturation factors in purple. The rectangle depicts an ancient archaeon related to the Ca. Heimdallarchaeum; the kidney shapes depict ancient bacteria that may have formed syntrophic relations with the archaeon extracellularly or intracellularly and ultimately evolved into mitochondria. d, Maximum-likelihood phylogeny of Asgard archaea representatives based on a concatenation of 56 archaea–eukaryote markers from 40 genomes showing the relationship with eukaryotes followed by select genome characteristics, marker gene coverage and the presence/absence of genes encoding TCA cycle enzymes, eukaryotic signature proteins and ester-linked lipid synthesis. The genomes constructed in this study are coloured purple, with the circularized genomes indicated in bold italic. Presence/absence and gene copy number are colour-coded. α-KG, α-ketoglutarate; NA, not applicable; OAA, oxaloacetate. For b and d, a list of genomes and markers can be found in Supplementary Tables 8, 16 and 17.