www.sciencemag.org/content/355/6332/1436/suppl/DC1
Supplementary Materials for
On the origins of oxygenic photosynthesis and aerobic respiration in Cyanobacteria
Rochelle M. Soo, James Hemp, Donovan H. Parks, Woodward W. Fischer,* Philip Hugenholtz*
*Corresponding author. Email: p.hugenholtz@uq.edu.au (P.H.); wfischer@caltech.edu (W.W.F.)
Published 31 March 2017, Science 355, 1436 (2017)
DOI: 10.1126/science.aal3794
This PDF file includes:
Materials and Methods
Figs. S1 to S7
Tables S1 and S2
References
Materials and Methods
Data reporting
No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.
Datasets of publicly available sequencing reads
Paired-end read sequences from the following sequencing projects were downloaded from the NCBI Sequence Read Archive (SRA): SRR636581; SRR944699; SRR845265; ERR636395; ERR525992; ERR525373; ERR209866; ERR209865; ERR209449; ERR209447; ERR209359; ERR636350; ERR209439; ERR525787; ERR209450; ERR209669; ERR209448; ERR209516; ERR209606; ERR209863; ERR209350; ERR209864; ERR209110; ERR525875; ERR688535 and ERR209608. Paired-end reads were also downloaded from sequencing projects available through the Integrated Microbial Genomes (IMG) database (26): metagenomes 3300000574; 3300003764; 3300003765, and composite assembled genomes 2582580516 and 2582580591.
Metagenome assembly and genome binning
Reads were quality trimmed using Trimmomatic (v.0.32) (27) with a leading/trailing quality threshold of Q3 and a sliding window threshold of Q15. Trimmed reads were merged using BBMerge (v5.5) with default settings (https://sourceforge.net/projects/bbmap/). Trimmed paired-end reads were assembled using the CLC Genomics Workbench 7.0 (CLCBio, Qiagen) de novo assembly algorithm, using a k-mer size of 63 and a minimum scaffold length of 1000 bp. Sequencing reads were mapped to assembled scaffolds using BamM (v1.3.8) (http://ecogenomics.github.io/BamM/), which makes use of BWA (v0.7.12) (28) for read mapping. Population genome bins were recovered using MetaBAT (v.0.25.4) (29) with all settings (sensitive, specific, very sensitive and very specific), and the most complete bins, based on the presence/absence of 104 bacterial single copy marker genes as defined by CheckM (v.1.0.3) (30), were used for further analysis. RefineM (v.0.0.3) (https://github.com/dparks1134/RefineM) was used to improve completeness and check for contamination of the population genomes.
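The trimming criteria described above (leading/trailing Q3, sliding window Q15) can be illustrated with a minimal Python sketch. This is not Trimmomatic's code and simplifies its behaviour: the window size of 4 is an assumption (the text gives only the quality thresholds), and the function name `trim_read` is hypothetical.

```python
from statistics import mean


def trim_read(seq, quals, edge_q=3, window=4, window_q=15):
    """Return (seq, quals) after leading/trailing and sliding-window trimming.

    edge_q and window_q follow the Q3 and Q15 values in the text;
    window=4 is an assumed window size, not stated in the source.
    """
    # Leading: drop bases from the 5' end while quality is below edge_q.
    start = 0
    while start < len(quals) and quals[start] < edge_q:
        start += 1
    # Trailing: drop bases from the 3' end while quality is below edge_q.
    end = len(quals)
    while end > start and quals[end - 1] < edge_q:
        end -= 1
    seq, quals = seq[start:end], quals[start:end]
    # Sliding window: cut the read at the start of the first window whose
    # mean quality falls below window_q (a simplified cut rule).
    for i in range(max(len(quals) - window + 1, 0)):
        if mean(quals[i:i + window]) < window_q:
            return seq[:i], quals[:i]
    return seq, quals


trimmed_seq, trimmed_quals = trim_read(
    "ACGTACGTAC", [2, 30, 30, 30, 30, 30, 5, 5, 5, 2]
)
```

In this toy read the first and last bases are removed by the Q3 edge thresholds, and the read is then cut where the four-base window mean first drops below Q15.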
Concatenated protein tree
A concatenated protein tree of 120 single copy marker genes (Table S2) was constructed from 13,843 genomes obtained from NCBI RefSeq (v75, March 2016), previously assembled Melainabacteria genomes, and the population genomes extracted from the SRA and IMG in the present study. Genes were aligned with HMMER v3.1b1 (31) and a maximum-likelihood tree was inferred with FastTree (v2.1.7, WAG + GAMMA model, other parameters set to default) (32). Bootstrapping (100 times) was performed on a subset of the genomes consisting of all Cyanobacteria, Synergistetes, Chloroflexi and Armatimonadetes. Inferred trees were imported into ARB (33) for visualization and exported to Adobe Illustrator for figure production. Putative representatives of the Melainabacteria and Sericytochromatia were identified by their relative position in the tree, and corroborated where possible by 16S rRNA gene sequences present in the population genomes.
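As an illustration of how per-marker alignments can be joined into one concatenated alignment, the following Python sketch pads genomes that lack a marker with gap characters. The function name `concatenate_markers` and the input layout (a dictionary of per-marker alignments) are assumptions made for this example; the source does not describe how the concatenation was implemented.

```python
def concatenate_markers(alignments, marker_order):
    """Join per-marker alignments into one sequence per genome.

    alignments: {marker_id: {genome_id: aligned_sequence}} (assumed layout).
    marker_order: list of marker_ids fixing the concatenation order.
    """
    genomes = sorted({g for aln in alignments.values() for g in aln})
    parts = {g: [] for g in genomes}
    for marker in marker_order:
        aln = alignments[marker]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{marker}: aligned sequences differ in length")
        width = lengths.pop()
        for g in genomes:
            # A genome missing this marker contributes an all-gap block.
            parts[g].append(aln.get(g, "-" * width))
    return {g: "".join(p) for g, p in parts.items()}


toy = {
    "m1": {"A": "MK-L", "B": "MKAL"},
    "m2": {"A": "GG"},
}
supermatrix = concatenate_markers(toy, ["m1", "m2"])
```

Padding with gaps keeps every genome at the same total length, which tree-inference programs expect of a multiple sequence alignment.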
Single protein trees
Complex III and IV genes were identified in the Melainabacteria and Sericytochromatia genomes using the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4) (25), which uses Prodigal v2.50 (34) to identify protein-coding genes. The protein sequences encoded by these genes and functionally characterized representatives from UniProt were used as seeds to identify homologues in the NCBI protein database using Mingle (v0.0.7) (https://github.com/Ecogenomics/mingle). Briefly, matches with an E-value ≤ 1e-5, an amino acid identity ≥ 30%, and an alignment length ≥ 50% were considered to be homologous. Sets of homologous proteins were aligned using MAFFT (v.7.221) (35) and alignments were filtered by excluding columns with <50% representation by homologous amino acids. Phylogenetic inference was performed on the filtered alignments using FastTree v2.1.7 (32) under the WAG + GAMMA model and bootstrapped 100 times using non-parametric bootstrapping. Inferred trees were imported into ARB for visualization and exported to Adobe Illustrator for figure production.
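The homology criteria and the alignment column filter described above can be sketched in Python as follows. This is not Mingle's code. It assumes that the E-value criterion is an upper bound (at most 1e-5) and that alignment length is measured as a fraction of the query length; the `Hit` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    subject: str
    evalue: float
    identity: float      # percent amino acid identity
    aln_fraction: float  # aligned length / query length (assumed definition)


def is_homolog(hit, max_evalue=1e-5, min_identity=30.0, min_aln_fraction=0.5):
    """Apply the three thresholds given in the text to one search hit."""
    return (hit.evalue <= max_evalue
            and hit.identity >= min_identity
            and hit.aln_fraction >= min_aln_fraction)


def filter_columns(aligned, min_occupancy=0.5, gap_chars="-."):
    """Drop alignment columns in which fewer than min_occupancy of the
    sequences carry a residue (i.e. a non-gap character)."""
    rows = list(aligned.values())
    if not rows:
        return {}
    keep = [
        j for j in range(len(rows[0]))
        if sum(r[j] not in gap_chars for r in rows) / len(rows) >= min_occupancy
    ]
    return {name: "".join(seq[j] for j in keep) for name, seq in aligned.items()}


hits = [Hit("p1", 1e-20, 45.0, 0.9), Hit("p2", 1e-3, 60.0, 0.9),
        Hit("p3", 1e-10, 25.0, 0.9), Hit("p4", 1e-10, 40.0, 0.4)]
homologs = [h.subject for h in hits if is_homolog(h)]
filtered = filter_columns({"a": "MK-L", "b": "M--L", "c": "MA-L", "d": "----"})
```

In the toy alignment the third column is removed because no sequence has a residue there, while the second column is kept because exactly half of the sequences do.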
Channel prediction
Structural homology models for A-family oxygen reductases were built using SwissModel via the Swiss-PdbViewer interface. Proton channels were predicted by comparing sequence alignments with the structural models.
Fig. S1. Concatenated protein tree of the phylum Cyanobacteria. A maximum likelihood tree of the phylum Cyanobacteria based on the concatenated alignment of 120 phylogenetically conserved proteins (Table S2). Bootstrap resampling analyses (100 times) with maximum likelihood were performed with FastTree (32). Taxa in red represent Melainabacteria and Sericytochromatia genomes obtained in the present study. Black circles represent interior nodes with ≥90% bootstrap support, grey circles ≥70% bootstrap support and white circles ≥50% bootstrap support. Order and class-level affiliations of the taxa are shown to the right of the figure. Outgroups (not shown) used for the analysis belong to the phyla Synergistetes, Chloroflexi and Armatimonadetes.
[Fig. S1 image: tree tip labels; order annotations (Gastranaerophilales, Vampirovibrionales, Caenarcaniphilales, SHAS531, V201-46, Obscuribacterales, S15B-MN24, GL2-53); class annotations (Melainabacteria, Oxyphotobacteria, Sericytochromatia); scale bar 0.10; bootstrap legend 90%, 70%, 50%.]
Fig. S2. Cytochrome bc complex protein trees. Maximum-likelihood trees of PetB (cytochrome b6/b), PetC (cytochrome b6f complex iron-sulfur subunit/cytochrome c1) and PetD (cytochrome b6f complex subunit IV) proteins with 100 bootstrap resamplings. Proteins belonging to Cyanobacteria are bolded in each tree and their gene neighborhood shown nearby, with the ortholog of interest shown in color. Bootstrap support for interior nodes is indicated as per Fig. S1. Order and class-level assignments of individual cyanobacterial taxa are shown in the legend at the bottom right of the figure.
[Fig. S2 image: panels Cytochrome b6f/bc1 PetB (223AA), PetC (150AA) and PetD (160AA); tree tip labels, gene neighborhood diagrams, scale bars 0.10 and genome/order/class legend.]
Fig. S3. Structure of the proton channels of the A-family oxygen reductases in Sericytochromatia. D-channels are shown in green, and K-channels in yellow. The CBMW_12 (A) oxygen reductase has a Y at the top of the D-channel, similar to that from Thermus thermophilus. Both of its channels are completely conserved. The CBMW_12 (B) oxygen reductase has a glutamic acid at the top of the D-channel. Both D- and K-channels are highly modified (residues in red). The specific mutations imply that while the active site receives protons from the cytoplasm, it is decoupled from proton pumping, and thus maximally conserves 1 H+ per e-.
Fig. S4. Alternative complex III protein trees. Maximum-likelihood trees of ActA (alternative complex III molybdopterin oxidoreductase pentaheme c subunit), ActC (alternative complex III, protein C) and ActF (alternative complex III, protein F) proteins with 100 bootstrap resamplings. Proteins belonging to Cyanobacteria are bolded in each tree and their gene neighborhood shown nearby, with the ortholog of interest shown in color. Bootstrap support for interior nodes is indicated as per Fig. S1. Order and class-level assignments of individual cyanobacterial taxa are shown in the legend at the bottom right of the figure.
[Fig. S4 image: panels Alternative Complex III ActA (213AA), ActF (394AA) and ActC (460AA); tree tip labels, gene neighborhood diagrams, scale bars 0.10 and genome/order/class legend.]
Fig. S5. HCO A-family oxygen reductase protein trees. Maximum-likelihood trees of CoxA (cytochrome c oxidase subunit I) and CoxB (cytochrome c oxidase subunit II) proteins with 100 bootstrap resamplings. Proteins belonging to Cyanobacteria are bolded in each tree and their gene neighborhood shown nearby, with the ortholog of interest shown in color. Bootstrap support for interior nodes is indicated as per Fig. S1. Order and class-level assignments of individual cyanobacterial taxa are shown in the legend at the top right of the figure.
[Fig. S5 image: panels aa3 oxidase CoxA (529AA) and CoxB (315AA); tree tip labels, gene neighborhood diagrams, scale bars 0.10 and genome/order/class legend.]
Fig. S6. HCO C-family oxygen reductase protein trees. Maximum-likelihood trees of CcoN (cytochrome c oxidase, cbb3-type, subunit I) and CcoO (cytochrome c oxidase, cbb3-type, subunit II) proteins with 100 bootstrap resamplings. Proteins belonging to Cyanobacteria are bolded in each tree and their gene neighborhood shown nearby, with the ortholog of interest shown in color. Bootstrap support for interior nodes is indicated as per Fig. S1. Order and class-level assignments of individual cyanobacterial taxa are shown in the legend at the bottom left of the figure.
[Fig. S6 image: panels cbb3 oxidase CcoN (478AA) and CcoO (204AA); tree tip labels, gene neighborhood diagrams, scale bars 0.10 and genome/order/class legend.]
Fig. S7. Cytochrome bd oxidase protein tree. Maximum-likelihood tree of CydA (cytochrome bd oxidase subunit I) and CydB (cytochrome bd oxidase subunit II) proteins with 100 bootstrap resamplings. Proteins belonging to Cyanobacteria are bolded in each tree and their gene neighborhood shown nearby, with the ortholog of interest shown in color. Bootstrap support for interior nodes is indicated as per Fig. S1. Order and class-level assignments of individual cyanobacterial taxa are shown in the legend at the top right of the figure.
Table S1. Summary statistics of the Melainabacteria and Sericytochromatia genomes used in the study
Genome | Class¹ | Order² | Genome size (Mbp) | Number of scaffolds | Number of genes | 16S³ | CP⁴ (%) | CT⁵ (%) | Bioproject number/SRA metagenome acc. | Ref⁶
CBMW_12 | Seri | S15B-MN24 | 3.9 | 59 | 3632 | F | 94.8 | 0 | PRJNA348150 | PS
RAAC_196 | Seri | S15B-MN24 | 3.3 | 270 | 3373 | N | 91.8 | 1.7 | PRJNA348151 | PS
LSPB_72 | Seri | GL2-53 | 4.5 | 563 | 4772 | P | 80.3 | 1.7 | PRJNA348152 | PS
MEL_A1 | Mela | Gastranaero | 1.9 | 1 | 1879 | F | 100 | 0 | PRJNA321218 | 36
MEL_B1 | Mela | Gastranaero | 2.3 | 21 | 2269 | F | 98.3 | 0 | **MEL.B1 | 36
MEL_B2 | Mela | Gastranaero | 2.3 | 26 | 2262 | F | 100 | 0 | **MEL.B2 | 36
MEL_C1 | Mela | Gastranaero | 2.1 | 4 | 2162 | F | 100 | 0 | **MEL.C1 | 36
ACD_20 | Mela | Gastranaero | 2.7 | 104 | 2565 | N | 100 | 3.5 | PRJNA114691 | 36
ZAG_111 | Mela | Gastranaero | 2.2 | 65 | 2313 | F | 88.9 | 3.9 | PRJNA347484 | 1
ZAG_221 | Mela | Gastranaero | 1.8 | 14 | 1838 | N | 89.5 | 0.9 | PRJNA347484 | 1
MH_37 | Mela | Gastranaero | 2.1 | 157 | 2402 | P | 88.9 | 0 | PRJNA347487 | 1
ZAG_1 | Mela | Gastranaero | 2 | 322 | 2194 | N | 82.9 | 0.9 | PRJNA347484 | 1
HUM_1 | Mela | Gastranaero | 1.9 | 8 | 1845 | N | 100 | 0 | PRJNA348149 | PS
HUM_2 | Mela | Gastranaero | 1.9 | 49 | 1924 | N | 100 | 0 | PRJNA348149 | PS
HUM_3 | Mela | Gastranaero | 2.2 | 48 | 2175 | F | 100 | 0 | PRJNA348149 | PS
HUM_4 | Mela | Gastranaero | 2.3 | 39 | 2323 | N | 100 | 0 | PRJNA348149 | PS
HUM_5 | Mela | Gastranaero | 2.3 | 34 | 2313 | F | 100 | 0 | PRJNA348149 | PS
HUM_6 | Mela | Gastranaero | 1.8 | 52 | 1861 | N | 100 | 0 | PRJNA348149 | PS
HUM_7 | Mela | Gastranaero | 1.9 | 48 | 1955 | N | 99.1 | 0 | PRJNA348149 | PS
HUM_8 | Mela | Gastranaero | 2.1 | 47 | 2191 | N | 99.1 | 2.5 | PRJNA348149 | PS
HUM_9 | Mela | Gastranaero | 1.9 | 20 | 1692 | N | 98.3 | 1.7 | PRJNA348149 | PS
HUM_10 | Mela | Gastranaero | 2.5 | 85 | 2545 | F | 98.3 | 0 | PRJNA348149 | PS
HUM_11 | Mela | Gastranaero | 2.3 | 113 | 2310 | N | 97.4 | 0 | PRJNA348149 | PS
HUM_12 | Mela | Gastranaero | 1.9 | 56 | 1880 | F | 97.4 | 0 | PRJNA348149 | PS
HUM_13 | Mela | Gastranaero | 2.1 | 83 | 2141 | N | 96.6 | 1.2 | PRJNA348149 | PS
HUM_14 | Mela | Gastranaero | 1.8 | 60 | 1783 | N | 96.6 | 0.9 | PRJNA348149 | PS
HUM_15 | Mela | Gastranaero | 2.1 | 49 | 2146 | P | 94.8 | 0 | PRJNA348149 | PS
HUM_16 | Mela | Gastranaero | 2 | 44 | 2030 | N | 94.0 | 0.9 | PRJNA348149 | PS
HUM_17 | Mela | Gastranaero | 2.3 | 259 | 2254 | N | 93.1 | 4.0 | PRJNA348149 | PS
HUM_18 | Mela | Gastranaero | 2.7 | 116 | 2891 | N | 93.0 | 1.7 | PRJNA348149 | PS
HUM_19 | Mela | Gastranaero | 2.1 | 151 | 2140 | N | 91.0 | 3.6 | PRJNA348149 | PS
HUM_20 | Mela | Gastranaero | 1.8 | 237 | 1886 | N | 89.6 | 0 | PRJNA348149 | PS
HUM_21 | Mela | Gastranaero | 1.9 | 225 | 1975 | N | 85.2 | 0.3 | PRJNA348149 | PS
HUM_22 | Mela | Gastranaero | 1.9 | 97 | 1844 | N | 87.9 | 0 | PRJNA348149 | PS
HUM_23 | Mela | Gastranaero | 1.9 | 26 | 1953 | N | 81.0 | 0 | PRJNA348149 | PS
CAG_196 | Mela | Gastranaero | 2.1 | 53 | 2148 | F | 100 | 0 | PRJNA221948 | PS
CAG_306 | Mela | Gastranaero | 2.2 | 73 | 2197 | F | 96.6 | 0 | PRJNA222071 | PS
CAG_439 | Mela | Gastranaero | 2.2 | 90 | 2367 | N | 100 | 0 | PRJNA222187 | PS
CAG_484 | Mela | Gastranaero | 2.2 | 62 | 2263 | P | 99.1 | 0 | PRJNA221894 | PS
CAG_715 | Mela | Gastranaero | 1.9 | 40 | 1902 | N | 100 | 0 | PRJNA221754 | PS
CAG_729 | Mela | Gastranaero | 2 | 48 | 2063 | N | 98.3 | 0 | PRJNA222200 | PS
CAG_768 | Mela | Gastranaero | 2 | 18 | 2034 | N | 100 | 0 | PRJNA222178 | PS
CAG_815 | Mela | Gastranaero | 1.8 | 32 | 1861 | N | 100 | 0 | PRJNA222202 | PS
CAG_813 | Mela | Gastranaero | 1.9 | 76 | 1974 | N | 98.3 | 0 | PRJNA222206 | PS
CAG_967 | Mela | Gastranaero | 2 | 42 | 2081 | N | 98.3 | 0 | PRJNA221908 | PS
EBPR_351 | Mela | Obscuribact | 5.5 | 8 | 4655 | F | 97.4 | 2.3 | PRJNA347481 | 1
WWTP_8 | Mela | Obscuribact | 3.4 | 5 | 2855 | F | 87.9 | 0.9 | *3300003765 | PS
WWTP_15 | Mela | Obscuribact | 4.8 | 26 | 4042 | F | 98.3 | 0.9 | *3300003764 | PS
SSGW_16 | Mela | V201-46 | 2.3 | 46 | 2062 | F | 100 | 0 | *3300000574 | PS
LMEP_10873 | Mela | Caenarcani | 2.2 | 154 | 2116 | F | 87.8 | 3.5 | PRJNA337808 | PS
UASB_351 | Mela | Caenarcani | 1.8 | 67 | 1913 | P | 84.6 | 0.9 | PRJNA347483 | 1
LMEP_6097 | Mela | SHAS531 | 1.9 | 25 | 1819 | F | 93.9 | 0 | PRJNA337808 | PS
Vampirovibrio chlorellavorus | Mela | Vampirovibrio | 3 | 28 | 2844 | F | 100 | 1.7 | PRJNA278896 | 37
¹ Class = Seri (Sericytochromatia), Mela (Melainabacteria)
² Order = Gastranaero (Gastranaerophilales), Obscuribact (Obscuribacterales), Caenarcani (Caenarcaniphilales), Vampirovibrio (Vampirovibrionales)
³ 16S = P (Partial) <1300 bp, F (Full) >1300 bp, N (None)
⁴ CP = Estimated completeness based on the presence of 104 single copy genes (30)
⁵ CT = Estimated contamination based on the presence of more than one single copy gene (30)
⁶ Ref = PS (Present study)
* IMG accession number/genome not deposited in Genbank
** ggkBase accession name/genome not deposited in Genbank
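The CP, CT and 16S footnotes above can be read as simple rules, sketched below in Python. This is a deliberately simplified illustration and not CheckM's algorithm, which is more involved than a plain marker count. The function names are hypothetical, and treating a 16S length of exactly 1300 bp as partial is an assumption, since the footnote leaves that boundary undefined.

```python
def completeness_contamination(marker_counts, n_markers=104):
    """Simplified CP/CT estimate from single copy marker gene counts.

    marker_counts: {marker_id: copies found in the genome bin}.
    Returns (completeness %, contamination %), where completeness counts
    markers present at least once and contamination counts extra copies.
    """
    present = sum(1 for c in marker_counts.values() if c >= 1)
    extra = sum(c - 1 for c in marker_counts.values() if c > 1)
    return 100.0 * present / n_markers, 100.0 * extra / n_markers


def classify_16s(length_bp):
    """Return N (none), P (partial) or F (full, >1300 bp) for a 16S gene."""
    if not length_bp:
        return "N"
    # Exactly 1300 bp is not covered by the footnote; assumed partial here.
    return "F" if length_bp > 1300 else "P"


cp, ct = completeness_contamination({"a": 1, "b": 2, "c": 0, "d": 1}, n_markers=4)
```

With four toy markers, three are present and one has a single extra copy, so the sketch reports 75% completeness and 25% contamination.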
Table S2. A list of the 120 bacterial marker genes used for the concatenated protein tree inference
Marker ID | Name | Description | Length (aa)
PF02576.12 | DUF150 | Uncharacterised BCR, YhbC family COG0779 | 141
PF01025.14 | GrpE | GrpE | 166
PF03726.9 | PNPase | Polyribonucleotide nucleotidyltransferase, RNA binding domain | 83
PF00466.15 | Ribosomal_L10 | Ribosomal protein L10 | 100
PF00410.14 | Ribosomal_S8 | Ribosomal protein S8 | 129
PF00380.14 | Ribosomal_S9 | Ribosomal protein S9/S16 | 121
TIGR00006 | TIGR00006 | 16S rRNA (cytosine(1402)-N(4))-methyltransferase | 310
TIGR00019 | prfA | peptide chain release factor 1 | 361
TIGR00020 | prfB | peptide chain release factor 2 | 365
TIGR00029 | S20 | ribosomal protein bS20 | 87
TIGR00043 | TIGR00043 | rRNA maturation RNase YbeY | 111
TIGR00054 | TIGR00054 | RIP metalloprotease RseP | 421
TIGR00059 | L17 | ribosomal protein bL17 | 112
TIGR00061 | L21 | ribosomal protein bL21 | 101
TIGR00064 | ftsY | signal recognition particle-docking protein FtsY | 279
TIGR00065 | ftsZ | cell division protein FtsZ | 353
TIGR00082 | rbfA | ribosome-binding factor A | 115
TIGR00083 | ribF | riboflavin biosynthesis protein RibF | 290
TIGR00084 | ruvA | Holliday junction DNA helicase RuvA | 192
TIGR00086 | smpB | SsrA-binding protein | 144
TIGR00088 | trmD | tRNA (guanine(37)-N(1))-methyltransferase | 233
TIGR00090 | rsfS_iojap_ybeB | ribosome silencing factor | 99
TIGR00092 | TIGR00092 | GTP-binding protein YchF | 368
TIGR00095 | TIGR00095 | 16S rRNA (guanine(966)-N(2))-methyltransferase RsmD | 194
TIGR00115 | tig | trigger factor | 410
TIGR00116 | tsf | translation elongation factor Ts | 293
TIGR00138 | rsmG_gidB | 16S rRNA (guanine(527)-N(7))-methyltransferase RsmG | 183
TIGR00158 | L9 | ribosomal protein bL9 | 148
TIGR00166 | S6 | ribosomal protein bS6 | 95
TIGR00168 | infC | translation initiation factor IF-3 | 165
TIGR00186 | rRNA_methyl_3 | RNA methyltransferase, TrmH family, group 3 | 240
TIGR00194 | uvrC | excinuclease ABC subunit C | 574
TIGR00250 | RNAse_H_YqgF | putative transcription antitermination factor YqgF | 130
TIGR00337 | PyrG | CTP synthase | 526
TIGR00344 | alaS | alanine--tRNA ligase | 847
TIGR00362 | DnaA | chromosomal replication initiator protein DnaA | 437
TIGR00382 | clpX | ATP-dependent Clp protease, ATP-binding subunit ClpX | 414
TIGR00392 | ileS | isoleucine--tRNA ligase | 861
TIGR00396 | leuS_bact | leucine--tRNA ligase | 843
TIGR00398 | metG | methionine--tRNA ligase | 530
TIGR00414 | serS | serine--tRNA ligase | 418
TIGR00416 | sms | DNA repair protein RadA | 454
TIGR00420 | trmU | tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase | 351
TIGR00431 | TruB | tRNA pseudouridine(55) synthase | 210
TIGR00435 | cysS | cysteine--tRNA ligase | 466
TIGR00436 | era | GTP-binding protein Era | 270
TIGR00442 | hisS | histidine--tRNA ligase | 406
TIGR00445 | mraY | phospho-N-acetylmuramoyl-pentapeptide-transferase | 321
TIGR00456 | argS | arginine--tRNA ligase | 569
TIGR00459 | aspS_bact | aspartate--tRNA ligase | 586
TIGR00460 | fmt | methionyl-tRNA formyltransferase | 315
TIGR00468 | pheS | phenylalanine--tRNA ligase, alpha subunit | 324
TIGR00472 | pheT_bact | phenylalanine--tRNA ligase, beta subunit | 798
TIGR00487 | IF-2 | translation initiation factor IF-2 | 587
TIGR00496 | frr | ribosome recycling factor | 176
TIGR00539 | hemN_rel | putative oxygen-independent coproporphyrinogen III oxidase | 361
TIGR00580 | mfd | transcription-repair coupling factor | 923
TIGR00593 | pola | DNA polymerase I | 890
TIGR00615 | recR | recombination protein RecR | 196
TIGR00631 | uvrb | excinuclease ABC subunit B | 658
TIGR00634 | recN | DNA repair protein RecN | 563
TIGR00635 | ruvB | Holliday junction DNA helicase RuvB | 305
TIGR00643 | recG | ATP-dependent DNA helicase RecG | 629
TIGR00663 | dnan | DNA polymerase III, beta subunit | 367
TIGR00717 | rpsA | ribosomal protein bS1 | 516
TIGR00755 | ksgA | ribosomal RNA small subunit methyltransferase A | 256
TIGR00810 | secG | preprotein translocase, SecG subunit | 73
TIGR00922 | nusG | transcription termination/antitermination factor NusG | 172