CABO-16S—a Combined Archaea, Bacteria, Organelle 16S rRNA database framework for amplicon analysis of prokaryotes and eukaryotes in environmental samples
Creators
Abstract
Identification of both prokaryotic and eukaryotic microorganisms in environmental samples is currently challenged by the need for additional sequencing to obtain separate 16S and 18S ribosomal RNA (rRNA) amplicons or the constraints imposed by "universal" primers. Organellar 16S rRNA sequences are amplified and sequenced along with prokaryote 16S rRNA and provide an alternative method to identify eukaryotic microorganisms. CABO-16S combines bacterial and archaeal sequences from the SILVA database with 16S rRNA sequences of plastids and other organelles from the PR2 database to enable identification of all 16S rRNA sequences. Comparison of CABO-16S with SILVA 138.2 results in equivalent taxonomic classification of mock communities and increased classification of diverse environmental samples. In particular, identification of phototrophic eukaryotes in shallow seagrass environments, marine waters, and lake waters was increased. The CABO-16S framework allows users to add custom sequences for further classification of underrepresented clades and can be easily updated with future releases of reference databases. Addition of sequences obtained from Sanger sequencing of methane seep sediments and curated sequences of the polyphyletic SEEP-SRB1 clade resulted in differentiation of syntrophic and non-syntrophic SEEP-SRB1 in hydrothermal vent sediments. CABO-16S highlights the benefit of combining and amending existing training sets when studying microorganisms in diverse environments.
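The combining step described above (prokaryote references, organelle references, and optional custom sequences merged into one training set) can be illustrated with a minimal sketch. This is not the authors' script; the function names, the toy headers, and the rule that the first record seen for a header is kept are all assumptions made for illustration. The actual CABO-16S scripts are in the Figshare and GitHub repositories listed under Data Availability.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def combine_references(sources: Iterable[str]) -> Dict[str, str]:
    """Merge FASTA texts in order (assumed rule: first record per header wins)."""
    combined: Dict[str, str] = {}
    for text in sources:
        for header, seq in parse_fasta(text):
            combined.setdefault(header, seq.upper())
    return combined


def to_fasta(records: Dict[str, str]) -> str:
    """Serialize records back to FASTA text, one sequence per line."""
    return "".join(f">{h}\n{s}\n" for h, s in records.items())


# Hypothetical toy inputs; these are not real SILVA or PR2 entries.
prokaryote_set = ">Bacteria;ToyPhylum;seq1\nACGTACGT\n"
organelle_set = ">Eukaryota:plas;ToyAlga;seq2\nGGCCTTAA\n"
custom_set = (
    ">Bacteria;ToyPhylum;seq1\nTTTTTTTT\n"  # duplicate header, ignored
    ">Bacteria;ToyClade;seq3\nacgu\n"       # new custom sequence, kept
)

combined = combine_references([prokaryote_set, organelle_set, custom_set])
print(to_fasta(combined), end="")
```

Listing the sources in priority order keeps the curated database records when a custom file reuses an existing header; a real workflow would also need to reconcile the taxonomy strings of the different databases, which this sketch does not attempt.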
Copyright and License
Acknowledgement
The authors would like to thank all the members of the Orphan lab for useful discussions and Dmitri Bilyk for logo design. We acknowledge members of the Agouron International Geobiology Course from 2021 and 2022 and Alex Sessions in particular for encouraging work at Mono Lake, which initiated some of this research. We would like to thank the Agouron Institute and the Simons Foundation for funding the International Geobiology Course. Finally, we would like to thank the SILVA and PR2 teams for their work creating and updating those databases.
Funding
This work is supported by the National Science Foundation [2126631] to D.U. as part of the Ocean Sciences Postdoctoral Research Fellowship; the Simons Foundation [602126] to E.E. as part of the Postdoctoral Fellowships in Marine Microbial Ecology; and the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research [DE-SC0020373].
Data Availability
Custom sequences, scripts, and other files are hosted permanently on Figshare (https://doi.org/10.6084/m9.figshare.27288090). Future updates to CABO-16S will be made available at https://github.com/emelissa3/CABO-16S.
Supplemental Material
Supplementary data is available at NAR Genomics & Bioinformatics online.
Files
| Name | Size | Checksum |
|---|---|---|
| lqaf061.pdf | 2.0 MB | md5:51491c8fe3734272bf20b83afd9dfe53 |
Additional details
Identifiers
- PMCID
- PMC12086536
Related works
- Is new version of
- Discussion Paper: 10.1101/2024.10.23.619938 (DOI)
- Is supplemented by
- Dataset: 10.6084/m9.figshare.27288090 (DOI)
- Dataset: https://github.com/emelissa3/CABO-16S (URL)
Funding
- National Science Foundation
- 2126631
- Simons Foundation
- 602126
- United States Department of Energy
- DE-SC0020373
Dates
- Accepted
- 2025-05-11
- Available
- 2025-05-19 (Published)