Published June 2025 | Version Published
Journal Article | Open Access

CABO-16S—a Combined Archaea, Bacteria, Organelle 16S rRNA database framework for amplicon analysis of prokaryotes and eukaryotes in environmental samples

  • 1. California Institute of Technology
  • 2. University of Nevada, Las Vegas

Abstract

Identification of both prokaryotic and eukaryotic microorganisms in environmental samples is currently challenged by the need for additional sequencing to obtain separate 16S and 18S ribosomal RNA (rRNA) amplicons or the constraints imposed by "universal" primers. Organellar 16S rRNA sequences are amplified and sequenced along with prokaryote 16S rRNA and provide an alternative method to identify eukaryotic microorganisms. CABO-16S combines bacterial and archaeal sequences from the SILVA database with 16S rRNA sequences of plastids and other organelles from the PR2 database to enable identification of all 16S rRNA sequences. Comparison of CABO-16S with SILVA 138.2 results in equivalent taxonomic classification of mock communities and increased classification of diverse environmental samples. In particular, identification of phototrophic eukaryotes in shallow seagrass environments, marine waters, and lake waters was increased. The CABO-16S framework allows users to add custom sequences for further classification of underrepresented clades and can be easily updated with future releases of reference databases. Addition of sequences obtained from Sanger sequencing of methane seep sediments and curated sequences of the polyphyletic SEEP-SRB1 clade resulted in differentiation of syntrophic and non-syntrophic SEEP-SRB1 in hydrothermal vent sediments. CABO-16S highlights the benefit of combining and amending existing training sets when studying microorganisms in diverse environments.
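
The combining step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline (their scripts are on Figshare and GitHub, linked under Data Availability). It only shows the general idea of concatenating prokaryote, organelle, and custom reference sequences into one training set. The function names, the toy headers, and the rule of dropping exact duplicate sequences are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Normalize case so duplicate detection ignores lowercase input.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def combine_references(*sources: Iterable[Record]) -> List[Record]:
    """Concatenate reference sets in the order given.

    Assumption for this sketch: when the same sequence appears more than
    once, only the first header seen is kept.
    """
    seen: Dict[str, str] = {}
    for source in sources:
        for header, seq in source:
            seen.setdefault(seq, header)
    return [(header, seq) for seq, header in seen.items()]


# Toy inputs with invented headers; real SILVA and PR2 files use their own formats.
prokaryote_fasta = """>toy_bac_1 Bacteria;Toyphylum
ACGTACGT
>toy_arc_1 Archaea;Toyphylum
GGCCTTAA
"""
organelle_fasta = """>toy_plastid_1 Eukaryota;Toyalga:plastid
TTGACCGA
"""
custom_fasta = """>toy_custom_1 Bacteria;Toyclade
acgtacgt
>toy_custom_2 Bacteria;Toyclade
CCGGAATT
"""

combined = combine_references(
    parse_fasta(prokaryote_fasta),
    parse_fasta(organelle_fasta),
    parse_fasta(custom_fasta),
)

for header, seq in combined:
    print(f">{header}\n{seq}")
```

Here `toy_custom_1` is dropped because its sequence matches `toy_bac_1` after case normalization. A real combined database would also need the taxonomy strings from each source reconciled to a common rank structure, which this sketch does not attempt.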

Copyright and License

© The Author(s) 2025. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Acknowledgement

The authors would like to thank all the members of the Orphan lab for useful discussions and Dmitri Bilyk for logo design. We acknowledge members of the Agouron International Geobiology Course from 2021 and 2022 and Alex Sessions in particular for encouraging work at Mono Lake, which initiated some of this research. We would like to thank the Agouron Institute and the Simons Foundation for funding the International Geobiology Course. Finally, we would like to thank the SILVA and PR2 teams for their work creating and updating those databases.

Funding

This work is supported by the National Science Foundation [2126631] to D.U. as part of the Ocean Sciences Postdoctoral Research Fellowship; the Simons Foundation [602126] to E.E. as part of the Postdoctoral Fellowships in Marine Microbial Ecology; and the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research [DE-SC0020373].

Data Availability

Custom sequences, scripts, and other files are hosted permanently on Figshare (https://doi.org/10.6084/m9.figshare.27288090). Future updates to CABO-16S will be made available at https://github.com/emelissa3/CABO-16S.

Supplemental Material

Supplementary data is available at NAR Genomics & Bioinformatics online.

Files

lqaf061.pdf (2.0 MB)
md5:51491c8fe3734272bf20b83afd9dfe53

Additional details

Related works

Is new version of
Discussion Paper: 10.1101/2024.10.23.619938 (DOI)
Is supplemented by
Dataset: 10.6084/m9.figshare.27288090 (DOI)
Dataset: https://github.com/emelissa3/CABO-16S (URL)

Funding

National Science Foundation
2126631
Simons Foundation
602126
United States Department of Energy
DE-SC0020373

Dates

Accepted
2025-05-11
Available
2025-05-19
Published

Caltech Custom Metadata

Caltech groups
Division of Biology and Biological Engineering (BBE), Division of Geological and Planetary Sciences (GPS)
Publication Status
Published