
Comprehensive evaluation of shotgun metagenomics, amplicon sequencing, and harmonization of these platforms for epidemiological studies

Usyk, Mykhaylo and Peters, Brandilyn A. and Karthikeyan, Smruthi and McDonald, Daniel and Sollecito, Christopher C. and Vazquez-Baeza, Yoshiki and Shaffer, Justin P. and Gellman, Marc D. and Talavera, Gregory A. and Daviglus, Martha L. and Thyagarajan, Bharat and Knight, Rob and Qi, Qibin and Kaplan, Robert and Burk, Robert D. (2023) Comprehensive evaluation of shotgun metagenomics, amplicon sequencing, and harmonization of these platforms for epidemiological studies. Cell Reports Methods, 3 (1). Art. No. 100391. ISSN 2667-2375. PMCID PMC9939430. doi:10.1016/j.crmeth.2022.100391.

PDF - Published Version
Creative Commons Attribution Non-commercial No Derivatives.




In a large cohort of 1,772 participants from the Hispanic Community Health Study/Study of Latinos with overlapping 16SV4 rRNA gene (bacterial amplicon), ITS1 (fungal amplicon), and shotgun sequencing data, we demonstrate that 16SV4 amplicon sequencing and shotgun metagenomics offer the same level of taxonomic accuracy for bacteria at the genus level, even at shallow sequencing depths. In contrast, for fungal taxa, we did not observe meaningful agreement between shotgun and ITS1 amplicon results. Finally, we show that amplicon and shotgun data can be harmonized and pooled to yield larger microbiome datasets, with pooled amplicon/shotgun data showing excellent agreement with pure shotgun metagenomic analysis (<1% effect size variance across three independent outcomes). Thus, there are multiple approaches for studying the microbiome in epidemiological studies, and we provide a demonstration of a powerful pooling approach that will allow researchers to leverage the massive amount of amplicon sequencing data generated over the last two decades.

Item Type:Article
ORCID:
Usyk, Mykhaylo: 0000-0002-0374-3753
Peters, Brandilyn A.: 0000-0002-1534-2578
Karthikeyan, Smruthi: 0000-0001-6226-4536
Vazquez-Baeza, Yoshiki: 0000-0001-6014-2009
Shaffer, Justin P.: 0000-0002-9371-6336
Gellman, Marc D.: 0000-0003-4989-8627
Daviglus, Martha L.: 0000-0002-6791-8727
Thyagarajan, Bharat: 0000-0001-6968-6985
Knight, Rob: 0000-0002-0975-9019
Qi, Qibin: 0000-0002-2687-1758
Burk, Robert D.: 0000-0002-8376-8458
Alternate Title:Comprehensive Evaluation of Shotgun Metagenomics and 16S rRNA gene and ITS1 Amplicon Sequencing for Epidemiological Studies Using a Multicenter Large Cohort
Additional Information:Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).
The authors gratefully acknowledge the participants of the SOL cohort who chose to provide the stool samples that allowed this study to be performed. The Hispanic Community Health Study/Study of Latinos (HCHS/SOL) is a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute (NHLBI) to the University of North Carolina (HHSN268201300001I/N01-HC-65233), University of Miami (HHSN268201300004I/N01-HC-65234), Albert Einstein College of Medicine (HHSN268201300002I/N01-HC-65235), University of Illinois at Chicago (HHSN268201300003I/N01-HC-65236, Northwestern University), and San Diego State University (HHSN268201300005I/N01-HC-65237). The following institutes/centers/offices have contributed to the HCHS/SOL through a transfer of funds to the NHLBI: National Institute on Minority Health and Health Disparities, National Institute on Deafness and Other Communication Disorders, National Institute of Dental and Craniofacial Research, National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of Neurological Disorders and Stroke, NIH Institution-Office of Dietary Supplements. Additional funding for the “Gut Origins of Latino Diabetes” (GOLD) ancillary study to HCHS/SOL was provided by 1R01MD011389-01 from the National Institute on Minority Health and Health Disparities. In addition, R01DK126698, P30CA013330, P30AI124414, and U01HL146204 also provided support. None of the funding agencies had a role in the design, conduct, interpretation, or reporting of this study.
Author contributions: Conceptualization, M.U. and R.D.B.; methodology, M.U., S.K., D.M., Y.V.-B., J.P.S., R. Knight, and R.D.B.; software, D.M. and R. Knight; formal analysis, M.U.; resources, C.C.S., M.D.G., G.A.T., M.L.D., B.T., Q.Q., and R. Kaplan; writing – original draft, M.U. and R.D.B.; writing – review & editing, M.U., B.A.P., and R.D.B.; supervision, R.D.B.
The authors declare no competing interests.
Issue or Number:1
PubMed Central ID:PMC9939430
Record Number:CaltechAUTHORS:20230313-315432000.1
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:119956
Deposited By: George Porter
Deposited On:14 Mar 2023 16:19
Last Modified:14 Mar 2023 19:30
