
Pseudoalignment for metagenomic read assignment

Schaeffer, L. and Pimentel, H. and Bray, N. and Melsted, P. and Pachter, L. (2017) Pseudoalignment for metagenomic read assignment. Bioinformatics, 33 (14). pp. 2082-2088. ISSN 1367-4803. PMCID PMC5870846. doi:10.1093/bioinformatics/btx106.

Motivation: Read assignment is an important first step in many metagenomic analysis workflows, providing the basis for identification and quantification of species. However, ambiguity among the sequences of many strains makes it difficult to assign reads at the lowest level of taxonomy, and reads are typically assigned to taxonomic levels where they are unambiguous. We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data in order to develop novel methods for rapid and accurate quantification of metagenomic strains.

Results: We find that the recent idea of pseudoalignment introduced in the RNA-Seq context is highly applicable in the metagenomics setting. When pseudoalignment is coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible with state-of-the-art software, making it possible and practical for the first time to analyze abundances of individual genomes in metagenomics projects.

Item Type: Article
ORCID:
Melsted, P.: 0000-0002-8418-6724
Pachter, L.: 0000-0002-9164-6231
Additional Information: © The Author 2017. Published by Oxford University Press. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model. Received on October 18, 2016; revised on January 23, 2017; editorial decision on February 15, 2017; accepted on February 17, 2017. Published: 21 February 2017. We thank readers of preprints of this manuscript for helpful suggestions that have improved our method and its description in the paper. H.P. was supported by an NSF graduate research fellowship. P.M. was partially supported by a Fulbright fellowship. L.S. and L.P. were partially supported by NIH R01 HG006129 and NIH R01 DK094699. Conflict of Interest: none declared.
Funders (Funding Agency: Grant Number):
NSF Graduate Research Fellowship: UNSPECIFIED
Fulbright Foundation: UNSPECIFIED
NIH: R01 HG006129
NIH: R01 DK094699
Issue or Number: 14
PubMed Central ID: PMC5870846
Record Number: CaltechAUTHORS:20170306-131027010
Official Citation: L Schaeffer, H Pimentel, N Bray, P Melsted, L Pachter, Pseudoalignment for metagenomic read assignment, Bioinformatics, Volume 33, Issue 14, 15 July 2017, Pages 2082–2088.
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 74793
Deposited On: 06 Mar 2017 21:35
Last Modified: 11 Nov 2021 05:29
