CaltechAUTHORS
  A Caltech Library Service

Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts

Ntranos, Vasilis and Kamath, Govinda M. and Zhang, Jesse M. and Pachter, Lior and Tse, David N. (2016) Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts. Genome Biology, 17 (1). Art. No. 112. ISSN 1474-760X. PMCID PMC4881296. http://resolver.caltech.edu/CaltechAUTHORS:20190503-155957743

PDF - Published Version (Creative Commons Attribution), 3687Kb
PDF - Submitted Version (Creative Commons Attribution Non-commercial No Derivatives), 36Mb
PDF (Supplementary Figures) - Supplemental Material (Creative Commons Attribution), 22Mb

Abstract

Current approaches to single-cell transcriptomic analysis are computationally intensive and require assay-specific modeling, which limits their scope and generality. We propose a novel method that compares and clusters cells based on their transcript-compatibility read counts rather than on the transcript or gene quantifications used in standard analysis pipelines. In the reanalysis of two landmark yet disparate single-cell RNA-seq datasets, we show that our method is up to two orders of magnitude faster than previous approaches, provides accurate and in some cases improved results, and is directly applicable to data from a wide variety of assays.
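
The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes each cell is represented by a vector of transcript-compatibility counts (one entry per equivalence class of transcripts) and that cells are compared with the Jensen-Shannon distance between their normalized count profiles. The choice of distance, the function names, and the toy matrix are assumptions made for illustration; this record does not specify them, and the authors' actual code is in the GitHub repository listed under Related URLs.

```python
import math


def normalize(counts):
    """Turn one cell's raw counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no reads")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance (square root of the JS divergence, base 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding


def pairwise_distances(tcc_matrix):
    """Symmetric cell-by-cell distance matrix from a cells x classes count matrix."""
    profiles = [normalize(row) for row in tcc_matrix]
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = js_distance(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical toy data: 4 cells x 5 equivalence classes (not from the paper).
tcc = [
    [30, 10, 0, 0, 5],
    [28, 12, 1, 0, 4],
    [0, 2, 40, 15, 1],
    [1, 0, 38, 18, 0],
]
dist = pairwise_distances(tcc)
```

A pairwise distance matrix of this kind could then be passed to a clustering algorithm that accepts precomputed distances or similarities; the subject keywords of this record mention affinity propagation. No transcript or gene abundances are estimated at any point in the sketch, which is the property the abstract emphasizes.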


Item Type: Article
Related URLs:
URL | URL Type | Description
https://doi.org/10.1186/s13059-016-0970-8 | DOI | Article
https://www.biorxiv.org/content/10.1101/036863v2 | DOI | Discussion Paper
https://github.com/govinda-kamath/clustering_on_transcript_compatibility_counts | Related Item | Data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4881296/ | PubMed Central | Article
ORCID:
Author | ORCID
Ntranos, Vasilis | 0000-0002-2477-0670
Pachter, Lior | 0000-0002-9164-6231
Additional Information: © 2016 Ntranos et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Received: 24 February 2016; Accepted: 29 April 2016; Published: 26 May 2016.

Availability of data and materials: The code used to generate the results presented in this paper is available online on GitHub [49]. All sequencing reads for the Zeisel et al. dataset [7] are available through Gene Expression Omnibus [GEO:GSE60361] and for the Trapnell et al. dataset [12] through [GEO:GSE52529]. The method is publicly available on GitHub (https://github.com/govinda-kamath/clustering_on_transcript_compatibility_counts) under the MIT license.

Ethics: No ethics approval was required for this study.

We thank Páll Melsted for implementing the pseudo command in kallisto. This is the command that allows for direct output of transcript-compatibility counts via pseudoalignment. We would also like to thank Bo Li, Allon Wagner, and Nir Yosef for useful discussions about single-cell RNA-seq assays and their biases.

The authors declare that they have no competing interests.

Authors’ contributions: VN, GMK, and JZ conceived the idea of clustering without quantification, performed analyses of data, analyzed and interpreted results, and wrote the manuscript. DNT and LP interpreted results, supervised the project, and wrote the manuscript. All authors read and approved the final manuscript.

GMK and JZ are supported by the Center for Science of Information, an NSF Science and Technology Center, under grant agreement CCF-0939370. VN is supported in part by the Center for Science of Information and in part by a gift from Qualcomm Inc. LP is supported in part by the National Human Genome Research Institute of the National Institutes of Health under award number R01HG006129. DNT is supported in part by the Center for Science of Information and in part by the National Human Genome Research Institute of the National Institutes of Health under award number R01HG008164.
Funders:
Funding Agency | Grant Number
NSF | CCF-0939370
Center for Science of Information (CSoI) | UNSPECIFIED
Qualcomm Inc. | UNSPECIFIED
NIH | R01HG006129
NIH | R01HG008164
Subject Keywords: Minimum Span Tree; Affinity Propagation; Read Alignment; Affinity Propagation Algorithm; Pairwise Distance Matrix
PubMed Central ID: PMC4881296
Record Number: CaltechAUTHORS:20190503-155957743
Persistent URL: http://resolver.caltech.edu/CaltechAUTHORS:20190503-155957743
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 95224
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 03 May 2019 23:25
Last Modified: 03 May 2019 23:25
