CaltechAUTHORS
  A Caltech Library Service

Quantifying the tradeoff between sequencing depth and cell number in single-cell RNA-seq

Svensson, Valentine and da Veiga Beltrame, Eduardo and Pachter, Lior (2019) Quantifying the tradeoff between sequencing depth and cell number in single-cell RNA-seq. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20190910-074005263

PDF - Submitted Version (5MB)
See Usage Policy.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20190910-074005263

Abstract

The allocation of a sequencing budget when designing single-cell RNA-seq experiments requires consideration of the tradeoff between the number of cells sequenced and the read depth per cell. One approach to the problem is to perform a power analysis for a univariate objective such as differential expression. However, many of the goals of single-cell analysis, such as clustering, require consideration of the multivariate structure of gene expression. We introduce an approach to quantifying the impact of sequencing depth and cell number on the estimation of a multivariate generative model for gene expression that is based on error analysis in the framework of a variational autoencoder. We find that at shallow depths, the marginal benefit of deeper sequencing per cell significantly outweighs the benefit of increased cell numbers. Above about 15,000 reads per cell, the benefit of increased sequencing depth is minor. Code for the workflow reproducing the results of the paper is available at https://github.com/pachterlab/SBP_2019/.
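The budget constraint behind this tradeoff can be illustrated with simple arithmetic. The sketch below is not the authors' method (which relies on error analysis of a variational autoencoder); it only shows how a fixed total read budget splits between cell number and reads per cell. The budget of 300 million reads, the candidate depths, and the function names are hypothetical values chosen for illustration.

```python
def cells_for_budget(total_reads: int, reads_per_cell: int) -> int:
    """Number of cells that fit in a fixed read budget at a given depth."""
    if reads_per_cell <= 0:
        raise ValueError("reads_per_cell must be positive")
    # Integer division: partial cells cannot be sequenced.
    return total_reads // reads_per_cell


def allocation_table(total_reads: int, depths: list[int]) -> list[tuple[int, int]]:
    """Pair each candidate depth with the cell count it allows."""
    return [(depth, cells_for_budget(total_reads, depth)) for depth in depths]


# Hypothetical budget and depths (assumptions, not values from the paper,
# except that 15,000 reads per cell is the depth the abstract mentions).
TOTAL_READS = 300_000_000
DEPTHS = [5_000, 15_000, 50_000]

for depth, cells in allocation_table(TOTAL_READS, DEPTHS):
    print(f"{depth:>7,} reads/cell -> {cells:>7,} cells")
```

Under this assumed budget, moving from 15,000 to 50,000 reads per cell cuts the number of cells from 20,000 to 6,000, which is the kind of exchange the abstract describes as yielding only a minor benefit at that depth.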


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
https://doi.org/10.1101/762773 | DOI | Discussion Paper
https://github.com/pachterlab/SBP_2019 | Related Item | Code
https://doi.org/10.22002/d1.1276 | Related Item | Data
ORCID:
Author | ORCID
Svensson, Valentine | 0000-0002-9217-2330
da Veiga Beltrame, Eduardo | 0000-0002-1529-9207
Pachter, Lior | 0000-0002-9164-6231
Additional Information: The copyright holder has placed this preprint in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, remix, or adapt this material for any purpose without crediting the original authors. bioRxiv preprint first posted online Sep. 9, 2019. Code availability: A Snakemake [20] file used to subsample and process the data, together with Python notebooks used for downstream analyses, is available on GitHub at https://github.com/pachterlab/SBP_2019/. Scripts and notebooks used to create the figures and results, together with gene count matrices output by kallisto bus and H5AD files with the UMI counts for all the subsampled read depths, are available on CaltechDATA (https://doi.org/10.22002/d1.1276). Author contributions: V.S. designed the evaluation metric and performed statistical analysis. E.V.B. performed data processing and subsampling. V.S., E.V.B., and L.P. interpreted results and wrote the manuscript. The authors want to thank Romain Lopez for helpful feedback on the manuscript. V.S. and L.P. were funded in part by NIH U19MH114830.
Funders:
Funding Agency | Grant Number
NIH | U19MH114830
DOI: 10.1101/762773
Record Number: CaltechAUTHORS:20190910-074005263
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190910-074005263
Official Citation: Quantifying the tradeoff between sequencing depth and cell number in single-cell RNA-seq. Valentine Svensson, Eduardo da Veiga Beltrame, Lior Pachter. bioRxiv 762773; doi: https://doi.org/10.1101/762773
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 98536
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 10 Sep 2019 16:26
Last Modified: 16 Nov 2021 17:39
