A Caltech Library Service

Gene-level differential analysis at transcript-level resolution

Yi, Lynn and Pimentel, Harold and Bray, Nicolas L. and Pachter, Lior (2018) Gene-level differential analysis at transcript-level resolution. Genome Biology, 19. Art. No. 53. ISSN 1474-760X. PMCID PMC5896116. doi:10.1186/s13059-018-1419-z.

PDF - Published Version
Creative Commons Attribution.

PDF - Supplemental Material
Creative Commons Attribution.

PDF - Submitted Version
Creative Commons Attribution.

Compared to RNA-sequencing transcript differential analysis, gene-level differential expression analysis is more robust and experimentally actionable. However, the use of gene counts for statistical analysis can mask transcript-level dynamics. We demonstrate that ‘analysis first, aggregation second,’ where the p values derived from transcript analysis are aggregated to obtain gene-level results, increases sensitivity and accuracy. The method we propose can also be applied to transcript compatibility counts obtained from pseudoalignment of reads, which circumvents the need for quantification and is fast, accurate, and model-free. The method generalizes to various levels of biology and we showcase an application to gene ontologies.
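
As an illustration of 'analysis first, aggregation second', the following minimal sketch combines transcript-level p values into one p value per gene. The abstract does not state which aggregation rule is used; the subject keywords of this record list the Lancaster method, Fisher's method, and the Šidák correction. The sketch covers only Fisher's method (the unweighted special case of the Lancaster method) and the Šidák-adjusted minimum p value, using the Python standard library. The function names and the transcript-to-gene mapping are hypothetical and are not taken from the paper or its scripts.

```python
import math
from collections import defaultdict


def fisher_combine(p_values):
    """Fisher's method for k independent p values.

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    upper tail has the closed form exp(-h) * sum_{i<k} h**i / i!, where
    h = -sum(ln p_i), so no statistics library is needed.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    h = -sum(math.log(p) for p in p_values)
    term = 1.0   # h**i / i!, starting at i = 0
    total = 1.0
    for i in range(1, k):
        term *= h / i
        total += term
    return min(1.0, math.exp(-h) * total)


def sidak_min(p_values):
    """Sidak-adjusted minimum p value: 1 - (1 - min p)**n."""
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p value")
    return 1.0 - (1.0 - min(p_values)) ** n


def aggregate_by_gene(transcript_p, transcript_to_gene, combine=fisher_combine):
    """Group transcript-level p values by gene, then combine each group.

    transcript_p:       dict of transcript id -> p value (hypothetical input)
    transcript_to_gene: dict of transcript id -> gene id (hypothetical input)
    """
    groups = defaultdict(list)
    for transcript, p in transcript_p.items():
        groups[transcript_to_gene[transcript]].append(p)
    return {gene: combine(ps) for gene, ps in groups.items()}


if __name__ == "__main__":
    # Made-up p values, for illustration only.
    p = {"t1": 0.04, "t2": 0.03, "t3": 0.60}
    t2g = {"t1": "geneA", "t2": "geneA", "t3": "geneB"}
    print(aggregate_by_gene(p, t2g))
    print(aggregate_by_gene(p, t2g, combine=sidak_min))
```

The Lancaster method generalizes `fisher_combine` by giving each transcript its own weight as the degrees of freedom of its chi-square term; that requires an inverse chi-square distribution function, which is outside the standard library and is therefore omitted from this sketch.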

Item Type:Article
ORCID:
Yi, Lynn | 0000-0003-4575-0158
Pachter, Lior | 0000-0002-9164-6231
Additional Information:© 2018 The Author(s). This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated. We thank Jase Gehring, Páll Melsted, and Vasilis Ntranos for discussion and feedback during development of the methods. Conversations with Cole Trapnell regarding the challenges of functional characterization of individual isoforms were instrumental in launching the project. LY was partially funded by the UCLA-Caltech Medical Science Training Program, NIH T32 GM07616, and the Lee Ramo Fund. Harold Pimentel was partially funded by NIH R01 HG008140. Availability of data and materials: Scripts to reproduce the figures and results of the paper are available under the GNU General Public License v3.0 [33]. The RNA-seq datasets used in the analysis can be found at GEO GSE89024 [21] and GEO GSE95363 [25]. Authors’ contributions: LY, NLB, and LP devised the methods. LY analyzed the biological data. LY and LP performed computational experiments. HP developed and implemented the simulation framework. LY and LP wrote the paper. NLB and LP supervised the research. All authors read and approved the final manuscript. Ethics approval and consent to participate: No data from humans were used in this manuscript. The authors declare that they have no competing interests.
Funding Agency | Grant Number
Caltech- Medical Science Training Program | UNSPECIFIED
NIH Predoctoral Fellowship | T32 GM07616
NIH | R01 HG008140
Subject Keywords:RNA-sequencing; Differential expression; Meta-analysis; P value aggregation; Lancaster method; Fisher’s method; Šidák correction; RNA-seq quantification; RNA-seq alignment; Pseudoalignment; Transcript compatibility counts; Gene ontology
PubMed Central ID:PMC5896116
Record Number:CaltechAUTHORS:20180416-090553011
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:85872
Deposited By: Tony Diaz
Deposited On:16 Apr 2018 21:18
Last Modified:01 Jun 2023 23:37