
Characterization of the human ESC transcriptome by hybrid sequencing

Au, Kin Fai and Sebastiano, Vittorio and Afshar, Pegah Tootoonchi and Durruthy-Durruthy, Jens and Lee, Lawrence and Williams, Brian A. and van Bakel, Harm and Schadt, Eric E. and Reijo-Pera, Renee A. and Underwood, Jason G. and Wong, Wing Hung (2013) Characterization of the human ESC transcriptome by hybrid sequencing. Proceedings of the National Academy of Sciences of the United States of America, 110 (50). E4821-E4830. ISSN 0027-8424. PMCID PMC3864310. doi:10.1073/pnas.1320101110.

Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs). We report 8,084 RefSeq-annotated isoforms detected as full-length and an additional 5,459 isoforms predicted through statistical inference. Over one-third of these are novel isoforms, including 273 RNAs from gene loci that have not previously been identified. Further characterization of the novel loci indicates that a subset is expressed in pluripotent cells but not in diverse fetal and adult tissues; moreover, their reduced expression perturbs the network of pluripotency-associated genes. Results suggest that gene identification, even in well-characterized human cell lines and tissues, is likely far from complete.

Item Type: Article
Additional Information: © 2013 National Academy of Sciences. Freely available online through the PNAS open access option. Contributed by Wing Hung Wong, November 5, 2013 (sent for review August 6, 2013). We thank Dr. Arnold Ludwig Hayer, Dr. Tobias Meyer, Dr. Laughing Bear Torrez, and Dr. Diana Cepeda for help in designing siRNA experiments and for technical help in performing the knockdown experiments. K.F.A. and W.H.W. were supported by the National Human Genome Research Institute (R01HG005717). B.A.W. was partially supported by the California Institute of Technology Beckman Center for Functional Genomics. The Illumina short-read data were generated as part of the ENCODE Project (National Human Genome Research Institute Grants U54 HG004576 and U54 HG006998 to Barbara Wold). V.S., J.D.D., and R.A.R.-P. were supported by the National Heart, Lung, and Blood Institute (U01HL100397) and the National Institute of Child Health and Human Development (U54 HD068158). Author contributions: K.F.A. and W.H.W. designed research; K.F.A., V.S., P.T.A., J.D.D., H.V.B., R.A.R.-P., and J.G.U. performed research; K.F.A., L.L., B.A.W., and E.E.S. contributed new reagents/analytic tools; K.F.A. analyzed data; and K.F.A. and W.H.W. wrote the paper. The authors declare no conflict of interest.
Funding Agency: Grant Number
Caltech Beckman Center for Functional Genomics: UNSPECIFIED
NIH: U54 HG004576
NIH: U54 HG006998
NIH: U54 HD068158
Subject Keywords: isoform discovery; PacBio; hESC transcriptome; alternative splicing; lncRNA
Issue or Number: 50
PubMed Central ID: PMC3864310
Record Number: CaltechAUTHORS:20140113-101222866
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 43337
Deposited On: 14 Jan 2014 18:28
Last Modified: 10 Nov 2021 16:36