CaltechAUTHORS
  A Caltech Library Service

Gene structure in the sea urchin Strongylocentrotus purpuratus based on transcriptome analysis

Tu, Qiang and Cameron, R. Andrew and Worley, Kim C. and Gibbs, Richard A. and Davidson, Eric H. (2012) Gene structure in the sea urchin Strongylocentrotus purpuratus based on transcriptome analysis. Genome Research, 22 (10). pp. 2079-2087. ISSN 1088-9051. PMCID PMC3460201. doi:10.1101/gr.139170.112. https://resolver.caltech.edu/CaltechAUTHORS:20121106-095343573

PDF - Published Version (1MB). Creative Commons Attribution Non-commercial.
PDF (Figure S1) - Supplemental Material (95kB). Creative Commons Attribution Non-commercial.
PDF (Figure S2) - Supplemental Material (498kB). Creative Commons Attribution Non-commercial.
MS Word (Legends) - Supplemental Material (59kB). Creative Commons Attribution Non-commercial.
MS Excel (Table S1) - Supplemental Material (79kB). Creative Commons Attribution Non-commercial.
MS Excel (Table S2) - Supplemental Material (221kB). Creative Commons Attribution Non-commercial.


Abstract

A comprehensive transcriptome analysis has been performed on protein-coding RNAs of Strongylocentrotus purpuratus, including 10 different embryonic stages, six feeding larval and metamorphosed juvenile stages, and six adult tissues. In this study, we pooled the transcriptomes from all of these sources and focused on the insights they provide for gene structure in the genome of this recently sequenced model system. The genome had initially been annotated using computational gene model prediction algorithms. A large fraction of these predicted genes were recovered in the transcriptome when the reads were mapped to the genome and appropriately filtered and analyzed. However, in a manually curated subset, we discovered that more than half of the computational gene model predictions were imperfect, containing errors such as missing exons, prediction of nonexistent exons, erroneous intron/exon boundaries, fusion of adjacent genes, and prediction of multiple genes from single genes. The transcriptome data have been used to provide a systematic upgrade of the gene model predictions throughout the genome, greatly improving the research usability of the genomic sequence. We have constructed new public databases that incorporate information from the transcriptome analyses. The transcript-based gene model data were used to define average structural parameters for S. purpuratus protein-coding genes. In addition, we constructed a custom sea urchin gene ontology, and assigned about 7000 different annotated transcripts to 24 functional classes. Strong correlations became evident between given functional ontology classes and structural properties, including gene size, exon number, and exon and intron size.
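As an illustration of the structural properties named in the abstract (gene size, exon number, exon and intron size), the following minimal sketch derives them from a list of exon coordinates for a single gene model. It is not the authors' pipeline: the function name `gene_structure`, the 1-based inclusive coordinate convention, and the example coordinates are all assumptions made for this illustration.

```python
from statistics import mean


def gene_structure(exons):
    """Summarize one gene model (hypothetical helper, not from the paper).

    exons: list of (start, end) tuples, 1-based inclusive, non-overlapping,
    all on the same scaffold and strand (assumed convention).
    """
    exons = sorted(exons)
    # Exon size is end - start + 1 under inclusive coordinates.
    exon_sizes = [end - start + 1 for start, end in exons]
    # An intron is the gap between the end of one exon and the start of the next.
    intron_sizes = [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]
    return {
        "gene_size": exons[-1][1] - exons[0][0] + 1,
        "exon_count": len(exons),
        "mean_exon_size": mean(exon_sizes),
        # Single-exon genes have no introns, so report None rather than 0.
        "mean_intron_size": mean(intron_sizes) if intron_sizes else None,
    }


# Made-up coordinates for a three-exon gene, for illustration only.
example = gene_structure([(100, 199), (300, 449), (1000, 1099)])
```

Averaging these per-gene values across all genes in a functional class would give class-level parameters of the kind the abstract compares, under the same assumptions.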


Item Type: Article
Related URLs:
URL                                                      URL Type        Description
http://dx.doi.org/10.1101/gr.139170.112                  DOI             Article
http://genome.cshlp.org/content/22/10/2079               Publisher       Article
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3460201/     PubMed Central  Article
ORCID:
Author              ORCID
Cameron, R. Andrew  0000-0003-3947-6041
Additional Information: © 2012 Published by Cold Spring Harbor Laboratory Press. Received February 14, 2012. Accepted May 16, 2012. Published in Advance June 18, 2012. We are very grateful to Brian Williams (Caltech) for the RNA-seq protocol and technical advice. We thank Igor Antoshechkin and Lorian Schaeffer (Millard and Muriel Jacobs Genetics and Genomics Laboratory, Caltech) for library building and sequencing. We thank Ung-Jin Kim, Qiu Yuan, and Parul Kudtarkar (SpBase) for technical assistance. This work was supported by NIH (P40OD010959, P40RR015044).
Funders:
Funding Agency  Grant Number
NIH             P40OD010959
NIH             P40RR015044
Issue or Number: 10
PubMed Central ID: PMC3460201
DOI: 10.1101/gr.139170.112
Record Number: CaltechAUTHORS:20121106-095343573
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20121106-095343573
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 35295
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 06 Nov 2012 22:39
Last Modified: 09 Nov 2021 23:14
