A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing

Cantacessi, Cinzia and Jex, Aaron R. and Hall, Ross S. and Young, Neil D. and Campbell, Bronwyn E. and Joachim, Anja and Nolan, Matthew J. and Abubucker, Sahar and Sternberg, Paul W. and Ranganathan, Shoba and Mitreva, Makedonka and Gasser, Robin B. (2010) A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing. Nucleic Acids Research, 38 (17). Art. No. e171. ISSN 0305-1048. PMCID PMC2943614. https://resolver.caltech.edu/CaltechAUTHORS:20101021-094944627

Files:
PDF - Published Version; Creative Commons Attribution Non-commercial; 174Kb
MS Word - Supplemental Material; Creative Commons Attribution Non-commercial; 35Kb
MS Word - Supplemental Material; Creative Commons Attribution Non-commercial; 46Kb
MS Excel - Supplemental Material; Creative Commons Attribution Non-commercial; 83Kb
MS Excel - Supplemental Material; Creative Commons Attribution Non-commercial; 1545Kb
MS PowerPoint - Supplemental Material; Creative Commons Attribution Non-commercial; 371Kb
MS PowerPoint - Supplemental Material; Creative Commons Attribution Non-commercial; 446Kb
MS PowerPoint - Supplemental Material; Creative Commons Attribution Non-commercial; 5Mb
MS PowerPoint - Supplemental Material; Creative Commons Attribution Non-commercial; 442Kb

Abstract

Transcriptomics (at the level of single cells, tissues and/or whole organisms) underpins many fields of biomedical science, from understanding basic cellular function in model organisms, to the elucidation of the biological events that govern the development and progression of human diseases, and the exploration of the mechanisms of survival, drug-resistance and virulence of pathogens. Next-generation sequencing (NGS) technologies are contributing to a massive expansion of transcriptomics in all fields and are reducing the cost, time and performance barriers presented by conventional approaches. However, bioinformatic tools for the analysis of the sequence data sets produced by these technologies can be daunting to researchers with limited or no expertise in bioinformatics. Here, we constructed a semi-automated, bioinformatic workflow system, and critically evaluated it for the analysis and annotation of large-scale sequence data sets generated by NGS. We demonstrated its utility for the exploration of differences in the transcriptomes among various stages and both sexes of an economically important parasitic worm (Oesophagostomum dentatum) as well as the prediction and prioritization of essential molecules (including GTPases, protein kinases and phosphatases) as novel drug target candidates. This workflow system provides a practical tool for the assembly, annotation and analysis of NGS data sets, including for researchers with limited bioinformatic expertise. The custom-written Perl, Python and Unix shell computer scripts used can be readily modified or adapted to suit many different applications. This system is now utilized routinely for the analysis of data sets from pathogens of major socio-economic importance and can, in principle, be applied to transcriptomics data sets from any organism.


Item Type: Article
Related URLs (URL; URL Type; Description):
http://dx.doi.org/10.1093/nar/gkq667; DOI; Article
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2943614/; PubMed Central; Article
ORCID (Author; ORCID):
Sternberg, Paul W.; 0000-0002-7699-0173
Additional Information: © The Author(s) 2010. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. Received June 2, 2010; Revised July 11, 2010; Accepted July 15, 2010. Staff at WormBase are gratefully acknowledged. The Austrian Ministry for Science and Research approved the animal experimentation (BMWF-68.205/0103-II/10b/2008) and is also acknowledged. C.C. is in receipt of an International Postgraduate Research Scholarship from the Australian Government and a fee-remission scholarship from The University of Melbourne as well as the Clunies Ross (2008) and Sue Newton (2009) awards from the School of Veterinary Science of the same university. Funding: The Australian Research Council; Australian Academy of Science; the Australian-American Fulbright Commission (to R.B.G.); National Human Genome Research Institute and National Institutes of Health (to M.M.).
Funders (Funding Agency; Grant Number):
Australian Research Council; UNSPECIFIED
Australian Academy of Science; UNSPECIFIED
Australian-American Fulbright Commission; UNSPECIFIED
National Human Genome Research Institute; UNSPECIFIED
NIH; UNSPECIFIED
Australian Government; UNSPECIFIED
University of Melbourne; UNSPECIFIED
School of Veterinary Science of the University of Melbourne; UNSPECIFIED
Issue or Number: 17
PubMed Central ID: PMC2943614
Record Number: CaltechAUTHORS:20101021-094944627
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20101021-094944627
Usage Policy: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
ID Code: 20463
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 29 Nov 2010 23:35
Last Modified: 03 Oct 2019 02:10
