CaltechAUTHORS
  A Caltech Library Service

Subtree power analysis and species selection for comparative genomics

McAuliffe, Jon D. and Jordan, Michael I. and Pachter, Lior (2005) Subtree power analysis and species selection for comparative genomics. Proceedings of the National Academy of Sciences of the United States of America, 102 (22). pp. 7900-7905. ISSN 0027-8424. PMCID PMC1142384. doi:10.1073/pnas.0502790102. https://resolver.caltech.edu/CaltechAUTHORS:20190503-150942109

PDF - Published Version (344kB). See Usage Policy.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20190503-150942109

Abstract

Sequence comparison across multiple organisms aids in the detection of regions under selection. However, resource limitations require a prioritization of genomes to be sequenced. This prioritization should be grounded in two considerations: the lineal scope encompassing the biological phenomena of interest, and the optimal species within that scope for detecting functional elements. We introduce a statistical framework for optimal species subset selection, based on maximizing power to detect conserved sites. Analysis of a phylogenetic star topology shows theoretically that the optimal species subset is not in general the most evolutionarily diverged subset. We then demonstrate this finding empirically in a study of vertebrate species. Our results suggest that marsupials are prime sequencing candidates.


Item Type: Article
Related URLs:
  URL | URL Type | Description
  https://doi.org/10.1073/pnas.0502790102 | DOI | Article
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1142384 | PubMed Central | Article
ORCID:
  Author | ORCID
  Pachter, Lior | 0000-0002-9164-6231
Additional Information: © 2005 The National Academy of Sciences. Communicated by Peter J. Bickel, University of California, Berkeley, CA, April 6, 2005 (received for review December 13, 2004). We thank Peter Bickel and Adam Siepel for helpful comments. M.I.J. was supported by National Institutes of Health Grant R33-HG003070. L.P. was supported by National Institutes of Health Grant R01-HG2362-3, a Sloan Foundation Research Fellowship, and National Science Foundation Career Award CCF-0347992. Author contributions: J.D.M., M.I.J., and L.P. designed research; J.D.M., M.I.J., and L.P. performed research; J.D.M., M.I.J., and L.P. contributed new reagents/analytic tools; J.D.M. analyzed data; and J.D.M. wrote the paper.
Funders:
  Funding Agency | Grant Number
  NIH | R33-HG003070
  NIH | R01-HG2362-3
  Alfred P. Sloan Foundation | UNSPECIFIED
  NSF | CCF-0347992
Subject Keywords: hypothesis testing; likelihood ratio; sequence analysis
Issue or Number: 22
PubMed Central ID: PMC1142384
DOI: 10.1073/pnas.0502790102
Record Number: CaltechAUTHORS:20190503-150942109
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190503-150942109
Official Citation: Subtree power analysis and species selection for comparative genomics. Jon D. McAuliffe, Michael I. Jordan, Lior Pachter. Proceedings of the National Academy of Sciences May 2005, 102 (22) 7900-7905; DOI: 10.1073/pnas.0502790102
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 95217
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 03 May 2019 22:52
Last Modified: 16 Nov 2021 17:11
