Multiple-sequence functional annotation and the generalized hidden Markov phylogeny

McAuliffe, Jon D. and Pachter, Lior and Jordan, Michael I. (2004) Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. Bioinformatics, 20 (12). pp. 1850-1860. ISSN 1367-4803. doi:10.1093/bioinformatics/bth153.

Motivation: Phylogenetic shadowing is a comparative genomics principle that allows for the discovery of conserved regions in sequences from multiple closely related organisms. We develop a formal probabilistic framework for combining phylogenetic shadowing with feature-based functional annotation methods. The resulting model, a generalized hidden Markov phylogeny (GHMP), applies to a variety of situations where functional regions are to be inferred from evolutionary constraints.

Results: We show how GHMPs can be used to predict complete shared gene structures in multiple primate sequences. We also describe shadower, our implementation of such a prediction system. We find that shadower outperforms previously reported ab initio gene finders, including comparative human–mouse approaches, on a small sample of diverse exonic regions. Finally, we report on an empirical analysis of shadower's performance which reveals that as few as five well-chosen species may suffice to attain maximal sensitivity and specificity in exon demarcation.

Availability: A Web server is available at

Item Type: Article
ORCID: Pachter, Lior (0000-0002-9164-6231)
Additional Information: © 2004 Oxford University Press. Received on September 20, 2003; revised on January 4, 2004; accepted on January 20, 2004. Advance Access publication February 26, 2004. We thank Dario Boffelli and Eddy Rubin for the sequence data that we have analyzed as well as many helpful discussions about the concepts of phylogenetic shadowing. We are also grateful to the anonymous referees for comments that led to several improvements. L.P. was supported in part by a grant from the NIH (R01-HG02362-02). M.J. was supported by a grant from the NSF (IIS-9988642).
Issue or Number: 12
Record Number: CaltechAUTHORS:20170308-135943475
Official Citation: Jon D. McAuliffe, Lior Pachter, Michael I. Jordan; Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. Bioinformatics 2004; 20 (12): 1850-1860. doi: 10.1093/bioinformatics/bth153
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 74917
Deposited By: George Porter
Deposited On: 08 Mar 2017 22:20
Last Modified: 15 Nov 2021 16:29