Alignment Metric Accuracy

Schwartz, Ariel S. and Myers, Eugene W. and Pachter, Lior (2005) Alignment Metric Accuracy. (Submitted) http://resolver.caltech.edu/CaltechAUTHORS:20170307-100435592

PDF - Submitted Version (221Kb). See Usage Policy.

Abstract

We propose a metric for the space of multiple sequence alignments that can be used to compare two alignments to each other. In the case where one of the alignments is a reference alignment, the resulting accuracy measure improves upon previous approaches, and provides a balanced assessment of the fidelity of both matches and gaps. Furthermore, in the case where a reference alignment is not available, we provide empirical evidence that the distance from an alignment produced by one program to predicted alignments from other programs can be used as a control for multiple alignment experiments. In particular, we show that low accuracy alignments can be effectively identified and discarded. We also show that in the case of pairwise sequence alignment, it is possible to find an alignment that maximizes the expected value of our accuracy measure. Unlike previous approaches based on expected accuracy alignment that tend to maximize sensitivity at the expense of specificity, our method is able to identify unalignable sequence, thereby increasing overall accuracy. In addition, the algorithm allows for control of the sensitivity/specificity tradeoff via the adjustment of a single parameter. These results are confirmed with simulation studies that show that unalignable regions can be distinguished from homologous, conserved sequences. Finally, we propose an extension of the pairwise alignment method to multiple alignment. Our method, which we call AMAP, outperforms existing protein sequence multiple alignment programs on benchmark datasets. A webserver and software downloads are available at http://bio.math.berkeley.edu/amap/.
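The abstract does not spell out how the metric is computed. As a rough illustration only, the sketch below assumes a per-character formulation for the pairwise case: each residue is mapped to the position it is aligned to in the other sequence (or to a gap), the distance between two alignments is the number of residues whose mapping differs, and the accuracy is one minus that distance divided by the combined sequence length. The function names and the gapped-string representation are hypothetical and are not taken from the AMAP software.

```python
from typing import List, Optional, Tuple

GAP = "-"
Alignment = Tuple[str, str]  # two gapped rows of equal length (assumed representation)


def partner_maps(row_a: str, row_b: str) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """For each residue, record the index of its aligned partner, or None for a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    map_a: List[Optional[int]] = []
    map_b: List[Optional[int]] = []
    i = j = 0  # index of the next residue in sequence a / sequence b
    for ca, cb in zip(row_a, row_b):
        a_res, b_res = ca != GAP, cb != GAP
        if a_res:
            map_a.append(j if b_res else None)
        if b_res:
            map_b.append(i if a_res else None)
        i += a_res
        j += b_res
    return map_a, map_b


def alignment_distance(aln1: Alignment, aln2: Alignment) -> int:
    """Count residues (over both sequences) whose partner differs between alignments."""
    a1, b1 = partner_maps(*aln1)
    a2, b2 = partner_maps(*aln2)
    if len(a1) != len(a2) or len(b1) != len(b2):
        raise ValueError("alignments must be over the same two sequences")
    return sum(x != y for x, y in zip(a1 + b1, a2 + b2))


def accuracy(predicted: Alignment, reference: Alignment) -> float:
    """Assumed accuracy: 1 - distance / (combined length of the two sequences)."""
    ref_a, ref_b = partner_maps(*reference)
    total = len(ref_a) + len(ref_b)
    if total == 0:
        return 1.0
    return 1.0 - alignment_distance(predicted, reference) / total


reference = ("ACGT", "AC-T")
predicted = ("ACGT", "A-CT")
print(alignment_distance(predicted, reference))
print(accuracy(predicted, reference))
```

Because a residue aligned to a gap is scored in the same way as a residue aligned to another residue, a measure of this form penalizes misplaced gaps as well as wrong matches, which is one way to read the abstract's "balanced assessment of the fidelity of both matches and gaps".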


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
https://arxiv.org/abs/q-bio/0510052 | arXiv | Discussion Paper
Additional Information: A.S. was partially supported by NSF grant EF 03-31494. G.M. was supported by the Max-Planck / Alexander von Humboldt International Research Prize. L.P. was partially supported by a Sloan Research Fellowship.
Funders:
Funding Agency | Grant Number
NSF | EF-0331494
Max-Planck Society | UNSPECIFIED
Alexander von Humboldt Foundation | UNSPECIFIED
Alfred P. Sloan Foundation | UNSPECIFIED
Record Number: CaltechAUTHORS:20170307-100435592
Persistent URL: http://resolver.caltech.edu/CaltechAUTHORS:20170307-100435592
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 74838
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 07 Mar 2017 18:08
Last Modified: 07 Mar 2017 18:08
