CaltechAUTHORS
  A Caltech Library Service

The majority of human genes have regions repeated in other human genes

Britten, Roy J. (2005) The majority of human genes have regions repeated in other human genes. Proceedings of the National Academy of Sciences of the United States of America, 102 (15). pp. 5466-5470. ISSN 0027-8424. http://resolver.caltech.edu/CaltechAUTHORS:BRIpnas05

Abstract

Amino acid sequence comparisons have been made between all of 25,193 human proteins with each of the others by using BLAST software (National Center for Biotechnology Information) and recording the results for regions that are significantly related in sequence, that is, have an expectation of <1x10^-3. The results are presented for each amino acid as the number of identical or similar amino acids matched in these aligned regions. This approach avoids summing or dealing directly with the different regions of any one protein that are often related to different numbers and types of other proteins. The results are presented graphically for a sample of 140 proteins. Relationships are not observed for 26.5% of the 12,728,866 amino acids. The average number of related amino acids is 36.5 for the majority (73.5%) that show relationships. The median number of recognized relationships is ~3 for all of the amino acids, and the maximum number is 718. The results demonstrate the overwhelming importance of gene regional duplication forming families of proteins with related domains and show the variety of the resulting patterns of relationship. The magnitude of the set of relationships leads to the conclusion that the principal process by which new gene functions arise has been by making use of preexisting genes.
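The per-residue tally described in the abstract can be illustrated with a minimal sketch. This is not the author's PERL code. It assumes each BLAST hit has already been parsed into a dictionary holding the aligned query string and the BLAST midline, where a letter marks an identical residue, `+` marks a similar one, and a space marks a mismatch. The field names (`query_id`, `subject_id`, `q_start`, `q_aligned`, `midline`, `evalue`) and both function names are hypothetical.

```python
import statistics

E_VALUE_CUTOFF = 1e-3  # significance threshold stated in the abstract


def count_related_residues(hits, protein_lengths):
    """Count, for every residue of every protein, how many identical or
    similar residues it is matched to across significant alignments.

    hits: iterable of dicts with hypothetical keys
        query_id, subject_id, q_start (1-based), q_aligned, midline, evalue
    protein_lengths: dict mapping protein id -> sequence length
    """
    counts = {pid: [0] * n for pid, n in protein_lengths.items()}
    for hit in hits:
        if hit["evalue"] >= E_VALUE_CUTOFF:
            continue  # not significantly related
        if hit["query_id"] == hit["subject_id"]:
            continue  # each protein is compared with the others, not itself
        pos = hit["q_start"] - 1  # convert to a 0-based index
        for q_char, m_char in zip(hit["q_aligned"], hit["midline"]):
            if q_char == "-":
                continue  # gap in the query: no query residue at this column
            if m_char != " ":
                counts[hit["query_id"]][pos] += 1  # identical or similar match
            pos += 1
    return counts


def summarize(counts):
    """Summary statistics of the kind quoted in the abstract."""
    flat = [c for per_protein in counts.values() for c in per_protein]
    if not flat:
        raise ValueError("no residues to summarize")
    related = [c for c in flat if c > 0]
    return {
        "fraction_unrelated": 1 - len(related) / len(flat),
        "mean_related": sum(related) / len(related) if related else 0.0,
        "median_all": statistics.median(flat),
    }
```

Counting per residue rather than per protein is what lets the different regions of one protein carry different numbers of relationships, which is the point the abstract makes about avoiding sums over whole proteins.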


Item Type: Article
Additional Information: Contributed by Roy J. Britten, February 17, 2005. Freely available online through the PNAS open access option. John Williams (California Institute of Technology) was responsible for much of the computer analysis and construction of PERL programs. Eric H. Davidson’s laboratory at the California Institute of Technology supported this work.
Subject Keywords: domains; protein; relationships
Record Number: CaltechAUTHORS:BRIpnas05
Persistent URL: http://resolver.caltech.edu/CaltechAUTHORS:BRIpnas05
Alternative URL: http://dx.doi.org/10.1073/pnas.0501008102
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 333
Collection: CaltechAUTHORS
Deposited By: Archive Administrator
Deposited On: 02 Jun 2005
Last Modified: 26 Dec 2012 08:39
