A Caltech Library Service

Majority of divergence between closely related DNA samples is due to indels

Britten, Roy J. and Rowen, Lee and Williams, John and Cameron, R. Andrew (2003) Majority of divergence between closely related DNA samples is due to indels. Proceedings of the National Academy of Sciences of the United States of America, 100 (8). pp. 4661-4665. ISSN 0027-8424. PMCID PMC153612.

It was recently shown that indels are responsible for more than twice as many unmatched nucleotides as are base substitutions between samples of chimpanzee and human DNA. A larger sample has now been examined and the result is similar. The number of indels is approximately 1/12th of the number of base substitutions and the average length of the indels is 36 nt, including indels up to 10 kb. The ratio (R_u) of unpaired nucleotides attributable to indels to those attributable to substitutions is 3.0 for this 2 million-nt chimp DNA sample compared with human. There is similar evidence of a large value of R_u for sea urchins from the polymorphism of a sample of Strongylocentrotus purpuratus DNA (R_u = 3-4). Other work indicates that similarly, per nucleotide affected, large differences are seen for indels in the DNA polymorphism of the plant Arabidopsis thaliana (R_u = 51). For the insect Drosophila melanogaster a high value of R_u (4.5) has been determined. For the nematode Caenorhabditis elegans the polymorphism data are incomplete but high values of R_u are likely. Comparison of two strains of Escherichia coli O157:H7 shows a preponderance of indels. Because these six examples are from very distant systematic groups, the implication is that in general, for alignments of closely related DNA, indels are responsible for many more unmatched nucleotides than are base substitutions. Human genetic evidence suggests that indels are a major source of gene defects, indicating that indels are a significant source of evolutionary change.
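
The chimpanzee-human figures quoted in the abstract can be checked with simple arithmetic. The sketch below is only an illustration, not the authors' procedure: it assumes each base substitution leaves exactly one unmatched nucleotide and each indel leaves, on average, its mean length in unmatched nucleotides. The function name `unpaired_ratio` and its parameter names are hypothetical.

```python
def unpaired_ratio(indels_per_substitution: float, mean_indel_length_nt: float) -> float:
    """Ratio of unmatched nucleotides due to indels to those due to substitutions.

    Assumption (not stated in the record): one substitution = one unmatched
    nucleotide, so the ratio is (indel count * mean indel length) / substitution count.
    """
    return indels_per_substitution * mean_indel_length_nt


# Values from the abstract: about 1 indel per 12 substitutions, mean indel length 36 nt.
r_u = unpaired_ratio(1 / 12, 36)
print(round(r_u, 1))
```

Under these assumptions the result agrees with the ratio of 3.0 reported for the 2 million-nt chimpanzee sample.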

Item Type:Article
ORCID:Cameron, R. Andrew (0000-0003-3947-6041)
Additional Information:Copyright © 2003 by the National Academy of Sciences. Contributed by Roy J. Britten, February 15, 2003. Published online before print April 2, 2003, 10.1073/pnas.0330964100. We thank Dmitri Petrov for help with the Drosophila data, Hugh M. Robertson for help with the C. elegans measurements, Tanya Berardini for making available Arabidopsis data, and Tetsuya Hayashi for help with the E. coli comparisons. R.A.C. was supported by Grant IBN-9982875 from the National Science Foundation Developmental Mechanisms Program.
Funding Agency:National Science Foundation (Grant Number IBN-9982875)
Issue or Number:8
PubMed Central ID:PMC153612
Record Number:CaltechAUTHORS:BRIpnas03
Persistent URL:
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:1338
Deposited By: Tony Diaz
Deposited On:11 Jan 2006
Last Modified:02 Oct 2019 22:42