
From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies

Benos, Panayiotis and Pachter, Lior (2001) From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies. Genome Research, 11 (5). pp. 710-730. ISSN 1088-9051. PMCID PMC311117. doi:10.1101/gr.173801.

PDF - Published Version
Creative Commons Attribution Non-commercial.




We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different transposable elements. We show that an interval between bands 3A2 and 3C2, believed in the 1970s to show a correlation between the number of bands on the polytene chromosomes and the 20 genes identified by conventional genetics, is predicted to contain 45 genes from its DNA sequence. We have determined the insertion sites of P-elements from 111 mutant lines, about half of which are in a position likely to affect the expression of novel predicted genes, thus representing a resource for subsequent functional genomic analysis. We compare the European Drosophila Genome Project sequence with the corresponding part of the independently assembled and annotated Joint Sequence determined through “shotgun” sequencing. Discounting differences in the distribution of known transposable elements between the strains sequenced in the two projects, we detected three major sequence differences, two of which are probably explained by errors in assembly; the origin of the third major difference is unclear. In addition there are eight sequence gaps within the Joint Sequence. At least six of these eight gaps are likely to be sites of transposable elements; the other two are complex. Of the 275 genes in common to both projects, 60% are identical within 1% of their predicted amino-acid sequence and 31% show minor differences such as in choice of translation initiation or termination codons; the remaining 9% show major differences in interpretation.

Item Type:Article
ORCID:Pachter, Lior 0000-0002-9164-6231
Additional Information:© 2001 Cold Spring Harbor Laboratory Press. The Authors acknowledge that six months after the full-issue publication date, the Article will be distributed under a Creative Commons CC-BY-NC License (Attribution-NonCommercial 4.0 International License). Received December 10, 2000. Accepted February 16, 2001. This work was supported by a Contract from the European Commission under Framework Programme 4 (coordinator D.M. Glover), by a grant from the Medical Research Council, London to M.A. and D.M.G., by a grant from the Dirección General de Investigacion Científica y Técnica to J.M., by a grant from the Hellenic Secretariat General for Science and Technology to K.L., and by a grant from the Deutsche Humangenomprojekt to H.J. R.D.C.S. was supported by a Wellcome Trust Senior Fellowship. We thank many colleagues for their help. We are grateful to Gerry Rubin and his colleagues at the BDGP, particularly Suzanna Lewis, Sima Misra, and Susan Celniker (and, of course, Gerry himself) for the exchange of materials, information, and ideas over the years. Greg Helt of the BDGP was very helpful in providing us with the initial Drosophila gene training set. We also thank Rolf Apweiler and his SWISS-PROT/TrEMBL team at the EBI, particularly Alexander Kanapin and Wolfgang Fleischmann for their help with the protein motif analysis. We also thank Rolf Apweiler, head of that team, for his blessings. Richard Durbin's group at the Sanger Center have been extraordinarily helpful; in particular, Daniel Lawson gave tremendous help with ACeDB despite having to bend double at times. Kim Rutherford of the Pathogen Sequencing Unit at the Sanger Center provided the software to draw Figure 1; without this we may have been lost. 
We thank Brian Oliver of the NIH, Bethesda for a pre-print copy of his paper on testis ESTs, Leyla Bayraktaroglou (FlyBase group, Harvard) for her help in the curation of reference sequence data sets, and David Judge of the Cambridge School of Biological Sciences Biocomputing Unit for help. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact. [All of the sequences analyzed in this paper have been deposited in the EMBL-Bank database under the following accession nos.: AL009146, AL009147, AL009171, AL009188–AL009196, AL021067, AL021086, AL021106–AL021108, AL021726, AL021728, AL022017, AL022018, AL022139, AL023873, AL023874, AL023893, AL024453, AL024455–AL024457, AL024485, AL030993, AL030994, AL031024–AL031028, AL031128, AL031173, AL031366, AL031367, AL031581–AL031583, AL031640, AL031765, AL031883, AL031884, AL034388, AL034544, AL035104, AL035105, AL035207, AL035245, AL035331, AL035632, AL049535, AL050231, AL050232, AL109630, AL121804, AL121806, AL132651, AL132792, AL132797, AL133503–AL133506, AL138678, AL138971, AL138972, and Z98269. A single file (FASTA format) of the 2.6-Mb contig is available from] Supplementary data are available from
Funding Agency                                            Grant Number
European Commission                                       UNSPECIFIED
Medical Research Council (UK)                             UNSPECIFIED
Dirección General de Investigacion Científica y Técnica   UNSPECIFIED
Hellenic Secretariat General for Science and Technology   UNSPECIFIED
Deutsche Humangenomprojekt                                UNSPECIFIED
Issue or Number:5
PubMed Central ID:PMC311117
Record Number:CaltechAUTHORS:20170309-100309238
Official Citation:From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies Panayiotis V. Benos, Melanie K. Gatt, Lee Murphy, David Harris, Bart Barrell, Concepcion Ferraz, Sophie Vidal, Christine Brun, Jacques Demaille, Edouard Cadieu, Stephane Dreano, Stéphanie Gloux, Valerie Lelaure, Stephanie Mottier, Francis Galibert, Dana Borkova, Belen Miñana, Fotis C. Kafatos, Slava Bolshakov, Inga Sidén-Kiamos, George Papagiannakis, Lefteris Spanos, Christos Louis, Encarnación Madueño, Beatriz de Pablos, Juan Modolell, Annette Peter, Petra Schöttler, Meike Werner, Fotini Mourkioti, Nicole Beinert, Gordon Dowe, Ulrich Schäfer, Herbert Jäckle, Alain Bucheton, Debbie Callister, Lorna Campbell, Nadine S. Henderson, Paul J. McMillan, Cathy Salles, Evelyn Tait, Phillipe Valenti, Robert D.C. Saunders, Alain Billaud, Lior Pachter, David M. Glover, and Michael Ashburner Genome Res. May 1, 2001 11: 710-730; doi:10.1101/gr.173801
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:74973
Deposited By: George Porter
Deposited On:10 Mar 2017 03:35
Last Modified:15 Nov 2021 16:29
