Genome sequence of the hyperthermophilic crenarchaeon Pyrobaculum aerophilum
We determined and annotated the complete 2.2-megabase genome sequence of Pyrobaculum aerophilum, a facultatively aerobic, nitrate-reducing, hyperthermophilic crenarchaeon (optimal growth temperature 100 °C). The sequence offers clues that may explain the organism's surprising intolerance to sulfur, which may aid in the development of methods for genetic studies of the organism. It also revealed many features worthy of further genetic study. Whole-genome computational analysis confirmed experiments showing that P. aerophilum (and perhaps all crenarchaea) lacks 5' untranslated regions in its mRNAs and thus appears not to use a ribosome-binding site (Shine-Dalgarno)-based mechanism for translation initiation at the 5' end of transcripts. Inspection of the lengths and distribution of mononucleotide repeat-tracts showed, for instance, that repeat-tracts of Gs (or Cs) are highly unstable, a pattern expected for an organism deficient in mismatch repair. This result, together with an independent study on mutation rates, suggests a "mutator" phenotype.
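The repeat-tract inspection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `homopolymer_tracts`, the `min_len` threshold, and the toy sequence are assumptions made for illustration only. The sketch tabulates how many maximal single-base runs of each length occur in a DNA string, which is the kind of length distribution one would compare between bases or between genomes.

```python
import re
from collections import Counter


def homopolymer_tracts(seq, min_len=3):
    """Count maximal mononucleotide runs in a DNA string.

    Returns a Counter mapping (base, run_length) -> number of tracts.
    Hypothetical helper for illustration; min_len is an arbitrary cutoff.
    """
    counts = Counter()
    # ([ACGT])\1* matches one base followed by any further copies of it,
    # so each non-overlapping match is a maximal run of a single base.
    for match in re.finditer(r"([ACGT])\1*", seq.upper()):
        run = match.group(0)
        if len(run) >= min_len:
            counts[(run[0], len(run))] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence, not real P. aerophilum data.
    toy = "ATGGGGGCATTTTACCCCCCCGAAAT"
    for (base, length), n in sorted(homopolymer_tracts(toy).items()):
        print(base, length, n)
```

On a real genome, the string would be read from the deposited sequence record. Judging tract instability would additionally require comparing tract lengths across independent sequence reads or clones, which this sketch does not attempt.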
Additional Information: Copyright © 2002 by the National Academy of Sciences. Contributed by Melvin I. Simon, November 30, 2001. Published online before print January 15, 2002, 10.1073/pnas.241636498. We thank Mark Borodovsky and John Besemer for generous help with gene prediction, including development of the GENEMARKS program to take advantage of P. aerophilum's unusually strong upstream signals for start site prediction. We thank Terry Gaasterland for help with implementation, maintenance, and use of the MAGPIE system and for useful discussions. We thank Todd Lowe for manual tRNA and sRNA predictions and discussions. We thank Fredrick Blattner and DNAstar (Madison, WI) for use of the GENVISION software (see the supporting information on the PNAS web site, www.pnas.org). J.H.M. was supported by grants from the U.S. Office of Naval Research (ONR) and the National Institutes of Health (NIH) (GM57917). K.O.S. was supported by grants from the Deutsche Forschungsgemeinschaft and the Fonds der Chemischen Industrie. S.T.F. received support from the National Aeronautics and Space Administration through the Astrobiology Institute and from grants to J.H.M. from the ONR and NIH (GM57917). The bulk of the raw sequence was obtained at the California Institute of Technology sequencing facility and was funded by grants to M.S. from the Genome Project of the U.S. Department of Energy. Data deposition: The sequence reported in this paper has been deposited in the GenBank database (accession no. AE009441). The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.