Total synthesis of Escherichia coli with a recoded genome
Abstract
Nature uses 64 codons to encode the synthesis of proteins from the genome, and chooses 1 sense codon—out of up to 6 synonyms—to encode each amino acid. Synonymous codon choice has diverse and important roles, and many synonymous substitutions are detrimental. Here we demonstrate that the number of codons used to encode the canonical amino acids can be reduced, through the genome-wide substitution of target codons by defined synonyms. We create a variant of Escherichia coli with a four-megabase synthetic genome through a high-fidelity convergent total synthesis. Our synthetic genome implements a defined recoding and refactoring scheme—with simple corrections at just seven positions—to replace every known occurrence of two sense codons and a stop codon in the genome. Thus, we recode 18,214 codons to create an organism with a 61-codon genome; this organism uses 59 codons to encode the 20 amino acids, and enables the deletion of a previously essential transfer RNA.
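The codon arithmetic in the abstract can be illustrated with a minimal sketch. The abstract does not name the target codons, so the mapping below (TCG to AGC, TCA to AGT, TAG to TAA) is an assumption recalled from the published paper. The function names are hypothetical and are not taken from the authors' genome_recoding scripts.

```python
from itertools import product

# Assumed recoding scheme: these codon pairs are recalled from the published
# paper and are NOT stated in this record's abstract.
RECODING_SCHEME = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard bacterial stop codons


def recode_orf(orf, scheme=RECODING_SCHEME):
    """Replace target codons with their synonyms, reading the ORF in frame."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = (orf[i:i + 3] for i in range(0, len(orf), 3))
    return "".join(scheme.get(codon, codon) for codon in codons)


def remaining_codons(scheme=RECODING_SCHEME):
    """Return (total, sense) codon counts left once the targets are removed."""
    all_codons = {"".join(p) for p in product("ACGT", repeat=3)}
    kept = all_codons - set(scheme)
    sense = kept - STOP_CODONS
    return len(kept), len(sense)


print(recode_orf("ATGTCGTCATAG"))  # ATGAGCAGTTAA
print(remaining_codons())          # (61, 59)
```

Because the substitution works codon by codon, a target triplet that appears out of frame (for example the TCG inside ATC GAA) is left alone. The sketch treats each open reading frame in isolation; overlapping genes would need the refactoring step that the abstract mentions, which is not attempted here.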
Additional Information
© 2019 Springer Nature Publishing AG. Received 18 December 2018; Accepted 09 April 2019; Published 15 May 2019.

Data availability: The sequences and genome design details used in this study are available in the Supplementary Data.
- Supplementary Data 1 provides the GenBank file of the E. coli MDS42 genome (NCBI accession number AP012306.1)
- Supplementary Data 2 provides the GenBank file of the designed synthetic E. coli genome with codon replacements and refactorings
- Supplementary Data 3 provides the table of target codons
- Supplementary Data 4 provides the table of overlaps and refactoring
- Supplementary Data 5 provides the table of 10-kb stretches
- Supplementary Data 6 provides the GenBank file of the BAC sacB-cat-rpsL
- Supplementary Data 7 provides the GenBank file of BAC-rpsL-kanR-sacB
- Supplementary Data 8 provides the GenBank file of the BAC rpsL-kanR-pheS∗-HygR
- Supplementary Data 9 provides the table of BAC construction
- Supplementary Data 10 provides the table of BAC assembly
- Supplementary Data 11 provides the table of REXER experiments
- Supplementary Data 12 provides the GenBank file of spacer plasmids without trans-activating CRISPR RNA (tracrRNA) and annotation for linear spacers
- Supplementary Data 13 provides the GenBank file of spacer plasmids with tracrRNA and annotation for linear spacers
- Supplementary Data 14 provides the table of oligonucleotides used for recoding fixing experiments
- Supplementary Data 15 provides the GenBank file of the gentamycin-resistance oriT cassette
- Supplementary Data 16 provides the oligonucleotide primers used for conjugation
- Supplementary Data 17 provides the GenBank file of the pJF146 F′ plasmid that does not self-transfer
- Supplementary Data 18 provides the GenBank file of the fully recoded genome of Syn61, verified by next-generation sequencing
- Supplementary Data 19 provides the table of design optimizations and non-programmed mutations
- Supplementary Data 20 provides a list of the proteins identified by tandem mass spectrometry
- Supplementary Data 21 provides a list of the primers used for deletion experiments

All other datasets generated and/or analysed in this study are available from the corresponding author upon reasonable request. All materials (Supplementary Data 9, 12, 13, 17, 18) from this study are available from the corresponding author upon reasonable request.

Code availability: Code used for genome design is available at https://github.com/TiongSun/genome_recoding; for sequencing at https://github.com/TiongSun/iSeq; and for generating recoding landscapes at https://github.com/TiongSun/recoding_landscape.

This work was supported by the Medical Research Council (MRC), UK (MC_U105181009 and MC_UP_A024_1008), the Medical Research Foundation (MRF-109-0003-RG-CHIN/C0741) and an ERC Advanced Grant SGCR, all to J.W.C., and by the Lundbeck Foundation (R232-2016-3474) to J.F. J.W.C. thanks H. Pelham for supporting this project. We thank M. Skehel and the MRC-LMB mass spectrometry service for label-free-quantification-based proteomics; N. Barry for microscopy; A. Crisp for helping with Python scripts; and C. J. K. Wan, S. H. Kim, L. Dunsmore, N. Huguenin-Dezot and S. D. Fried for their support in experimental work.

Reviewer information: Nature thanks Abhishek Chatterjee, Tom Ellis and the other anonymous reviewer(s) for their contribution to the peer review of this work.

These authors contributed equally: Julius Fredens, Kaihang Wang, Daniel de la Torre, Louise F. H. Funke, Wesley E. Robertson.

Author Contributions: K.W. and T.C. designed the target genome sequence. T.C. generated scripts for data analysis. All authors, except T.S.E., contributed to assembly of sections. J.F., L.F.H.F., K.W. and A.G.L. led the fixing of deleterious synthetic sequences. J.F., D.d.l.T., L.F.H.F., W.E.R. and Y.C. led the assembly of sections into Syn61 and characterized the strain with the assistance of T.S.E. J.W.C. supervised the project and wrote the paper with the other authors.
The authors declare no competing interests.
Attached Files
Supplemental Material - 41586_2019_1192_Fig10_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig11_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig12_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig13_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig14_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig5_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig6_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig7_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig8_ESM.jpg
Supplemental Material - 41586_2019_1192_Fig9_ESM.jpg
Supplemental Material - 41586_2019_1192_MOESM1_ESM.pdf
Supplemental Material - 41586_2019_1192_MOESM2_ESM.pdf
Supplemental Material - 41586_2019_1192_MOESM3_ESM.zip
Files
Checksum | Size |
---|---|
md5:06e3b011821607ecc54f8d597a6968b3 | 120.9 kB |
md5:1b04106dbb30a1297edad73a68bbe35d | 111.3 kB |
md5:a1d36a47d315e91d3f6bafbd2a009c23 | 91.9 kB |
md5:0fe6cfa987dc795cf551937823b785af | 70.4 kB |
md5:cd40f3ab1e7f424a9b65a0081c2a9457 | 4.9 MB |
md5:bdf6c199df56d1082e92e06a29f7883f | 96.0 kB |
md5:a102b9cea24be89291d0ff4fa012ab17 | 67.0 kB |
md5:61f445572e982463cee3dce93637e9b0 | 81.8 kB |
md5:fae6d8c77552b507f3434d28f1bbd6b5 | 49.4 kB |
md5:58411cd5e9992edbc923ea02f9156df9 | 9.5 MB |
md5:0e7ac60b784e6860692e10c9f80e2b0a | 76.4 kB |
md5:388fae1def5ae25ff61c664d0dcdbe4d | 79.1 kB |
md5:c34596d8bb5e2e7c3dcb27c7e5c02241 | 60.1 kB |
Additional details
- Eprint ID
- 95595
- DOI
- 10.1038/s41586-019-1192-5
- Resolver ID
- CaltechAUTHORS:20190520-094828547
- Medical Research Council (UK)
- MC_U105181009
- Medical Research Council (UK)
- MC_UP_A024_1008
- Medical Research Foundation
- MRF-109-0003-RG-CHIN/C0741
- European Research Council (ERC)
- SGCR
- Lundbeck Foundation
- R232-2016-3474
- Created
- 2019-05-20 (created from EPrint's datestamp field)
- Updated
- 2021-11-16 (created from EPrint's last_modified field)