Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering
Abstract
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency—or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity followed by directed evolution to improve its “fitness” for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
Copyright and License
© 2024 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY 4.0.
Acknowledgement
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Award Number DE-SC0022218. This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof. This work was also supported by an Amgen Chem-Bio-Engineering Award (CBEA) and by the NSF Division of Chemical, Bioengineering, Environmental and Transport Systems (CBET 1937902). J.Y. and F.Z.L. are partially supported by National Science Foundation Graduate Research Fellowships. The authors thank Kadina Johnston and Sabine Brinkmann-Chen for helpful discussions and critical reading of the manuscript.
Conflict of Interest
Files
Name | Size |
---|---|
md5:197e224c05df0959d4c5156350f66b70 | 3.5 MB |
Additional details
- ISSN: 2374-7951
- PMCID: PMC10906252
- United States Department of Energy: DE-SC0022218
- Amgen (United States)
- National Science Foundation: CBET-1937902
- National Science Foundation: NSF Graduate Research Fellowship
- Caltech groups: Division of Biology and Biological Engineering