State-specific protein–ligand complex structure prediction with a multiscale deep generative model
Abstract
The binding complexes formed by proteins and small molecule ligands are ubiquitous and critical to life. Despite recent advancements in protein structure prediction, existing algorithms are so far unable to systematically predict the binding ligand structures along with their regulatory effects on protein folding. To address this discrepancy, we present NeuralPLexer, a computational approach that can directly predict protein–ligand complex structures solely using protein sequence and ligand molecular graph inputs. NeuralPLexer adopts a deep generative model to sample the three-dimensional structures of the binding complex and their conformational changes at an atomistic resolution. The model is based on a diffusion process that incorporates essential biophysical constraints and a multiscale geometric deep learning system to iteratively sample residue-level contact maps and all heavy-atom coordinates in a hierarchical manner. NeuralPLexer achieves state-of-the-art performance compared with all existing methods on benchmarks for both protein–ligand blind docking and flexible binding-site structure recovery. Moreover, owing to its specificity in sampling both ligand-free-state and ligand-bound-state ensembles, NeuralPLexer consistently outperforms AlphaFold2 in terms of global protein structure accuracy on both representative structure pairs with large conformational changes and recently determined ligand-binding proteins. NeuralPLexer predictions align with structure determination experiments for important targets in enzyme engineering and drug discovery, suggesting its potential for accelerating the design of functional proteins and small molecules at the proteome scale.
Copyright and License
© The Author(s), under exclusive licence to Springer Nature Limited 2024.
Acknowledgement
Z.Q. acknowledges graduate research funding from Caltech and partial support from the Amazon-Caltech AI4Science fellowship. T.F.M. acknowledges partial support from the Caltech DeLogi fund, and A.A. acknowledges support from a Caltech Bren professorship. We thank M. Welborn, F. R. Manby, C. Zhang and V. Bhethanabotla for discussions on the work and for comments on the manuscript. We thank A. Meller and J. Borowsky for sharing the PocketMiner dataset.
Contributions
Z.Q., W.N., A.V., T.F.M. and A.A. conceived and designed the experiments. Z.Q. performed the experiments. Z.Q., W.N., A.V., T.F.M. and A.A. analysed the data. Z.Q. contributed analysis tools. Z.Q. and A.A. wrote the paper.
Data Availability
All datasets and predictions used to generate the reported results are available on Code Ocean (ref. 86) and also on Zenodo at https://doi.org/10.5281/zenodo.10373581.
Code Availability
The code, scripts and interactive data analysis notebooks are available on Code Ocean (ref. 86) and also on GitHub at https://github.com/zrqiao/NeuralPLexer.
Conflict of Interest
Z.Q. and T.F.M. are currently employees of Iambic Therapeutics or its affiliates. A provisional patent application related to this work has been filed (US Patent App. provisional 63/496,899). The remaining authors declare no competing interests.
Files
File (MD5 checksum) | Size |
---|---|
md5:f9e0f3203ad592f517c101a912c194c0 | 261.2 kB |
md5:54091b9cbbd6704a070150e807454853 | 1.5 MB |
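The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch of one way to do that check in Python; the helper names `md5_of_file` and `matches_listed_md5` and the file path in the usage comment are hypothetical, since the record does not show the file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:f9e0...'."""
    # Drop an optional 'md5:' prefix and normalise case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (hypothetical local file name, not taken from the record):
# ok = matches_listed_md5("downloaded_file.pdf",
#                         "md5:f9e0f3203ad592f517c101a912c194c0")
```

A result of `False` would suggest a corrupted or incomplete download, or that the file was compared against the wrong entry in the list.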
Additional details
- Amazon–Caltech AI4Science fellowship
- California Institute of Technology
- DeLogi Fund
- California Institute of Technology
- Bren Professor of Computing and Mathematical Sciences