State-specific protein–ligand complex structure prediction with a multiscale deep generative model

Qiao, Zhuoran; Nie, Weili; Vahdat, Arash; Miller, Thomas F.; Anandkumar, Animashree

Published February 12, 2024 | Version v2

Journal Article | Open Access

1. California Institute of Technology

The binding complexes formed by proteins and small molecule ligands are ubiquitous and critical to life. Despite recent advancements in protein structure prediction, existing algorithms are so far unable to systematically predict the binding ligand structures along with their regulatory effects on protein folding. To address this discrepancy, we present NeuralPLexer, a computational approach that can directly predict protein–ligand complex structures solely using protein sequence and ligand molecular graph inputs. NeuralPLexer adopts a deep generative model to sample the three-dimensional structures of the binding complex and their conformational changes at an atomistic resolution. The model is based on a diffusion process that incorporates essential biophysical constraints and a multiscale geometric deep learning system to iteratively sample residue-level contact maps and all heavy-atom coordinates in a hierarchical manner. NeuralPLexer achieves state-of-the-art performance compared with all existing methods on benchmarks for both protein–ligand blind docking and flexible binding-site structure recovery. Moreover, owing to its specificity in sampling both ligand-free-state and ligand-bound-state ensembles, NeuralPLexer consistently outperforms AlphaFold2 in terms of global protein structure accuracy on both representative structure pairs with large conformational changes and recently determined ligand-binding proteins. NeuralPLexer predictions align with structure determination experiments for important targets in enzyme engineering and drug discovery, suggesting its potential for accelerating the design of functional proteins and small molecules at the proteome scale.

Acknowledgement

Z.Q. acknowledges graduate research funding from Caltech and partial support from the Amazon-Caltech AI4Science fellowship. T.F.M. acknowledges partial support from the Caltech DeLogi fund, and A.A. acknowledges support from a Caltech Bren professorship. We thank M. Welborn, F. R. Manby, C. Zhang and V. Bhethanabotla for discussions on the work and for comments on the manuscript. We thank A. Meller and J. Borowsky for sharing the PocketMiner dataset.

Contributions

Z.Q., W.N., A.V., T.F.M. and A.A. conceived and designed the experiments. Z.Q. performed the experiments. Z.Q., W.N., A.V., T.F.M. and A.A. analysed the data. Z.Q. contributed analysis tools. Z.Q. and A.A. wrote the paper.

Data Availability

All datasets and predictions used to generate the reported results are available on Code Ocean⁸⁶ and also on Zenodo at https://doi.org/10.5281/zenodo.10373581.

Code Availability

The code, scripts and interactive data analysis notebooks are available on Code Ocean⁸⁶ and also on GitHub at https://github.com/zrqiao/NeuralPLexer.

Conflict of Interest

Z.Q. and T.F.M. are currently employees of Iambic Therapeutics or its affiliates. A provisional patent application related to this work has been filed (US Patent App. provisional 63/496,899). The remaining authors declare no competing interests.

Files

Files (1.7 MB)

Name	MD5	Size
42256_2024_792_Fig6_ESM.jpg	f9e0f3203ad592f517c101a912c194c0	261.2 kB
42256_2024_792_MOESM1_ESM.pdf	54091b9cbbd6704a070150e807454853	1.5 MB

Additional details

Identifiers

URL: https://rdcu.be/dyED1
DOI: 10.1038/s42256-024-00792-z

Funding

Amazon–Caltech AI4Science fellowship
California Institute of Technology: DeLogi Fund
California Institute of Technology: Bren Professor of Computing and Mathematical Sciences
