Multi-Label Classification Models for the Prediction of Cross-Coupling Reaction Conditions

Maser, Michael and Cui, Alexander and Ryou, Serim and DeLano, Travis J. and Yue, Yisong and Reisman, Sarah (2020) Multi-Label Classification Models for the Prediction of Cross-Coupling Reaction Conditions. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20201015-152733539

PDF - Submitted Version (5Mb)
Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Machine-learned ranking models have been developed for the prediction of substrate-specific cross-coupling reaction conditions. Datasets of published reactions were curated for Suzuki, Negishi, and C–N couplings, as well as Pauson–Khand reactions. String, descriptor, and graph encodings were tested as input representations, and models were trained to predict the set of conditions used in a reaction as a binary vector. Unique reagent dictionaries categorized by expert-crafted reaction roles were constructed for each dataset, leading to context-aware predictions. We find that relational graph convolutional networks and gradient-boosting machines are very effective for this learning task, and we disclose a novel reaction-level graph-attention operation in the top-performing model.
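
As a rough illustration of the binary-vector encoding described in the abstract, the sketch below maps a set of reaction conditions onto a fixed-length 0/1 label vector using a reagent dictionary grouped by reaction role. The dictionary contents, role names, and function names are hypothetical placeholders chosen for this example; they are not the dictionaries or code used in the paper.

```python
# Hypothetical reagent dictionary grouped by reaction role.
# Illustrative only: not the dictionaries constructed in the paper.
REAGENT_DICTIONARY = {
    "metal": ["Pd(PPh3)4", "Pd(OAc)2"],
    "ligand": ["PPh3", "SPhos"],
    "base": ["K2CO3", "Cs2CO3"],
    "solvent": ["THF", "dioxane", "water"],
}

# Flatten the dictionary into an ordered list of (role, reagent) labels,
# so that every label owns one fixed position in the output vector.
LABELS = [
    (role, name)
    for role, names in REAGENT_DICTIONARY.items()
    for name in names
]
INDEX = {label: i for i, label in enumerate(LABELS)}


def encode_conditions(conditions):
    """Turn {role: [reagents]} into a 0/1 vector over LABELS."""
    vector = [0] * len(LABELS)
    for role, names in conditions.items():
        for name in names:
            vector[INDEX[(role, name)]] = 1
    return vector


def decode_conditions(vector):
    """Turn a 0/1 vector back into {role: [reagents]}."""
    decoded = {}
    for (role, name), bit in zip(LABELS, vector):
        if bit:
            decoded.setdefault(role, []).append(name)
    return decoded


# A made-up set of conditions for a single reaction.
example = {
    "metal": ["Pd(OAc)2"],
    "ligand": ["SPhos"],
    "base": ["K2CO3"],
    "solvent": ["dioxane", "water"],
}
target = encode_conditions(example)
print(target)
print(decode_conditions(target))
```

Under this framing, a multi-label classifier would be trained to output one such vector per reaction, and grouping labels by role makes it possible to read a prediction back as one or more reagents per role.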


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
https://doi.org/10.26434/chemrxiv.13087769.v1 | DOI | Discussion Paper
ORCID:
Author | ORCID
DeLano, Travis J. | 0000-0002-2052-611X
Yue, Yisong | 0000-0001-9127-1989
Reisman, Sarah | 0000-0001-8244-9300
Additional Information: © 2020 The Author(s). LICENCE: CC BY-NC-ND 4.0. 14.10.2020 - Submission date. 15.10.2020 - First online date, Posted date. We thank Prof. Pietro Perona for mentorship and helpful project discussions, and Chase Blagden for help structuring the GBM experiments. Fellowship support was provided by the NSF (M.R.M., T.J.D.; Grant No. DGE-1144469). S.E.R. is a Heritage Medical Research Institute Investigator. Financial support from Research Corporation is warmly acknowledged.
Group: Heritage Medical Research Institute
Funders:
Funding Agency | Grant Number
NSF Graduate Research Fellowship | DGE-1144469
Heritage Medical Research Institute | UNSPECIFIED
Research Corporation | UNSPECIFIED
Subject Keywords: machine learning; graph neural network; graph attention; gradient-boosting machines; reaction condition prediction; cross-coupling; predictive modeling; molecular machine learning
Record Number: CaltechAUTHORS:20201015-152733539
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20201015-152733539
Official Citation: Maser, Michael; Cui, Alexander; Ryou, Serim; DeLano, Travis; Yue, Yisong; Reisman, Sarah (2020): Multi-Label Classification Models for the Prediction of Cross-Coupling Reaction Conditions. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.13087769.v1
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 106094
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 16 Oct 2020 16:22
Last Modified: 11 Nov 2020 00:23
