CaltechAUTHORS
  A Caltech Library Service

Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization

Bedbrook, Claire N. and Yang, Kevin K. and Rice, Austin J. and Gradinaru, Viviana and Arnold, Frances H. (2017) Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization. PLOS Computational Biology, 13 (10). Art. No. e1005786. ISSN 1553-7358. PMCID PMC5695628. https://resolver.caltech.edu/CaltechAUTHORS:20171030-140148566

Files:
PDF (5MB) - Published Version. Creative Commons Attribution.
S1 Data, MS Excel (53kB): Localization and expression characterization of ChR chimeras predicted by the models - Supplemental Material. Creative Commons Attribution.
S1 Fig., TIFF image (1MB): Chimera sequences in training set and their expression, localization, and localization efficiencies - Supplemental Material. Creative Commons Attribution.
S2 Fig., TIFF image (1MB): Chimera expression and localization cannot be predicted from simple rules - Supplemental Material. Creative Commons Attribution No Derivatives.
S3 Fig., TIFF image (544kB): GP binary classification model for localization efficiency - Supplemental Material. Creative Commons Attribution.
S4 Fig., TIFF image (1MB): Chimera block identities for exploration, verification, and optimization sets - Supplemental Material. Creative Commons Attribution.
S5 Fig., TIFF image (989kB): ROC curves for GP classification expression, localization, and localization efficiency models - Supplemental Material. Creative Commons Attribution.
S6 Fig., TIFF image (442kB): Comparison of measured expression and localization efficiency for each data set - Supplemental Material. Creative Commons Attribution.
S7 Fig., TIFF image (1MB): Cell population distributions of expression, localization, and localization efficiency properties for each chimera in the verification and optimization sets compared with parents - Supplemental Material. Creative Commons Attribution.
S8 Fig., TIFF image (223kB): Predictive ability of GP localization models as a function of training set size - Supplemental Material. Creative Commons Attribution.
S9 Fig., TIFF image (4MB): Important features for prediction of ChR localization aligned with chimeras with optimal localization - Supplemental Material. Creative Commons Attribution.
S10 Fig., TIFF image (824kB): GP regression model for ChR expression - Supplemental Material. Creative Commons Attribution.
S11 Fig., TIFF image (3MB): Sequence and structure features important for prediction of ChR expression - Supplemental Material. Creative Commons Attribution.
S12 Fig., TIFF image (2MB): Localization of engineered CbChR1 variant chimera 3c - Supplemental Material. Creative Commons Attribution.


Abstract

There is growing interest in studying and engineering integral membrane proteins (MPs) that play key roles in sensing and regulating cellular response to diverse external signals. An MP must be expressed, correctly inserted and folded in a lipid bilayer, and trafficked to the proper cellular location in order to function. The sequence and structural determinants of these processes are complex and highly constrained. Here we describe a predictive, machine-learning approach that captures this complexity to facilitate successful MP engineering and design. Machine learning on carefully chosen training sequences made by structure-guided SCHEMA recombination has enabled us to accurately predict the rare sequences in a diverse library of channelrhodopsins (ChRs) that express and localize to the plasma membrane of mammalian cells. These light-gated channel proteins of microbial origin are of interest for neuroscience applications, where expression and localization to the plasma membrane are prerequisites for function. We trained Gaussian process (GP) classification and regression models with expression and localization data from 218 ChR chimeras chosen from a 118,098-variant library designed by SCHEMA recombination of three parent ChRs. We use these GP models to identify ChRs that express and localize well and show that our models can elucidate sequence and structure elements important for these processes. We also used the predictive models to convert a naturally occurring ChR incapable of mammalian localization into one that localizes well.
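
To make the GP regression idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes chimeras can be written as strings of parent labels per block, uses a one-hot encoding with a squared-exponential kernel, and fits invented toy scores. The function names, the kernel choice, the block count, and all data values are hypothetical and are not taken from the paper.

```python
import math

PARENTS = "012"  # hypothetical labels for the three parent ChRs


def one_hot(chimera):
    """Encode a block string such as '0120' as a flat one-hot vector."""
    return [1.0 if p == b else 0.0 for b in chimera for p in PARENTS]


def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel on one-hot vectors (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def gp_posterior_mean(train_x, train_y, test_x, noise=0.1):
    """GP regression posterior mean: k_*^T (K + noise^2 I)^{-1} y."""
    n = len(train_x)
    K = [[rbf_kernel(train_x[i], train_x[j]) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, train_y)
    return [sum(rbf_kernel(x, xi) * a for xi, a in zip(train_x, alpha))
            for x in test_x]


# Toy training set: made-up four-block chimeras and made-up "expression" scores.
train_seqs = ["0000", "1111", "2222", "0011", "2200"]
train_y = [1.0, 0.2, 0.5, 0.6, 0.8]
test_seqs = ["0001", "1112"]

train_x = [one_hot(s) for s in train_seqs]
test_x = [one_hot(s) for s in test_seqs]
predictions = gp_posterior_mean(train_x, train_y, test_x)
```

In this sketch an unmeasured chimera receives a predicted score weighted toward the training chimeras it shares the most blocks with, which is the general intuition behind using a GP to rank untested library members.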


Item Type:Article
Related URLs:
URL | URL Type | Description
https://doi.org/10.1371/journal.pcbi.1005786 | DOI | Article
http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005786 | Publisher | Article
https://doi.org/10.1371/journal.pcbi.1005786.s001 | DOI | Data S1
https://doi.org/10.1371/journal.pcbi.1005786.s002 | DOI | Fig. S1
https://doi.org/10.1371/journal.pcbi.1005786.s003 | DOI | Fig. S2
https://doi.org/10.1371/journal.pcbi.1005786.s004 | DOI | Fig. S3
https://doi.org/10.1371/journal.pcbi.1005786.s005 | DOI | Fig. S4
https://doi.org/10.1371/journal.pcbi.1005786.s006 | DOI | Fig. S5
https://doi.org/10.1371/journal.pcbi.1005786.s007 | DOI | Fig. S6
https://doi.org/10.1371/journal.pcbi.1005786.s008 | DOI | Fig. S7
https://doi.org/10.1371/journal.pcbi.1005786.s009 | DOI | Fig. S8
https://doi.org/10.1371/journal.pcbi.1005786.s010 | DOI | Fig. S9
https://doi.org/10.1371/journal.pcbi.1005786.s011 | DOI | Fig. S10
https://doi.org/10.1371/journal.pcbi.1005786.s012 | DOI | Fig. S11
https://doi.org/10.1371/journal.pcbi.1005786.s013 | DOI | Fig. S12
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695628/ | PubMed Central | Article
ORCID:
Author | ORCID
Bedbrook, Claire N. | 0000-0003-3973-598X
Yang, Kevin K. | 0000-0001-9045-6826
Gradinaru, Viviana | 0000-0001-5868-348X
Arnold, Frances H. | 0000-0002-4027-364X
Additional Information:© 2017 Bedbrook et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Received: August 9, 2017; Accepted: September 21, 2017; Published: October 23, 2017.
We thank Twist Bioscience for synthesizing and cloning ChR sequences as part of their α and β manufacturing programs. We thank the Gradinaru and Arnold labs for helpful discussions. We also thank Dr. John Bedbrook for critical reading of the manuscript. Imaging was performed in the Biological Imaging Facility, with the support of the Caltech Beckman Institute and the Arnold and Mabel Beckman Foundation.
Data Availability: All relevant data are either within the paper and its Supporting Information files or published in ref 5.
This work is funded by the National Institute for Mental Health R21MH103824 (VG and FHA) and the Institute for Collaborative Biotechnologies through grant number W911F-09-0001 from the U.S. Army Research Office (FHA). The content is solely the responsibility of the authors and does not necessarily reflect the position or policy of the National Center for Research Resources, the National Institutes of Health, or the Government, and no official endorsement should be inferred. VG is a Heritage Principal Investigator supported by the Heritage Medical Research Institute. CNB and AJR are funded by Ruth L. Kirschstein National Research Service Awards (F31MH102913 and F32GM116319, respectively). KKY is a trainee in the Caltech Biotechnology Leadership Program, and has received financial support from the Donna and Benjamin M. Rosen Bioengineering Center. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Authors declare no competing interests.
Author Contributions: Conceptualization: Claire N. Bedbrook, Kevin K. Yang, Austin J. Rice, Viviana Gradinaru, Frances H. Arnold. Formal analysis: Claire N. Bedbrook, Kevin K. Yang. Methodology: Claire N. Bedbrook, Kevin K. Yang, Austin J. Rice. Project administration: Frances H. Arnold. Software: Claire N. Bedbrook, Kevin K. Yang. Supervision: Viviana Gradinaru, Frances H. Arnold. Visualization: Claire N. Bedbrook, Kevin K. Yang, Austin J. Rice. Writing - original draft: Claire N. Bedbrook, Kevin K. Yang. Writing - review & editing: Claire N. Bedbrook, Kevin K. Yang, Austin J. Rice, Viviana Gradinaru, Frances H. Arnold.
Group:Heritage Medical Research Institute, Rosen Bioengineering Center
Funders:
Funding Agency | Grant Number
NIH | R21MH103824
Army Research Office (ARO) | W911F-09-0001
Heritage Medical Research Institute | UNSPECIFIED
NIH Predoctoral Fellowship | F31MH102913
NIH Predoctoral Fellowship | F32GM116319
Donna and Benjamin M. Rosen Bioengineering Center | UNSPECIFIED
Caltech Beckman Institute | UNSPECIFIED
Arnold and Mabel Beckman Foundation | UNSPECIFIED
Issue or Number:10
PubMed Central ID:PMC5695628
Record Number:CaltechAUTHORS:20171030-140148566
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20171030-140148566
Official Citation:Bedbrook CN, Yang KK, Rice AJ, Gradinaru V, Arnold FH (2017) Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization. PLoS Comput Biol 13(10): e1005786. https://doi.org/10.1371/journal.pcbi.1005786
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:82781
Collection:CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On:30 Oct 2017 21:26
Last Modified:05 Mar 2020 18:18
