Predicting Membrane Protein Expression in Yeast from Sequence-Derived Features

Creators: Schulte, Samuel J.; Saladi, Shyam; Clemons, William M.

Abstract

Despite comprising one-quarter of most organisms' proteomes and serving as the targets of over half of all drugs, integral membrane proteins (IMPs) remain difficult to characterize. Poor expression in heterologous systems often hinders IMP study, and large-scale efforts to express IMPs have proven time-consuming, costly, and capricious. To address this, we recently used quantitative experimental expression studies to train a machine learning model capable of predicting membrane protein expression in Escherichia coli solely from sequence-derived features. Though our linear bacterial model generalizes well to eukaryotic membrane proteins expressed in E. coli, we observe poor prediction for IMPs heterologously expressed in yeast, a host frequently chosen for its greater similarity to higher eukaryotes. Thus, we report a new model capable of predicting IMP expression in Saccharomyces cerevisiae. To avoid overfitting resulting from the limited size of our training dataset, the number of sequence-derived features used to predict expression is reduced from the 89 used for the E. coli model to just eight. Strikingly, in agreement with recent findings in the wet laboratory, the disorder of the C-terminus is identified as the most predictive feature. We additionally incorporate new features, including predicted N- and O-glycosylation and disulfide bond formation, into our algorithm. We are working to verify the model across a wide variety of small- and large-scale expression datasets from the literature. We will share our predictor with the broader community to help accelerate membrane protein biochemical and biophysical study.
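
As a rough illustration of what predicting expression "solely from sequence-derived features" with a linear model can look like, the sketch below computes a few toy features from an amino acid sequence and combines them in a weighted sum. Everything in it is an assumption made for illustration: the feature names, the hydrophobic residue set, the 20-residue C-terminal window, and the function names `sequence_features` and `linear_score` are hypothetical and are not the eight features or the fitted weights of the model described in the abstract.

```python
from typing import Dict

# Hypothetical hydrophobic residue set, chosen only for illustration.
HYDROPHOBIC = set("AILMFVWY")


def sequence_features(seq: str, tail_len: int = 20) -> Dict[str, float]:
    """Compute a few toy sequence-derived features (illustrative only)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    tail = seq[-tail_len:]  # C-terminal window; the length is an assumption
    positive = sum(r in "KR" for r in tail)
    negative = sum(r in "DE" for r in tail)
    return {
        "length": float(len(seq)),
        "hydrophobic_fraction": sum(r in HYDROPHOBIC for r in seq) / len(seq),
        "cterm_net_charge": float(positive - negative),
    }


def linear_score(features: Dict[str, float],
                 weights: Dict[str, float],
                 bias: float = 0.0) -> float:
    """Linear model: bias plus the weighted sum of the selected features.

    Only features named in `weights` contribute, so restricting a model to
    a small feature subset amounts to passing a short weight dictionary.
    """
    return bias + sum(w * features[name] for name, w in weights.items())
```

A call such as `linear_score(sequence_features(seq), weights)` then yields a single score per sequence. In practice the weights would be fitted to measured expression data, and a real C-terminal disorder feature would come from a dedicated disorder predictor rather than from the simple counts used here.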

Additional details

Resource type: Journal Article
Publisher: Biophysical Society
Published in: Biophysical Journal, 112(3), 355a-356a, ISSN: 0006-3495.
Conference: 58th Annual Meeting of the Biophysical Society, San Francisco, CA, 15-19 February 2014