Sequence-based features that are determinant for tail-anchored membrane protein sorting in eukaryotes
The correct targeting and insertion of tail-anchored (TA) integral membrane proteins is critical for cellular homeostasis. TA proteins are defined by a hydrophobic transmembrane domain (TMD) at their C-terminus and are targeted to either the ER or mitochondria. Current understanding derives from experimental measurements of a few TA proteins, and there has been little examination of the TMD features that determine localization. As a result, the localization of many TA proteins is misclassified by the simple heuristic of overall hydrophobicity. Because ER-directed TMDs favor arrangement of hydrophobic residues to one side, we sought to explore the role of geometric hydrophobic properties. By curating TA proteins with experimentally determined localizations and assessing hypotheses for recognition, we verify bioinformatically and experimentally that a hydrophobic face is the most accurate single metric for separating ER-destined from mitochondria-destined yeast TA proteins. A metric focusing on an 11-residue segment of the TMD performs well when classifying human TA proteins. The most inclusive predictor uses hydrophobicity and C-terminal charge in tandem. This work provides context for previous observations and opens the door for more detailed mechanistic experiments to determine the molecular factors driving this recognition.
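The three kinds of sequence features named in the abstract (a hydrophobic face, an 11-residue segment, and C-terminal charge) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is available in the repository cited in the Data Availability Statement. The choice of the Kyte-Doolittle scale, the 100° per residue helical-wheel step, the 120° sector used to define a "face", and all function names are assumptions made here for illustration only.

```python
# Illustrative sketch only: the scale, helix geometry, sector width, and
# function names are assumptions, not the method used in the paper.

# Kyte-Doolittle hydropathy scale (assumed here; the paper may use another).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def best_window_hydrophobicity(tmd, window=11):
    """Highest mean hydropathy over any contiguous `window`-residue segment."""
    if len(tmd) < window:
        raise ValueError("TMD is shorter than the window")
    means = [
        sum(KD[aa] for aa in tmd[i:i + window]) / window
        for i in range(len(tmd) - window + 1)
    ]
    return max(means)


def hydrophobic_face(tmd, sector_deg=120.0, step_deg=100.0):
    """Largest summed hydropathy of residues within one helical-wheel sector.

    Residue i is placed at angle i * step_deg on an idealized alpha-helical
    wheel. Each residue's angle is tried as the start of a sector, and the
    hydropathy of all residues inside that sector is summed.
    """
    angles = [(i * step_deg) % 360.0 for i in range(len(tmd))]
    best = float("-inf")
    for start in angles:
        total = sum(
            KD[aa]
            for aa, angle in zip(tmd, angles)
            if (angle - start) % 360.0 <= sector_deg
        )
        best = max(best, total)
    return best


def c_terminal_charge(tail):
    """Net charge of the segment after the TMD: count(K, R) - count(D, E)."""
    positive = sum(tail.count(aa) for aa in "KR")
    negative = sum(tail.count(aa) for aa in "DE")
    return positive - negative


if __name__ == "__main__":
    # Hypothetical TMD and C-terminal tail, not taken from the paper's data.
    tmd, tail = "LIVALLAGFLVGLALLLI", "KRSK"
    print(best_window_hydrophobicity(tmd), hydrophobic_face(tmd),
          c_terminal_charge(tail))
```

The sketch only computes features; it deliberately sets no classification thresholds, since any cutoff separating ER from mitochondrial TA proteins would have to come from the curated data described in the paper.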
© 2021 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Issue Online: 19 August 2021; Version of Record online: 03 August 2021; Accepted manuscript online: 21 July 2021; Manuscript accepted: 18 July 2021; Manuscript revised: 15 July 2021; Manuscript received: 04 May 2021.
Funding information: National Science Foundation Graduate Research, Grant/Award Number: 1144469; NIH/National Research Service Award Training, Grant/Award Number: 5T32GM07616; National Institutes of Health, Grant/Award Number: R01GM097572.
Peer Review: The peer review history for this article is available at https://publons.com/publon/10.1111/tra.12809.
Data Availability Statement: All code employed is openly available at github.com/clemlab/sgt2a-modeling, with analysis done in Jupyter Lab/Notebooks using Python 3.6 enabled by NumPy, Pandas, scikit-learn, BioPython, bebi103, and Bokeh, as well as in RStudio/R Markdown Notebooks enabled by packages within the Tidyverse ecosystem.
Accepted Version - nihms-1726178.pdf
Supplemental Material - tra12809-sup-0001-supinfo.docx
Supplemental Material - tra12809-sup-0001-tables.xlsx
Supplemental Material - tra12809-sup-0002-figures1.jpg