CaltechAUTHORS
  A Caltech Library Service

MAVE-NN: Quantitative Modeling of Genotype-Phenotype Maps as Information Bottlenecks

Tareen, Ammar and Ireland, William T. and Posfai, Anna and McCandlish, David M. and Kinney, Justin B. (2020) MAVE-NN: Quantitative Modeling of Genotype-Phenotype Maps as Information Bottlenecks. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20200716-073040475

PDF - Submitted Version (Creative Commons Attribution, 2815Kb)
PDF - Supplemental Material (Creative Commons Attribution, 790Kb)

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20200716-073040475

Abstract

Multiplex assays of variant effect (MAVEs) are being rapidly adopted in many areas of biology including gene regulation, protein science, and evolution. However, inferring quantitative models of genotype-phenotype maps from MAVE data remains a challenge. Here we introduce MAVE-NN, a neural-network-based Python package that addresses this problem by conceptualizing genotype-phenotype maps as information bottlenecks. We demonstrate the versatility, performance, and speed of MAVE-NN on a diverse range of published MAVE datasets. MAVE-NN is easy to install and is thoroughly documented at https://mavenn.readthedocs.io.


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
https://doi.org/10.1101/2020.07.14.201475 | DOI | Discussion Paper
https://mavenn.readthedocs.io | Related Item | Data/Code
ORCID:
Author | ORCID
Ireland, William T. | 0000-0003-0971-2904
Kinney, Justin B. | 0000-0003-1897-3778
Additional Information: The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. Posted July 14, 2020.
The authors would like to thank Peter Koo for providing valuable comments. This work was supported by NIH grant 1R35GM133777 (awarded to JBK) and NIH grant 1R35GM133613 (awarded to DMM), an Alfred P. Sloan Research Fellowship (awarded to DMM), a grant from the CSHL/Northwell Health partnership, and funding from the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory.
Availability of data and materials:
• Project: mavenn
• Documentation: mavenn.readthedocs.io
• Programming language: Python
• Installation: pip install mavenn
• License: MIT
• Restrictions on use by non-academics: None
The authors declare that they have no competing interests.
Authors' contributions: JBK, AT, and DMM conceived the project. AT and JBK wrote the software. AT tested the software and released it as a Python package on PyPI. AT, DMM, and JBK wrote the manuscript. WTI wrote a preliminary version of the software. AP performed the gauge fixing analysis. All authors contributed to aspects of the analyses.
Funders:
Funding Agency | Grant Number
NIH | 1R35GM133777
NIH | 1R35GM133613
Alfred P. Sloan Foundation | UNSPECIFIED
Northwell Health | UNSPECIFIED
Cold Spring Harbor Laboratory | UNSPECIFIED
Subject Keywords: MAVE; Neural Networks; Noise Agnostic Regression; Global Epistasis Regression
Record Number: CaltechAUTHORS:20200716-073040475
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20200716-073040475
Official Citation: MAVE-NN: Quantitative Modeling of Genotype-Phenotype Maps as Information Bottlenecks. Ammar Tareen, William T. Ireland, Anna Posfai, David M. McCandlish, Justin B. Kinney. bioRxiv 2020.07.14.201475; doi: https://doi.org/10.1101/2020.07.14.201475
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 104394
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 16 Jul 2020 16:09
Last Modified: 16 Jul 2020 16:09
