
MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect

Tareen, Ammar and Kooshkbaghi, Mahdi and Posfai, Anna and Ireland, William T. and McCandlish, David M. and Kinney, Justin B. (2022) MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. Genome Biology, 23. Art. No. 98. ISSN 1474-760X. doi:10.1186/s13059-022-02661-7.

PDF - Published Version (Creative Commons Attribution)
PDF (January 2, 2022) - Submitted Version (Creative Commons Attribution)
PDF - Supplemental Material (Creative Commons Attribution)
MS Word (Peer review history) - Supplemental Material (Creative Commons Attribution)

Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general strategy for inferring quantitative models of genotype-phenotype maps from MAVE data is lacking. Here we introduce MAVE-NN, a neural-network-based Python package that implements a broadly applicable information-theoretic framework for learning genotype-phenotype maps—including biophysically interpretable models—from MAVE datasets. We demonstrate MAVE-NN in multiple biological contexts, and highlight the ability of our approach to deconvolve mutational effects from otherwise confounding experimental nonlinearities and noise.
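
As a rough illustration of what separating mutational effects from an experimental nonlinearity can mean, the sketch below builds a toy additive genotype-phenotype map and passes its output through a saturating readout. It is not MAVE-NN's API: the function names (`one_hot`, `latent_phenotype`, `measurement`), the DNA alphabet, the sigmoid shape, and the random parameters `theta` are all assumptions made for this example, and the information-theoretic inference described in the abstract is not shown.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed alphabet for this toy example


def one_hot(seq, alphabet=ALPHABET):
    """Return an (L, C) one-hot matrix for a sequence of length L."""
    index = {char: i for i, char in enumerate(alphabet)}
    x = np.zeros((len(seq), len(alphabet)))
    for pos, char in enumerate(seq):
        x[pos, index[char]] = 1.0
    return x


def latent_phenotype(seq, theta_0, theta):
    """Additive map: phi = theta_0 + sum over positions of theta[pos, char]."""
    return theta_0 + float(np.sum(theta * one_hot(seq)))


def measurement(phi, lo=0.0, hi=1.0):
    """Hypothetical saturating nonlinearity from phi to the expected readout."""
    return lo + (hi - lo) / (1.0 + np.exp(-phi))


# Illustrative parameters only; a real analysis would infer them from data.
rng = np.random.default_rng(0)
theta = rng.normal(size=(4, len(ALPHABET)))
theta_0 = -1.0

for seq in ["ACGT", "TCGT", "ACGA", "TCGA"]:
    phi = latent_phenotype(seq, theta_0, theta)
    print(seq, round(phi, 3), round(float(measurement(phi)), 3))
```

In this toy model, mutational effects add up exactly in the latent phenotype but not in the readout, which is why a model fitted directly to the readout can mistake saturation for interactions between mutations.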

Item Type: Article
Related URLs:
URL | URL Type | Description
https://mavenn.readthedocs.io | Related Item | Documentation
(URL missing) | Item | Code
(URL missing) | (type missing) | Paper
Author | ORCID
Kooshkbaghi, Mahdi | 0000-0002-6344-7382
Ireland, William T. | 0000-0003-0971-2904
Kinney, Justin B. | 0000-0003-1897-3778
Alternate Title: MAVE-NN: Quantitative Modeling of Genotype-Phenotype Maps as Information Bottlenecks
Additional Information: © The Author(s) 2022. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Received 16 July 2021; Accepted 24 March 2022; Published 15 April 2022.
Funding: This work was supported by NIH grant R35GM133777 (awarded to JBK), NIH grant R35GM133613 (awarded to DMM), an Alfred P. Sloan Research Fellowship (awarded to DMM), a grant from the CSHL/Northwell Health partnership, and funding from the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory.
Review history: The review history is available as Additional file 2.
Peer review information: Tim Sands was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Contributions: AT, WTI, DMM, and JBK conceived the project. AT and JBK wrote the software with assistance from AP and MK. WTI and JBK wrote a preliminary version of the software. AT, MK, and JBK performed the data analysis. JBK, AT, and DMM wrote the manuscript with contributions from MK and AP. All authors read and approved the final manuscript.
Ethics approval and consent to participate: Not applicable.
Competing interests: The authors declare that they have no competing interests.
Funding Agency | Grant Number
Alfred P. Sloan Foundation | UNSPECIFIED
Northwell Health | UNSPECIFIED
Cold Spring Harbor Laboratory | UNSPECIFIED
Simons Foundation | UNSPECIFIED
Subject Keywords: multiplex assay of variant effect; neural networks; deep mutational scanning; massively parallel reporter assay; global epistasis; mutual information
Record Number: CaltechAUTHORS:20200716-073040475
Official Citation: Tareen, A., Kooshkbaghi, M., Posfai, A. et al. MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. Genome Biol 23, 98 (2022).
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 104394
Deposited By: Tony Diaz
Deposited On: 16 Jul 2020 16:09
Last Modified: 19 Apr 2022 20:11