ProtaBank: A repository for protein design and engineering data

Wang, Connie Y. and Chang, Paul M. and Ary, Marie L. and Allen, Benjamin D. and Chica, Roberto A. and Mayo, Stephen L. and Olafson, Barry D. (2018) ProtaBank: A repository for protein design and engineering data. Protein Science, 27 (6). pp. 1113-1124. ISSN 0961-8368. PMCID PMC5980626; PMC6371625.

PDF - Published Version
Creative Commons Attribution Non-commercial No Derivatives.

PDF - Submitted Version
Creative Commons Attribution Non-commercial No Derivatives.

MS Word - Supplemental Material
Creative Commons Attribution Non-commercial No Derivatives.

PDF - Erratum
Creative Commons Attribution Non-commercial No Derivatives.

We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user‐friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and enabling the exploration of correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at

Item Type:Article
Related URLs:
URLURL TypeDescription CentralArticle Paper CentralCorrection
ORCID:
Wang, Connie Y.: 0000-0003-2971-3971
Ary, Marie L.: 0000-0002-0756-1746
Allen, Benjamin D.: 0000-0001-6914-5572
Mayo, Stephen L.: 0000-0002-9785-5018
Additional Information:© 2018 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made. Issue Online 28 May 2018; Version of Record online: 30 April 2018; Accepted manuscript online: 25 March 2018; Manuscript accepted: 21 March 2018; Manuscript revised: 13 March 2018; Manuscript received: 15 February 2018. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Conflict of Interest: None declared.
Errata:The authors declare the following competing financial interest(s): BDA, SLM and BDO are co‐founders/owners of Protabit LLC, a limited liability company. CYW and PMC are employees of Protabit LLC and have ownership interests in Protabit LLC. MLA is an independent contractor and receives compensation from Protabit LLC. RAC declares no competing financial interests. Protabit LLC owns the rights to the ProtaBank Database and associated tools and technologies. Current address: Protabit LLC, 1010 E Union Street, Suite 110, Pasadena, CA 91106. Data accessibility statement: All data stored in the ProtaBank Database are free and accessible to all users. Data from individual study pages may be downloaded to the user's device in Excel or CSV formats, or be copied to the user's clipboard. Protabit LLC may develop and commercialize certain aspects of the ProtaBank technology, including providing separate databases and tools for commercial entities that have a desire to store proprietary data.
Subject Keywords:protein engineering; protein design; relational database; protein mutants; data resource; protein stability; data sets
Issue or Number:6
PubMed Central ID:PMC5980626; PMC6371625
Record Number:CaltechAUTHORS:20180402-082750705
Official Citation:Wang, C. Y., Chang, P. M., Ary, M. L., Allen, B. D., Chica, R. A., Mayo, S. L. and Olafson, B. D. (2018), ProtaBank: A repository for protein design and engineering data. Protein Science, 27: 1113-1124. doi:10.1002/pro.3406
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:85553
Deposited By: Tony Diaz
Deposited On:02 Apr 2018 16:01
Last Modified:09 Mar 2020 13:19