
ProtaBank: A repository for protein design and engineering data

Wang, Connie Y. and Chang, Paul M. and Ary, Marie L. and Allen, Benjamin D. and Chica, Roberto A. and Mayo, Stephen L. and Olafson, Barry D. (2018) ProtaBank: A repository for protein design and engineering data. Protein Science, 27 (6). pp. 1113-1124. ISSN 0961-8368. PMCID PMC5980626.

PDF - Published Version
Creative Commons Attribution Non-commercial No Derivatives.

PDF - Submitted Version
Creative Commons Attribution Non-commercial No Derivatives.

MS Word - Supplemental Material
Creative Commons Attribution Non-commercial No Derivatives.


We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user‐friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and enabling the exploration of correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at

Item Type: Article
ORCID: Wang, Connie Y. (0000-0003-2971-3971)
Additional Information: © 2018 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made. Issue Online: 28 May 2018; Version of Record online: 30 April 2018; Accepted manuscript online: 25 March 2018; Manuscript accepted: 21 March 2018; Manuscript revised: 13 March 2018; Manuscript received: 15 February 2018. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Conflict of Interest: None declared.
Subject Keywords: protein engineering; protein design; relational database; protein mutants; data resource; protein stability; data sets
PubMed Central ID: PMC5980626
Record Number: CaltechAUTHORS:20180402-082750705
Persistent URL:
Official Citation: Wang, C. Y., Chang, P. M., Ary, M. L., Allen, B. D., Chica, R. A., Mayo, S. L. and Olafson, B. D. (2018), ProtaBank: A repository for protein design and engineering data. Protein Science, 27: 1113-1124. doi:10.1002/pro.3406
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 85553
Deposited By: Tony Diaz
Deposited On: 02 Apr 2018 16:01
Last Modified: 26 Oct 2018 21:31
