Published June 2018 | Supplemental Material + Erratum + Submitted + Published
Journal Article Open

ProtaBank: A repository for protein design and engineering data


We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user‐friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and exploring correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org.

Additional Information

© 2018 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

Issue online: 28 May 2018; Version of Record online: 30 April 2018; Accepted manuscript online: 25 March 2018; Manuscript accepted: 21 March 2018; Manuscript revised: 13 March 2018; Manuscript received: 15 February 2018.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflict of Interest: None declared.


The authors declare the following competing financial interest(s): BDA, SLM and BDO are co‐founders/owners of Protabit LLC, a limited liability company. CYW and PMC are employees of Protabit LLC and have ownership interests in Protabit LLC. MLA is an independent contractor and receives compensation from Protabit LLC. RAC declares no competing financial interests. Protabit LLC owns the rights to the ProtaBank Database and associated tools and technologies. Current address: Protabit LLC, 1010 E Union Street, Suite 110, Pasadena, CA 91106. Data accessibility statement: All data stored in the ProtaBank Database are free and accessible to all users. Data from individual study pages may be downloaded to the user's device in Excel or CSV format, or copied to the user's clipboard. Protabit LLC may develop and commercialize certain aspects of the ProtaBank technology, including providing separate databases and tools for commercial entities that have a desire to store proprietary data.

Attached Files

Published - Wang_et_al-2018-Protein_Science.pdf

Submitted - 272211.full.pdf

Supplemental Material - pro3406-sup-0001-suppinfo01.docx

Erratum - Wang_et_al-2019-Protein_Science.pdf



Additional details

August 21, 2023
October 23, 2023