ProtaBank: A repository for protein design and engineering data
Abstract
We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user‐friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and exploring correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org.
Additional Information
© 2018 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made. Issue Online 28 May 2018; Version of Record online: 30 April 2018; Accepted manuscript online: 25 March 2018; Manuscript accepted: 21 March 2018; Manuscript revised: 13 March 2018; Manuscript received: 15 February 2018. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Conflict of Interest: None declared.
Errata
The authors declare the following competing financial interest(s): BDA, SLM and BDO are co‐founders/owners of Protabit LLC, a limited liability company. CYW and PMC are employees of Protabit LLC and have ownership interests in Protabit LLC. MLA is an independent contractor and receives compensation from Protabit LLC. RAC declares no competing financial interests. Protabit LLC owns the rights to the ProtaBank Database and associated tools and technologies. Current address: Protabit LLC, 1010 E Union Street, Suite 110, Pasadena, CA 91106. Data accessibility statement: All data stored in the ProtaBank Database are free and accessible to all users. Data from individual study pages may be downloaded to the user's device in Excel or csv formats, or be copied to the user's clipboard. Protabit LLC may develop and commercialize certain aspects of the ProtaBank technology, including providing separate databases and tools for commercial entities that have a desire to store proprietary data.
Attached Files
Published - Wang_et_al-2018-Protein_Science.pdf
Submitted - 272211.full.pdf
Supplemental Material - pro3406-sup-0001-suppinfo01.docx
Erratum - Wang_et_al-2019-Protein_Science.pdf
Additional details
- PMCID: PMC5980626
- Eprint ID: 85553
- Resolver ID: CaltechAUTHORS:20180402-082750705
- NIH: R44GM117961
- Created: 2018-04-02 (created from EPrint's datestamp field)
- Updated: 2022-03-01 (created from EPrint's last_modified field)