Published January 8, 2021 | Version Supplemental Material + Published
Journal Article Open

Gene Ontology resource: enriching a GOld mine

Abstract

The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9,400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings and to maintain consistency with other ontologies. As a result, 20,000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations.

Additional Information

© The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: 15 September 2020; Revision received: 22 October 2020; Accepted: 02 December 2020; Published: 08 December 2020.

We want to thank all the contributors to the GO resource over the last 20 years (http://geneontology.org/page/acknowledgments-contributors), and all the authors of papers represented in the GO knowledgebase (https://pubmed.ncbi.nlm.nih.gov/?term=loprovGeneOntol[SB]). We would like to recognize the efforts of two members of the GO Consortium who passed away in early 2020, James C. Hu, professor at Texas A&M Department of Biochemistry and Biophysics, and Mary Ellen Shimoyama, associate professor of biomedical engineering at the Medical College of Wisconsin (MCW). We miss their participation in our meetings and discussions. Finally, we would like to acknowledge the immense contribution of Suzanna E. Lewis, one of the founders of the GO project, who retired in 2020. Her vision, creativity, enthusiasm and unshakable commitment to the project have been instrumental in creating one of the most useful projects to bioinformatics and keeping it relevant for over two decades.

The GO resource is supported by grants from the National Human Genome Research Institute [U41 HG02273 to P.D.T., P.W.S., S.E.L., J.M.C., J.A.B., supplements to grant U41 HG001315 to J.M.C., U24 HG002223 to P.W.S.]; GO Consortium members are also supported by diverse funding sources: dictyBase is supported by the National Institute of General Medical Sciences [1R24GM137770-01 to R.L.C.]; The EcoliWiki group is supported by the National Institutes of Health [GM089636]; National Science Foundation [1565146]; EMBL-EBI is funded by EMBL core funds; FlyBase is supported by the UK Medical Research Council [MR/N030117/1]; National Human Genome Research Institute [U41HG000739]; InterPro is funded by the Wellcome Trust [108433/Z/15/Z]; Biotechnology and Biological Sciences Research Council [BB/N00521X/1, BB/N019172/1, BB/L024136/1 to R.D.F.]; The Institute for Genome Sciences GO-related work on ECO is supported by the National Science Foundation [1458400]; The Gene Regulation Consortium (GRECO) is supported by the Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC) COST Action [CA15205]; A.L. and M.L.A. are also supported by the Research Council of Norway [247727]; Functional Gene Annotation, University College London is supported by Alzheimer's Research UK [ARUK-NAS2017A-1 to R.C.L.]; National Institute for Health Research University College London Hospitals Biomedical Research Centre; IntAct and the Complex Portal are supported by the European Molecular Biology Laboratory core funds, Open Targets [OTAR-044, OTAR02-048]; Wellcome Trust grant INVAR [212925/Z/18/Z]; PomBase is supported by the Wellcome Trust [104967/Z/14/Z to S.G.O.]; MGI is supported by the National Human Genome Research Institute [HG 000330, HG 002273]; RGD is supported by the National Heart, Lung, and Blood Institute [HL 64541]; Reactome is supported by the National Human Genome Research Institute [HG 003751]; the TAIR project is funded by academic institutional, corporate and individual subscriptions; TAIR is administered by the 501(c)(3) non-profit Phoenix Bioinformatics; the UniProt Consortium is supported by the National Eye Institute, National Human Genome Research Institute, National Heart, Lung and Blood Institute, National Institute of Allergy and Infectious Diseases, National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of General Medical Sciences; National Institute of Mental Health of the National Institutes of Health [U24HG007822]; National Human Genome Research Institute [U41HG007822, U41HG002273]; National Institute of General Medical Sciences [R01GM080646, P20GM103446, U01GM120953]; Swiss Federal Government through the State Secretariat for Education, Research and Innovation SERI; European Molecular Biology Laboratory core funds; Biotechnology and Biological Sciences Research Council [BB/M011674/1]; the Alzheimer's Research UK [ARUK-NAS2017A-1]; WormBase is supported by the US National Human Genome Research Institute [U24-HG002223]; UK Medical Research Council [MR/S000453/1]; UK Biotechnology and Biological Sciences Research Council [BB/P024610, BB/P024602]; ZFIN is supported by the National Human Genome Research Institute [HG002659 to M.W. and HG010859 to P.W.S.]. Gramene contributions are supported by the National Science Foundation award [IOS #1127112] and Planteome contributions are supported by National Science Foundation award [IOS #1340112]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies. Funding for open access charges: National Human Genome Research Institute [U41 HG02273].

Conflict of interest statement: None declared.

Attached Files

Published - gkaa1113.pdf

Supplemental Material - gkaa1113_supplemental_file.pdf

Additional details

Identifiers

PMCID
PMC7779012
Eprint ID
107976
Resolver ID
CaltechAUTHORS:20210210-070215975

Funding

NIH
U41 HG02273
NIH
U41 HG001315
NIH
U24 HG002223
NIH
1R24GM137770-01
NIH
GM089636
NSF
DBI-1565146
European Molecular Biology Laboratory (EMBL)
Medical Research Council (UK)
MR/N030117/1
NIH
U41HG000739
Wellcome Trust
108433/Z/15/Z
Biotechnology and Biological Sciences Research Council (BBSRC)
BB/N00521X/1
Biotechnology and Biological Sciences Research Council (BBSRC)
BB/N019172/1
Biotechnology and Biological Sciences Research Council (BBSRC)
BB/L024136/1
NSF
DBI-1458400
Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC)
CA15205
Research Council of Norway
247727
Alzheimer's Research UK
ARUK-NAS2017A-1
National Institute for Health Research
University College London
Open Targets
OTAR-044
Open Targets
OTAR02-048
Wellcome Trust
212925/Z/18/Z
Wellcome Trust
104967/Z/14/Z
NIH
HG 000330
NIH
HG 002273
NIH
HL 64541
NIH
HG 003751
Phoenix Bioinformatics
NIH
U24HG007822
NIH
U41HG007822
NIH
U41HG002273
NIH
R01GM080646
NIH
P20GM103446
NIH
U01GM120953
State Secretariat for Education, Research and Innovation (SERI)
Biotechnology and Biological Sciences Research Council (BBSRC)
BB/M011674/1
NIH
U24-HG002223
Medical Research Council (UK)
MR/S000453/1
Biotechnology and Biological Sciences Research Council (BBSRC)
BB/P024610
Biotechnology and Biological Sciences Research Council (BBSRC)
BB/P024602/1
NIH
HG002659
NIH
HG010859
NSF
IOS-1127112
NSF
IOS-1340112
NIH
U41 HG02273

Dates

Created
2021-02-10
Created from EPrint's datestamp field
Updated
2021-11-16
Created from EPrint's last_modified field

Caltech Custom Metadata

Caltech groups
Division of Biology and Biological Engineering (BBE)