Published May 2024
Journal Article Open

FAIR Header Reference genome: a TRUSTworthy standard

Abstract

The lack of interoperable data standards among reference genome data-sharing platforms inhibits cross-platform analysis while increasing the risk of data provenance loss. Here, we describe the FAIR bioHeaders Reference genome (FHR), a metadata standard guided by the principles of Findability, Accessibility, Interoperability and Reuse (FAIR) in addition to the principles of Transparency, Responsibility, User focus, Sustainability and Technology. The objective of FHR is to provide an extensive set of data serialisation methods and minimum data field requirements while still maintaining extensibility, flexibility and expressivity in an increasingly decentralised genomic data ecosystem. The effort needed to implement FHR is low; FHR’s design philosophy ensures easy implementation while retaining the benefits gained from recording both machine and human-readable provenance.

Copyright and License

Acknowledgement

The authors thank Natalie Meyers with The Lucy Family Institute for Data and Society at the University of Notre Dame for conversations around research communities focused on metadata standards that were used in the writing of the manuscript. The authors thank Monica Poelchau with the National Agricultural Library and Sarah Dyer at EMBL-EBI for relevant discussions.

Funding

Adam Wright is supported by the Adaptive Oncology Programme at the Ontario Institute for Cancer Research. During a portion of this project, David Molik was supported by the USDA Agricultural Research Service (ARS) HQ Research Associate program in Big Data.

A portion of this work was carried out by the Tropical Pest Genetics and Molecular Biology Research Unit, ARS Project number 2040-22430-028-000D.

A portion of this work was carried out by the Arthropod-Borne Animal Diseases Research Unit, ARS Project numbers 3020-32000-018-000-D, 3020-32000-020-000-D, and 3020-32000-019-000-D.

This research used resources provided by the SCINet project of the USDA Agricultural Research Service, ARS project number 0500-00093-001-00-D.

We gratefully acknowledge the support of the WormBase grant (U24HG002223), which provided funding for this research. This grant has been instrumental in supporting the contributions of Karen Yook, Daniela Raciti, Paul Sternberg, Adam Wright, Lincoln Stein and Scott Cain. Their efforts have significantly contributed to the success of this project.

Data Availability

The code published for FHR is in the public domain per the United States 17 U.S.C. § 105. The code and specification are freely available for use and modification (Table 4).

Files

bbae122.pdf
Files (8.0 MB)
md5:9ea8f5ecc67c52c47f61b9dd1343936f 1.0 MB
md5:1ce3a9617cb61864614e6a789ab53362 1.0 MB
md5:9bfe4ab42a81b8edc38500381026fdc4 882.4 kB
md5:745577ec2e8c30188fb9b629d119ccb3 1.1 MB
md5:7212e813d2187b39879e1fce986a1273 577.3 kB
md5:95447a6f20703e7b70281e2346d50132 1.2 MB
md5:4bf28873fe3a6fcc674beb1f68b9db3e 24.3 kB
md5:cf5565ed6a92e59dd72867e07bd3cd8f 1.1 MB
md5:7c7e505d40f81b222c7e7043bf8f0dda 1.0 MB

Additional details

Created: June 5, 2024
Modified: June 17, 2024