Published April 2024
Journal Article Open

A machine-readable specification for genomics assays

  • 1. California Institute of Technology

Abstract

Motivation

Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries.

Results

We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays.

Availability and implementation

The specification and the associated seqspec command-line tool are available at https://www.doi.org/10.5281/zenodo.10213865.

Acknowledgement

We thank Delaney Sullivan for helpful discussions and Rahma Elsiesy for helpful feedback on Fig. 1. Discussions with the Impact of Genomic Variation on Function (IGVF) Single-Cell Focus Group helped to shape some features of seqspec. Thanks to Idan Gabdank for useful feedback on seqspec and for suggesting the md5 checksum. Meichen Fang contributed the sci-RNA-seq3 seqspec. This work was primarily undertaken while A.S.B. was at the California Institute of Technology.

Funding

This work was supported in part by NIH [5UM1HG012077-02 to A.S.B. and L.P.]. The authors also acknowledge the Howard Hughes Medical Institute for funding A.S.B. through the Hanna H. Gray Fellows program.

Data Availability

The specification and the associated seqspec command-line tool are available at https://www.doi.org/10.5281/zenodo.10213865 as well as on GitHub at https://github.com/pachterlab/seqspec.

Supplementary data are available at Bioinformatics online.

Conflict of Interest

None declared.

Files

btae168.pdf
Files (2.8 MB)
md5:31709ce809598bc78fe909982d6de7e6 (2.6 MB)
md5:06b7f2bdc5def574f5a8c58589af9c09 (116.0 kB)
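
The md5 values listed for each file can be used to confirm that a download arrived intact. The following is a minimal sketch, not part of the record or of seqspec itself: the function names md5_of_file and matches_listing are hypothetical, and it assumes the file has already been saved locally and that the listed value uses the "md5:<hex digest>" form shown above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a listed value such as 'md5:<hex digest>'.

    The 'md5:' prefix is optional and the comparison ignores letter case.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, calling matches_listing with the local path of a downloaded file and the corresponding md5 string from the listing returns True only when the two digests agree. An md5 match guards against accidental corruption during transfer; it is not a defense against deliberate tampering.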

Additional details

Created: April 15, 2024
Modified: April 15, 2024