Published March 23, 2023 | Submitted
Report Open

Flexible parsing and preprocessing of technical sequences with splitcode

Abstract

Next-generation sequencing libraries are constructed with numerous synthetic constructs such as sequencing adapters, barcodes, and unique molecular identifiers. Such sequences can be essential for interpreting the results of sequencing assays, and when they contain information pertinent to an experiment, they must be processed and analyzed. We present splitcode, a tool that enables flexible and efficient preprocessing, parsing, and manipulation of sequencing reads. The splitcode program is free, open source, and available for download at http://github.com/pachterlab/splitcode. This versatile tool will facilitate simple, reproducible preprocessing of reads from libraries constructed for a large array of single-cell and bulk sequencing assays.

Additional Information

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

We thank the laboratory of Mitchell Guttman (Caltech) for discussions which motivated this project. Some of the splitcode code is derived from code written by Páll Melsted (University of Iceland), and we are grateful to him for sharing his code with us. Thanks to A. Sina Booeshaghi for helpful discussions regarding seqspec and splitcode. Illustrations were created with BioRender: http://biorender.com.

D.K.S. was funded by the UCLA-Caltech Medical Scientist Training Program (NIH NIGMS training grant T32 GM008042). L.P. was supported in part by the National Institutes of Health (NIH) grants U19MH114830 and 5UM1HG012077-02. The authors declare no conflicts of interest.

Contributions. D.K.S. conceived of the work, developed the methods and software, and drafted the manuscript. L.P. supervised the work. Both authors reviewed and edited the manuscript.

Code Availability. The splitcode software is available at http://github.com/pachterlab/splitcode. The version of the splitcode software referred to throughout this paper is version 0.28.0.

Attached Files

Submitted - nihpp-2023.03.20.533521v2.pdf

Additional details

Created: August 20, 2023
Modified: June 18, 2024