Flexible parsing, interpretation, and editing of technical sequences with splitcode
Abstract
Next-generation sequencing libraries are constructed with numerous synthetic constructs such as sequencing adapters, barcodes, and unique molecular identifiers. Such sequences can be essential for interpreting results of sequencing assays, and when they contain information pertinent to an experiment, they must be processed and analyzed.
We present splitcode, a tool that enables flexible and efficient parsing, interpretation, and editing of sequencing reads. It facilitates simple, reproducible preprocessing of reads from libraries constructed for a wide range of single-cell and bulk sequencing assays.
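To make concrete what parsing and editing a read can involve, here is a minimal Python sketch. It is not splitcode's interface or algorithm. The read layout (an 8-nt barcode, then a 6-nt UMI, then the insert), the adapter sequence, and the names `ParsedRead` and `parse_read` are all assumptions chosen for illustration.

```python
from typing import NamedTuple, Optional


class ParsedRead(NamedTuple):
    barcode: str  # sample or cell barcode (hypothetical layout)
    umi: str      # unique molecular identifier
    insert: str   # remaining biological sequence


def parse_read(
    seq: str,
    barcode_len: int = 8,   # assumed barcode length
    umi_len: int = 6,       # assumed UMI length
    adapter: Optional[str] = None,
) -> Optional[ParsedRead]:
    """Split a read into barcode, UMI, and insert; trim a 3' adapter if given."""
    prefix = barcode_len + umi_len
    if len(seq) < prefix:
        return None  # read too short to hold both technical sequences

    barcode = seq[:barcode_len]
    umi = seq[barcode_len:prefix]
    insert = seq[prefix:]

    # Exact-match adapter trimming: drop the adapter and everything after it.
    if adapter:
        pos = insert.find(adapter)
        if pos != -1:
            insert = insert[:pos]

    return ParsedRead(barcode, umi, insert)


if __name__ == "__main__":
    read = "ACGTACGT" + "TTGGCC" + "GATTACAGATTACA" + "AGATCGGAAGAGC"
    print(parse_read(read, adapter="AGATCGGAAGAGC"))
```

Real libraries often need error-tolerant matching and variable positions, which this fixed-offset, exact-match sketch does not handle.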
The splitcode program is free, open source, and available for download at http://github.com/pachterlab/splitcode.
Supplementary data are available at Bioinformatics online.
Copyright and License
Funding
D.K.S. was funded by the UCLA-Caltech Medical Scientist Training Program (NIH NIGMS training grant T32 GM008042). L.P. was supported in part by the National Institutes of Health (NIH) grants U19MH114830 and 5UM1HG012077-02.
Conflict of Interest
None declared.
Files
File (MD5 checksum) | Size |
---|---|
md5:90d85617946b2a23b8230ee1d33276f9 | 232.1 kB |
md5:71593f123f915280e907d85c97f4176f | 1.5 MB |
Additional details
- PMCID: PMC11193061
- DOI: 10.1093/bioinformatics/btae331
- National Institutes of Health: NIH Predoctoral Fellowship T32 GM008042
- National Institutes of Health: U19MH114830
- National Institutes of Health: 5UM1HG012077-02
- Caltech groups: Division of Biology and Biological Engineering, Tianqiao and Chrissy Chen Institute for Neuroscience