Flexible parsing, interpretation, and editing of technical sequences with splitcode

Sullivan, Delaney K.; Pachter, Lior

Published June 14, 2024 | Version Published

Journal Article Open

1. California Institute of Technology

Motivation

Next-generation sequencing libraries are constructed with numerous synthetic constructs such as sequencing adapters, barcodes, and unique molecular identifiers. Such sequences can be essential for interpreting results of sequencing assays, and when they contain information pertinent to an experiment, they must be processed and analyzed.

Results

We present splitcode, a tool that enables flexible and efficient parsing, interpretation, and editing of sequencing reads. This versatile tool facilitates simple, reproducible preprocessing of reads from libraries constructed for a wide range of single-cell and bulk sequencing assays.
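
To make concrete what parsing a technical sequence out of a read involves, the following is a minimal Python sketch. It is not splitcode's interface or algorithm; the read layout ([barcode][linker][UMI][insert]), the function name `parse_read`, and the default lengths are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class ParsedRead(NamedTuple):
    barcode: str
    umi: str
    insert: str


def parse_read(
    read: str,
    linker: str,
    barcode_len: int = 8,  # hypothetical barcode length
    umi_len: int = 6,      # hypothetical UMI length
) -> Optional[ParsedRead]:
    """Split a read assumed to be laid out as [barcode][linker][UMI][insert].

    Returns None when the linker is not found at the expected position
    or the read is too short to contain a full UMI.
    """
    linker_start = barcode_len
    umi_start = linker_start + len(linker)
    umi_end = umi_start + umi_len

    # The linker must match exactly at its expected offset.
    if read[linker_start:umi_start] != linker:
        return None
    # The read must be long enough to hold the whole UMI.
    if len(read) < umi_end:
        return None

    return ParsedRead(
        barcode=read[:barcode_len],
        umi=read[umi_start:umi_end],
        insert=read[umi_end:],
    )


if __name__ == "__main__":
    example = "ACGTACGT" + "GATC" + "TTTAAA" + "CCGGCCGG"
    print(parse_read(example, linker="GATC"))
```

A real library design typically needs more than this fixed-offset scheme (for example, mismatch tolerance or variable positions), which is the kind of flexibility the abstract attributes to splitcode.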

Availability

The splitcode program is free, open source, and available for download at http://github.com/pachterlab/splitcode.

Supplementary information

Supplementary data are available at Bioinformatics online.

Copyright and License

© The Author(s) 2024. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Funding

D.K.S. was funded by the UCLA-Caltech Medical Scientist Training Program (NIH NIGMS training grant T32 GM008042). L.P. was supported in part by the National Institutes of Health (NIH) grants U19MH114830 and 5UM1HG012077-02.

Conflict of Interest

None declared.

Files

Files (1.8 MB)

Name	Size
btae331.pdf md5:71593f123f915280e907d85c97f4176f	1.5 MB
btae331_supplementary_data.zip md5:90d85617946b2a23b8230ee1d33276f9	232.1 kB

Additional details

PMCID: PMC11193061
DOI: 10.1093/bioinformatics/btae331

Is new version of: Discussion Paper: https://authors.library.caltech.edu/records/s54v6-nf783 (URL)

National Institutes of Health: NIH Predoctoral Fellowship T32 GM008042
National Institutes of Health: U19MH114830
National Institutes of Health: 5UM1HG012077-02

Caltech groups: Division of Biology and Biological Engineering (BBE), Tianqiao and Chrissy Chen Institute for Neuroscience

