Lenz, Andreas and Siegel, Paul H. and Wachter-Zeh, Antonia and Yaakobi, Eitan (2020) Coding over Sets for DNA Storage. IEEE Transactions on Information Theory, 66 (4). pp. 2331-2351. ISSN 0018-9448. doi:10.1109/tit.2019.2961265. https://resolver.caltech.edu/CaltechAUTHORS:20200102-154127218
Full text is not posted in this repository. Consult Related URLs below.
Abstract
In this paper we study error-correcting codes for the storage of data in synthetic deoxyribonucleic acid (DNA). We investigate a storage model where a data set is represented by an unordered set of M sequences, each of length L. Errors within that model are the loss of whole sequences and point errors inside the sequences, such as insertions, deletions and substitutions. We derive Gilbert-Varshamov lower bounds and sphere packing upper bounds on achievable cardinalities of error-correcting codes within this storage model. We further propose explicit code constructions that can correct errors in such a storage system and that can be encoded and decoded efficiently. Comparing the sizes of these codes to the upper bounds, we show that many of the constructions are close to optimal.
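The storage model in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: the alphabet size `q`, the helper names `num_stored_sets` and `channel`, and the restriction to sequence loss plus single-symbol substitutions are all assumptions made here for illustration (insertions and deletions are omitted for brevity).

```python
import math
import random

def num_stored_sets(q: int, L: int, M: int) -> int:
    """Count unordered sets of M distinct length-L sequences over q symbols.

    Assumption: stored sequences are distinct, so the count is C(q^L, M).
    """
    return math.comb(q ** L, M)

def channel(stored, losses, substitutions, alphabet, rng):
    """Hypothetical error model: lose whole sequences, then corrupt survivors.

    Drops `losses` sequences, then substitutes one symbol in each of
    `substitutions` surviving sequences. The output is again an unordered
    set, so two sequences that become equal merge into one element.
    """
    survivors = rng.sample(sorted(stored), len(stored) - losses)
    hit = set(rng.sample(range(len(survivors)), substitutions))
    received = []
    for i, seq in enumerate(survivors):
        if i in hit:
            pos = rng.randrange(len(seq))
            new_symbol = rng.choice([a for a in alphabet if a != seq[pos]])
            seq = seq[:pos] + new_symbol + seq[pos + 1:]
        received.append(seq)
    return frozenset(received)

# Example: M = 3 sequences of length L = 4 over the DNA alphabet (q = 4).
stored = frozenset({"ACGT", "TTGA", "CCAG"})
received = channel(stored, losses=1, substitutions=1,
                   alphabet="ACGT", rng=random.Random(0))
# Upper limit on the number of bits any code in this model can store.
max_bits = math.log2(num_stored_sets(q=4, L=4, M=3))
```

Because the read-out is a set, the order in which sequences were written carries no information, which is why the count is a binomial coefficient rather than a power of `q`.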
| Item Type: | Article |
|---|---|
| Related URLs: | |
| ORCID: | |
| Additional Information: | © 2019 IEEE. Manuscript received December 20, 2018; revised October 31, 2019; accepted December 8, 2019. Date of publication December 20, 2019; date of current version March 17, 2020. This work was supported in part by the NSF under Grant CCF-BSF-1619053, in part by the United States–Israel BSF under Grant 2015816, and in part by the European Research Council (ERC) through the European Union’s Horizon 2020 Research and Innovation Programme under Grant 801434. This work was done in part while A. Lenz and E. Yaakobi were visiting the Center for Memory and Recording Research, University of California San Diego, which also supported the work of E. Yaakobi. This work was presented in part at the 2018 International Symposium on Information Theory, in part at the 2019 Information Theory and Applications Workshop, and in part at the 2019 Non-Volatile Memories Workshop. |
| Funders: | |
| Subject Keywords: | coding over sets, DNA data storage, Gilbert-Varshamov bound, insertion and deletion errors, sphere packing bound |
| Issue or Number: | 4 |
| DOI: | 10.1109/tit.2019.2961265 |
| Record Number: | CaltechAUTHORS:20200102-154127218 |
| Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20200102-154127218 |
| Official Citation: | A. Lenz, P. H. Siegel, A. Wachter-Zeh and E. Yaakobi, "Coding Over Sets for DNA Storage," in IEEE Transactions on Information Theory, vol. 66, no. 4, pp. 2331-2351, April 2020. doi: 10.1109/TIT.2019.2961265 |
| Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 100472 |
| Collection: | CaltechAUTHORS |
| Deposited By: | Tony Diaz |
| Deposited On: | 03 Jan 2020 02:41 |
| Last Modified: | 16 Nov 2021 17:54 |