Algorithms for a Commons Cell Atlas
Abstract
Cell atlas projects curate representative datasets, cell types, and marker genes for tissues across an organism. Despite their ubiquity, atlas projects rely on duplicated and manual effort to curate marker genes and annotate cell types. The size of atlases, coupled with a lack of data-compatible tools, makes reprocessing and analysis of their data near-impossible. To overcome these challenges, we present a collection of data, algorithms, and tools to automate cataloging and analyzing cell types across tissues in an organism, and demonstrate its utility in building a human atlas.
Acknowledgement
The authors acknowledge the Howard Hughes Medical Institute for funding A.S.B. through the Hanna H. Gray Fellows program. The authors also thank the Caltech Bioinformatics Resource Center for assistance with pre-processing the data.
Contributions
The CCA concept emerged from an initiative by ASB to uniformly preprocess the datasets in Svensson, da Veiga Beltrame, and Pachter (2020). ÁGM conceived the idea of examining the OAS1 isoforms at single-cell resolution across human tissues after the publication of Zhou et al. (2021). ASB conceived the CCA structure and the associated mx and ec toolkits with feedback from LP. ÁGM pre-processed the CCA datasets, and ASB and ÁGM wrote mx and ec. ÁGM and ASB developed the CCA quality control. ÁGM led the OAS1 analysis, with help from ASB and LP. ASB developed the ‘mx assign’ cell assignment approach, and ÁGM and ASB benchmarked it. ASB drafted the initial version of the manuscript, which was edited and reviewed by all authors.
Data Availability
The code and data needed to reprocess the results of this manuscript can be found at https://github.com/pachterlab/BGP_2024/. The CCA can be found at https://github.com/cellatlas/human/. A summary of the datasets in the CCA can be found at https://cellatlas.github.io/human/.
Code Availability
The code and data needed to reprocess the results of this manuscript can be found at https://github.com/pachterlab/BGP_2024/. The CCA can be found at https://github.com/cellatlas/human/. A summary of the datasets in the CCA can be found at https://cellatlas.github.io/human/.
Conflict of Interest
The authors have declared no competing interest.
Additional details
- Howard Hughes Medical Institute
- Caltech groups
- Division of Biology and Biological Engineering, Tianqiao and Chrissy Chen Institute for Neuroscience