Published March 26, 2024 | Submitted
Discussion Paper Open

Algorithms for a Commons Cell Atlas

  • 1. California Institute of Technology

Abstract

Cell atlas projects curate representative datasets, cell types, and marker genes for tissues across an organism. Despite their ubiquity, atlas projects rely on duplicated and manual effort to curate marker genes and annotate cell types. The size of atlases, coupled with a lack of data-compatible tools, makes reprocessing and analysis of their data near-impossible. To overcome these challenges, we present a collection of data, algorithms, and tools to automate cataloging and analyzing cell types across tissues in an organism, and demonstrate its utility in building a human atlas.

Acknowledgement

The authors acknowledge the Howard Hughes Medical Institute for funding A.S.B. through the Hanna H. Gray Fellows program. Thanks to the Caltech Bioinformatics Resource Center for assisting with pre-processing the data.

Contributions

The CCA atlas concept emerged from an initiative by ASB to uniformly preprocess the datasets in (Svensson, da Veiga Beltrame, and Pachter 2020). ÁGM conceived of the idea of examining the OAS1 isoforms at single-cell resolution across human tissues after the publication of (Zhou et al. 2021). ASB conceived the CCA structure and associated mx and ec toolkit with feedback from LP. ÁGM pre-processed the CCA datasets, and ASB and ÁGM wrote mx and ec. ÁGM and ASB developed the CCA quality control. ÁGM led the OAS1 analysis, with help from ASB and LP. ASB developed the ‘mx assign’ cell assignment approach, and ÁGM and ASB benchmarked it. ASB drafted the initial version of the manuscript, which was edited and reviewed by all authors.

Data Availability

The code and data needed to reprocess the results of this manuscript are available at https://github.com/pachterlab/BGP_2024/. The CCA atlas is available at https://github.com/cellatlas/human/, and a summary of its datasets is at https://cellatlas.github.io/human/.

Code Availability

The code and data needed to reprocess the results of this manuscript are available at https://github.com/pachterlab/BGP_2024/. The CCA atlas is available at https://github.com/cellatlas/human/, and a summary of its datasets is at https://cellatlas.github.io/human/.

Conflict of Interest

The authors have declared no competing interest.

Files

2024.03.23.586413v1.full.pdf
Files (4.3 MB)

  • md5:7cb3487d8fdd4d66e56f0a57389c4e3d (1.9 MB)
  • md5:8a15da06297c8e4a1a621dec2ac543f9 (2.1 MB)
  • md5:09137edf4e34c6a77cf07324ccaae434 (195.0 kB)
  • md5:b3062314573316f91689c63be9fff708 (8.0 kB)

Additional details

Created: May 6, 2024
Modified: May 6, 2024