Published March 7, 2024 | Version Submitted
Discussion Paper Open

A Foundation Model for Cell Segmentation

Abstract

Cells are a fundamental unit of biological organization, and identifying them in imaging data – cell segmentation – is a critical task for various cellular imaging experiments. While deep learning methods have led to substantial progress on this problem, most models in use are specialist models that work well for specific domains. Methods that have learned the general notion of “what is a cell” and can identify them across different domains of cellular imaging data have proven elusive. In this work, we present CellSAM, a foundation model for cell segmentation that generalizes across diverse cellular imaging data. CellSAM builds on top of the Segment Anything Model (SAM) by developing a prompt engineering approach for mask generation. We train an object detector, CellFinder, to automatically detect cells and prompt SAM to generate segmentations. We show that this approach allows a single model to achieve human-level performance for segmenting images of mammalian cells (in tissues and cell culture), yeast, and bacteria collected across various imaging modalities. We show that CellSAM has strong zero-shot performance and can be improved with a few examples via few-shot learning. We also show that CellSAM can unify bioimaging analysis workflows such as spatial transcriptomics and cell tracking. A deployed version of CellSAM is available at https://cellsam.deepcell.org/.
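The detect-then-prompt idea described in the abstract can be illustrated with a minimal sketch. It assumes the detector's output reaches the segmenter as bounding-box prompts, which the abstract does not specify. The names `segment_with_box_prompts`, `toy_detector`, and `toy_segmenter` are hypothetical stand-ins for illustration only; they are not CellFinder, SAM, or any part of the CellSAM codebase.

```python
from typing import Callable, List, Tuple

# (row_min, col_min, row_max, col_max); the max bounds are exclusive.
Box = Tuple[int, int, int, int]
Image = List[List[float]]
Mask = List[List[bool]]
LabelImage = List[List[int]]


def segment_with_box_prompts(
    image: Image,
    detect_boxes: Callable[[Image], List[Box]],
    segment_in_box: Callable[[Image, Box], Mask],
) -> LabelImage:
    """Detect objects, prompt a segmenter once per box, merge masks into labels."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]  # 0 means background
    for cell_id, box in enumerate(detect_boxes(image), start=1):
        mask = segment_in_box(image, box)
        for r in range(height):
            for c in range(width):
                # Simplification: the first mask to claim a pixel keeps it.
                if mask[r][c] and labels[r][c] == 0:
                    labels[r][c] = cell_id
    return labels


def toy_detector(image: Image) -> List[Box]:
    """Hypothetical detector stand-in: returns two fixed boxes."""
    return [(0, 0, 2, 2), (2, 3, 4, 5)]


def toy_segmenter(image: Image, box: Box, threshold: float = 0.5) -> Mask:
    """Hypothetical segmenter stand-in: thresholds intensities inside the box."""
    r0, c0, r1, c1 = box
    return [
        [r0 <= r < r1 and c0 <= c < c1 and image[r][c] > threshold
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


image = [
    [0.9, 0.8, 0.0, 0.0, 0.0],
    [0.7, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.9, 0.9],
    [0.0, 0.0, 0.0, 0.2, 0.8],
]
labels = segment_with_box_prompts(image, toy_detector, toy_segmenter)
for row in labels:
    print(row)
```

Passing the detector and segmenter in as callables keeps the pipeline logic separate from the models, so in principle a trained detector and a promptable segmentation model could replace the toy functions without changing the merge step.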

Copyright and License

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Acknowledgement

We thank Leeat Keren, Noah Greenwald, Sam Cooper, Jan Funke, Uri Manor, Joe Horsman, Michael Baym, Paul Blainey, Ian Cheeseman, Manuel Leonetti, Neehar Kondapaneni, and Elijah Cole for valuable conversations and insightful feedback. We also thank William Graf, Geneva Miller, and Kevin Yu, whose time in the Van Valen lab established the infrastructure and software tools that made this work possible. We thank Nader Khalil, Harper Carroll, Alec Fong, and the entire Brev.dev team for their support in establishing the computational infrastructure required for this work. We also thank Rosalind J. Xu and Jeffrey Moffitt for providing unpublished MERFISH data for the spatial transcriptomics workflow. We utilized images of the HeLa cell line in this research. Henrietta Lacks and the HeLa cell line established from her tumor cells without her knowledge or consent in 1951 have significantly contributed to scientific progress and advances in human health. We are grateful to Lacks, now deceased, and the Lacks family for their contributions to biomedical research.

Funding

This work was supported by awards from the Shurl and Kay Curci Foundation (to DVV), the Rita Allen Foundation (to DVV), the Susan E. Riley Foundation (to DVV), the Pew-Stewart Cancer Scholars program (to DVV), the Gordon and Betty Moore Foundation (to DVV), the Schmidt Academy for Software Engineering (to SL), the Michael J. Fox Foundation through the Aligning Science Across Parkinson’s consortium (to DVV), the Heritage Medical Research Institute (to DVV), the National Institutes of Health New Innovator program (DP2-GM149556) (to DVV), the National Institutes of Health HuBMAP consortium (OT2-OD033756) (to DVV), and the Howard Hughes Medical Institute Freeman Hrabowski Scholars program (to DVV). Additional support came from the National Institutes of Health (R01-MH123612A) (to PP), NIH/Ohio State University (R01-DC014498) (to PP), the Chen Institute (to PP), the Emerald Foundation and Black in Cancer (to UI), and the Caltech Presidential Postdoctoral Fellowship Program (PPFP) (to UI).

Contributions

UI, MM, YY, and DVV conceived the project; UI, MM, QL, YY, and DVV performed algorithm design for CellFinder and CellSAM; MM implemented the CellSAM architecture; UI, MM, and QL implemented CellFinder. UI and MM carried out the experiments and evaluations of the method. GG and PP provided input for developing CellFinder; QL and UI performed model benchmarking; QL and RD developed data pipelines; RD developed the computational infrastructure for model training; RD, EP, EP, MS, QL, CY, and EL performed data engineering; EL, CY, and UI performed CellSAM integration with bioimaging workflows. AA, MA, and CB performed annotations on images for human-human comparison. RB and DVV supervised the software engineering; DVV supervised the project.

Data Availability

https://cellsam.deepcell.org/

Supplementary Material

Code Availability

https://github.com/vanvalenlab/cellsam

Conflict of Interest

David Van Valen is a co-founder and Chief Scientist of Barrier Biosciences and holds equity in the company. All other authors declare no competing interests.

Files

2023.11.17.567630v3.full.pdf

Files (62.6 MB)

md5:79f03f8572b78f9050f2f29cc074fb27 (32.0 MB)
md5:cb58c8632c97f1053798452b4dbd6740 (30.6 MB)

Additional details

Identifiers

Funding

Shurl and Kay Curci Foundation
Rita Allen Foundation
Pew Charitable Trusts
Gordon and Betty Moore Foundation
Schmidt Family Foundation: Schmidt Academy for Software Engineering
Michael J. Fox Foundation: Aligning Science Across Parkinson's
California Institute of Technology: Heritage Medical Research Institute
National Institutes of Health: DP2-GM149556
National Institutes of Health: OT2-OD033756
Howard Hughes Medical Institute: Freeman Hrabowski Scholars
National Institutes of Health: R01-MH123612A
National Institutes of Health: R01-DC014498
California Institute of Technology: Tianqiao and Chrissy Chen Institute for Neuroscience
The Emerald Foundation
California Institute of Technology: Presidential Postdoctoral Fellowship

Caltech Custom Metadata

Caltech groups
Division of Biology and Biological Engineering (BBE), Tianqiao and Chrissy Chen Institute for Neuroscience, Earthquake Engineering Research Laboratory, Heritage Medical Research Institute