Methods in Brief
Accurate single-molecule spot detection for image-based spatial transcriptomics with weakly supervised deep learning
Graphical abstract
Highlights
- Polaris is a unified analysis pipeline for spatial transcriptomics data
- Polaris detects fluorescent spots in unseen images with minimal parameter tuning
- Polaris integrates spot detection with cell segmentation and gene decoding
- Polaris generalizes across sample types, imaging modalities, and assay platforms
Authors
Emily Laubscher, Xuefei Wang (Julie),
Nitzan Razin, ..., Jeffrey R. Moffitt,
Yisong Yue, David Van Valen
Correspondence
vanvalen@caltech.edu
In brief
Polaris is a deep-learning-based analysis
pipeline for spatial transcriptomics
images. Polaris integrates a weakly
supervised deep-learning model for spot
detection with cell segmentation and
gene decoding to quantify single-cell
gene expression. It is a turnkey solution
for a variety of sample types and imaging
assays.
Laubscher et al., 2024, Cell Systems 15, 475–482
May 15, 2024
© 2024 Published by Elsevier Inc.
https://doi.org/10.1016/j.cels.2024.04.006
Accurate single-molecule spot detection for image-based spatial transcriptomics with weakly supervised deep learning
Emily Laubscher,1 Xuefei Wang (Julie),2 Nitzan Razin,2 Tom Dougherty,2 Rosalind J. Xu,3,4,5 Lincoln Ombelets,1 Edward Pao,2 William Graf,2 Jeffrey R. Moffitt,3,4,6 Yisong Yue,7 and David Van Valen2,8,9,*

1Division of Chemistry and Chemical Engineering, Caltech, Pasadena, CA 91125, USA
2Division of Biology and Biological Engineering, Caltech, Pasadena, CA 91125, USA
3Program in Cellular and Molecular Medicine, Boston Children's Hospital, Boston, MA 02115, USA
4Department of Microbiology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
5Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02115, USA
6Broad Institute of Harvard and MIT, Cambridge, MA, USA
7Division of Computational and Mathematical Sciences, Caltech, Pasadena, CA 91125, USA
8Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
9Lead contact
*Correspondence: vanvalen@caltech.edu
https://doi.org/10.1016/j.cels.2024.04.006
SUMMARY
Image-based spatial transcriptomics methods enable transcriptome-scale gene expression measurements with spatial information but require complex, manually tuned analysis pipelines. We present Polaris, an analysis pipeline for image-based spatial transcriptomics that combines deep-learning models for cell segmentation and spot detection with a probabilistic gene decoder to quantify single-cell gene expression accurately. Polaris offers a unifying, turnkey solution for analyzing spatial transcriptomics data from multiplexed error-robust FISH (MERFISH), sequential fluorescence in situ hybridization (seqFISH), or in situ RNA sequencing (ISS) experiments. Polaris is available through the DeepCell software library (https://github.com/vanvalenlab/deepcell-spots) and https://www.deepcell.org.
INTRODUCTION
Advances in spatial transcriptomics have enabled system-level
gene expression measurement while preserving spatial informa-
tion, enabling new studies into the connections between gene
expression, tissue organization, and disease states.1,2 Spatial transcriptomics methods fall broadly into two categories.
Sequencing-based methods leverage arrays of spatially bar-
coded RNA capture beads to integrate spatial information and
transcriptomes.3–6 Image-based methods, including multiplexed RNA fluorescent in situ hybridization (FISH) and in situ RNA sequencing (ISS), perform sequential rounds of fluorescent staining to label transcripts to measure the expression of thousands of genes in the same sample.7–11 Because these methods
rely on imaging, the data that they generate naturally contain the
sample’s spatial organization. Although image-based spatial
transcriptomics enables measurements with high transcript
recall and subcellular resolution,1,2 rendering the raw imaging
data interpretable remains challenging. Specifically, the com-
puter vision pipelines for image-based spatial transcriptomics
must reliably perform cell segmentation, spot detection, and
gene assignment across diverse imaging data. Prior methods
that sought an integrated solution to this problem relied on
manually tuned algorithms to optimize performance for a particular sample or spatial transcriptomics assay.12,13 Thus, there remains a need for an integrated, open-source pipeline that can
perform these steps reliably across the diverse images gener-
ated by spatial transcriptomics assays with minimal human
intervention.
Deep-learning methods are a natural fit for this problem. Prior
work by us and others has shown that deep-learning methods
can accurately perform cell segmentation with minimal user
intervention,14–17 providing a key computational primitive for
cellular image analysis. Here, we focus on the problem of spot
detection for image-based spatial transcriptomics data. Existing
spot detection methods fall into two categories: ‘‘classical’’ and
‘‘supervised.’’18 Classical methods are widely used but require manual parameter fine-tuning to optimize performance.19,20
The optimal parameter values are often different within regions
of the same image, making implementation of classical methods
time-intensive and fundamentally limiting their scalability. Supervised methods,21–23 which often rely on deep-learning methodologies, learn how to detect spots from labeled training data.
These methods eliminate the need for manual parameter tuning
to optimize spot detection performance. However, the require-
ment for labeled training data presents a major challenge
because experimentally generated data contain too many spots
for manual annotation to be feasible. Training data derived from
classical algorithms are limited by the characteristics of those al-
gorithms, imposing a ceiling on model performance. Further,
simulated training data lack the artifacts present in experimen-
tally generated data, which can limit the model’s performance
on real data.
In this work, we combine deep learning with a weakly supervised training data construction scheme to create a universal spot detector for image-based spatial transcriptomics data. We demonstrate the performance of our spot detection model on simulated and experimentally generated images. Given that training deep-learning models with weak supervision can yield a computational primitive for spot detection, we then constructed Polaris, an integrated deep-learning pipeline for image-based spatial transcriptomics. Constructed
in this fashion, Polaris offers a turnkey analysis solution for
data from various image-based spatial transcriptomics
methods while removing the need for manual parameter tun-
ing or extensive user expertise.
RESULTS
Here, we describe two key aspects of Polaris’ spot detection
model—constructing consensus training data with annotations
from multiple classical spot detection algorithms and deep-
learning model design and training.
Accurate training data are an essential component of every
deep-learning method. In this work, we have sought to create
training data for spot detection models by finding consensus
among several commonly used classical spot detection algo-
rithms (
Figure 1
A). In our approach, we first create annotations
for representative fluorescent spot images by manually fine-tun-
ing a collection of classical algorithms on each image. We refer to
these algorithms as ‘‘annotators.’’ This process generates con-
flicting sets of coordinate spot locations because each annotator
detects or misses different sets of spots. To determine which an-
notators detected each spot in an image, the detections from all
annotators are clustered based on their proximity. Inspired by
prior work on programmatic labeling,24 we then de-noise conflicting spot annotations with a generative model. The generative
model characterizes annotators with two parameters: (1) true-
positive rate (TPR), which is an annotator’s probability of detect-
ing a ground-truth true spot, and (2) false-positive rate (FPR),
which is an annotator’s probability of detecting a ground-truth
false spot. The model characterizes clustered spots by their
probability of corresponding to a ground-truth true spot (p(TP)).
The generative model is given an initial guess for the TPR and
FPR of each classical algorithm and a matrix of annotation
data, which we name the ‘‘detection information matrix.’’ This
matrix of annotation data, x = {x_ic : i = 1, …, n}, consists of binary variables, x_ic, which are equal to 1 if annotator i detected cluster c, and 0 if not. The model is then fit with expectation maximization (EM)25 by iteratively calculating the p(TP) of each cluster and estimating the TPRs and FPRs of each annotator until convergence (Figure 1D). The remainder of this section provides more detail on the mathematical execution of these steps.
Our generative model assumes that each annotator i produces Bernoulli-distributed annotations. We define z_c to be a binary variable indicating whether cluster c is a true spot or not (1 if true and 0 if not). p(z_c), conditioned on the data and annotator characteristics, corresponds to p(TP), which we wish to compute. Let us also define q_i(z_c) to be a variable that represents an annotator i's probability of detecting a cluster, conditioned on whether the cluster is a true spot or not. This notation is a more compact way of representing the annotator characteristics, as q_i(1) = TPR_i and q_i(0) = FPR_i. For every cluster c and annotator i, the distribution of x_ic given the cluster assignment z_c and annotator characteristics q_i(z_c) is a Bernoulli distribution:
$$P(x_{ic} \mid z_c, q_i(z_c)) = q_i(z_c)^{x_{ic}}\,(1 - q_i(z_c))^{1 - x_{ic}}. \qquad \text{(Equation 1)}$$
We assume the variables x_ic are independent; the probability to observe the data x = {x_ic}, given {q_i} and {z_c}, is then
$$P(\{x_{ic}\} \mid \{z_c\}, \{q_i\}) = \prod_{i,c} q_i(z_c)^{x_{ic}}\,(1 - q_i(z_c))^{1 - x_{ic}}. \qquad \text{(Equation 2)}$$
To offer a concrete example of this formula in action, consider
the following situations for a hypothetical set of three annotators.
For a ‘‘true detection’’ (e.g., z_c = 1), the probability that all three annotators detect the spot (e.g., x_ic = 1 for all i) given by the above formula reduces to
$$p(x_{i,c} \mid z_c, q) = \prod_{i=1}^{3} \mathrm{TPR}_i, \qquad \text{(Equation 3)}$$
which is simply the product of the TPRs of each annotator. Alter-
natively, the probability that the first two annotators detect a
ground-truth true spot while the third annotator (incorrectly)
does not is given by:
$$p(x_{i,c} \mid z_c, q) = \mathrm{TPR}_1 \cdot \mathrm{TPR}_2 \cdot (1 - \mathrm{TPR}_3). \qquad \text{(Equation 4)}$$
We utilize Equation 2 with the EM algorithm to infer the annotator and cluster characteristics. The EM algorithm consists of
two computation steps: an expectation step and a maximization
step. To perform the expectation step, we define the probability
of a cluster corresponding to a true or false detection with Bayes’
theorem:
$$p(z_c \mid x_{ic}, q) = \frac{p(x_{ic} \mid z_c, q)\,p(z_c)}{p(x_{ic} \mid q)}. \qquad \text{(Equation 5)}$$
The term p(z_c) is the prior probability of a cluster corresponding to a true or false detection; we use the least informative value for the prior by setting p(z_c) = 1/2, indicating an equal probability that a spot is a true or false detection. The term p(x_ic | q) can be expressed as follows:
$$p(x_{ic} \mid q) = \sum_{z_c} p(x_{ic} \mid z_c, q)\,p(z_c). \qquad \text{(Equation 6)}$$
Therefore, the probability p(z_c | x_ic, q) can be expressed as the likelihood of each possible label (e.g., true or false) normalized by the sum of the likelihood of both labels:
$$p(z_c \mid x_{ic}, q) = \frac{p(x_{ic} \mid z_c, q)\,p(z_c)}{\sum_{z_c} p(x_{ic} \mid z_c, q)\,p(z_c)}, \qquad \text{(Equation 7)}$$
Figure 1. A weakly supervised deep-learning framework for accurate fluorescent spot detection of spatial transcriptomics imaging data
(A) Training data generation for spot detection. Spot labels were generated by finding consensus among a panel of commonly used classical spot detection algorithms through generative modeling. These consensus labels were then used to train Polaris' spot detection model. Sequential steps are linked with an arrow; associated methods and data types are linked with a solid line.
(B) Demonstration of the training data generation for an example spot image. Spot locations are converted into encoded detections and distance maps, which guide the classification and regression tasks performed during model training. Spot colors correspond to the annotation colors in (A).
(C) Output of Polaris' spot detection model for an example seqFISH image. Regression values above a default threshold are set to zero. The regression images in (B) and (C) are the sum of the squared pixel-wise regression in the x and y directions.
(D) Schematic diagram of the EM method to fit the generative model for consensus spot annotation creation.
$$= \frac{p(x_{ic} \mid z_c, q) \cdot \tfrac{1}{2}}{\sum_{z_c} p(x_{ic} \mid z_c, q) \cdot \tfrac{1}{2}}, \qquad \text{(Equation 8)}$$
$$= \frac{p(x_{ic} \mid z_c, q)}{\sum_{z_c} p(x_{ic} \mid z_c, q)}. \qquad \text{(Equation 9)}$$
Using this method to calculate p(z_c | x_ic, q), we can then calculate E(TP), E(FN), E(FP), and E(TN) for each annotator. Two scenarios can arise when calculating these values. (1) If an annotator detects a spot in a particular cluster, i.e., x_ic = 1, E(TP) for that annotator is equal to p(TP) = p(z_c | x_ic, q) for that cluster, and E(FP) for that annotator is equal to p(FP) for that cluster. E(TN) and E(FN) are set to zero. (2) If an annotator does not detect a spot in a particular cluster, i.e., x_ic = 0, E(TN) for that annotator is equal to p(FP) = 1 − p(TP) for that cluster, and E(FN) for that annotator is equal to p(TP) for that cluster. E(TP) and E(FP) are set to zero.
To perform the maximization step, we sum E(TP), E(FN), E(FP), and E(TN) across all clusters to calculate an updated maximum likelihood estimate for TPR_i and FPR_i for method i with equations of the following form:
$$\mathrm{TPR}_i = \frac{\sum_c E(\mathrm{TP}_i)}{\sum_c E(\mathrm{TP}_i) + \sum_c E(\mathrm{FN}_i)} \qquad \text{(Equation 10)}$$

$$\mathrm{FPR}_i = \frac{\sum_c E(\mathrm{FP}_i)}{\sum_c E(\mathrm{FP}_i) + \sum_c E(\mathrm{TN}_i)} \qquad \text{(Equation 11)}$$
The expectation and maximization steps are performed iteratively until the values for p(TP)_c, TPR_i, and FPR_i converge. The consensus spot locations are taken at the centroid of the clusters with a p(TP)_c value that exceeds a defined probability threshold. We demonstrated that this method yields an accurate estimation of TPR and FPR and that the resulting spot labels approach 100% correct as dataset size and number of annotators increase (Figure S1).
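The EM loop described above can be sketched compactly in NumPy (a simplified illustration of the procedure, not the DeepCell-Spots implementation; the uniform 1/2 prior cancels as in Equations 8 and 9, and the clipping guard against degenerate rates is our own addition):

```python
import numpy as np

def consensus_em(x, tpr0, fpr0, n_iter=100, tol=1e-6):
    """Fit the generative model for consensus spot annotation with EM.

    x    : (n_annotators, n_clusters) binary detection information matrix
    tpr0 : initial guesses for each annotator's true-positive rate
    fpr0 : initial guesses for each annotator's false-positive rate
    Returns per-cluster p(TP) and the converged TPR/FPR estimates.
    """
    tpr = np.asarray(tpr0, dtype=float)
    fpr = np.asarray(fpr0, dtype=float)
    for _ in range(n_iter):
        # E-step: likelihood of each cluster's detection pattern under
        # z_c = 1 and z_c = 0 (Equation 2), normalized as in Equation 9.
        lik_true = np.prod(tpr[:, None] ** x * (1 - tpr[:, None]) ** (1 - x), axis=0)
        lik_false = np.prod(fpr[:, None] ** x * (1 - fpr[:, None]) ** (1 - x), axis=0)
        p_tp = lik_true / (lik_true + lik_false)

        # Expected counts per annotator, summed over clusters.
        e_tp = (x * p_tp).sum(axis=1)        # detected a true spot
        e_fn = ((1 - x) * p_tp).sum(axis=1)  # missed a true spot
        e_fp = (x * (1 - p_tp)).sum(axis=1)  # detected a false spot
        e_tn = ((1 - x) * (1 - p_tp)).sum(axis=1)

        # M-step: maximum likelihood updates (Equations 10 and 11), clipped
        # to keep the rates away from exactly 0 or 1.
        new_tpr = np.clip(e_tp / (e_tp + e_fn), 1e-4, 1 - 1e-4)
        new_fpr = np.clip(e_fp / (e_fp + e_tn), 1e-4, 1 - 1e-4)
        converged = (np.abs(new_tpr - tpr).max() < tol
                     and np.abs(new_fpr - fpr).max() < tol)
        tpr, fpr = new_tpr, new_fpr
        if converged:
            break
    return p_tp, tpr, fpr
```

On simulated annotations drawn from known rates, the recovered p(TP) separates true from false clusters and the rate estimates land near the generating values.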
To construct a training dataset for Polaris’ spot detection
model, we applied this consensus training data construction
method to images from various spatial transcriptomics assays.
We assembled a set of representative images from sequential
fluorescence in situ hybridization (seqFISH) images, multiplexed error-robust FISH (MERFISH) images,26,27 and images of viral RNA labeled with a repeating peptide array known as SunTag.28
We use these consensus annotations to train a deep-learning
model for spot detection.
To train our spot detection model, we frame the problem as a
classification and regression task (Figure 1B). For each pixel, we
seek to predict whether that pixel contains a spot and compute
the distance from the pixel to the nearest spot centroid. To train
our model with our consensus spot labels, the coordinate spot
locations are converted into two image types: (1) an image con-
taining one-hot-encoded spot locations and (2) regression im-
ages encoding the sub-pixel distance to the nearest spot in the
x and y directions (Figure 1B). The deep-learning model returns two output images from a given input image: (1) the pixel-wise probability of a spot and (2) x- and y-regression images (Figure 1C). Our model is trained with a weighted cross entropy
and a custom mean squared error loss (e.g., computed only in
a neighborhood around each spot) for the two output images.
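The conversion from coordinate labels to the two target image types can be sketched as follows (an illustrative reimplementation, not the DeepCell-Spots code; the real training additionally restricts the regression loss to a neighborhood around each spot):

```python
import numpy as np

def make_training_targets(coords, shape):
    """Convert consensus spot coordinates into training target images:
    a one-hot spot-location image plus y/x regression images holding the
    signed sub-pixel offset to the nearest spot centroid.
    """
    coords = np.asarray(coords, dtype=float)
    onehot = np.zeros(shape)
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    # Distance from every pixel to every spot; keep the nearest spot.
    d = np.hypot(yy[..., None] - coords[:, 0], xx[..., None] - coords[:, 1])
    nearest = d.argmin(axis=-1)
    dy = coords[nearest, 0] - yy  # signed y offset to the nearest spot
    dx = coords[nearest, 1] - xx  # signed x offset to the nearest spot
    for y, x in coords:
        onehot[int(round(y)), int(round(x))] = 1  # one-hot encoded location
    return onehot, dy, dx
```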
For our model architecture, we utilize FeatureNets, a family of
models that are parameterized by their receptive field.14 We
perform hyperparameter optimization experiments to find the
optimal receptive field size of 13 (Figure S5). To return the loca-
tion of the spots in an image, we use maximum intensity filtering
to detect the local maxima in the spot probability image and use
the regression images to update the coordinates of each spot to
achieve sub-pixel resolution.
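This post-processing step can be sketched in plain NumPy (a minimal illustration of the described maximum filtering and sub-pixel refinement; the 3 × 3 window and probability threshold are assumptions):

```python
import numpy as np

def spots_from_model_output(prob, dy, dx, threshold=0.9):
    """Recover sub-pixel spot coordinates from the model's output images.

    prob  : pixel-wise spot probability image
    dy/dx : regression images of signed offsets to the nearest spot centroid
    """
    # 3x3 maximum filter built from shifted views of the padded image.
    padded = np.pad(prob, 1, constant_values=-np.inf)
    h, w = prob.shape
    neigh = np.max([padded[i:i + h, j:j + w]
                    for i in range(3) for j in range(3)], axis=0)
    # A pixel is a spot candidate if it is the local maximum of the
    # probability image and exceeds the detection threshold.
    ys, xs = np.nonzero((prob == neigh) & (prob > threshold))
    # Refine each candidate with the predicted sub-pixel offsets.
    return np.stack([ys + dy[ys, xs], xs + dx[ys, xs]], axis=1)
```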
We demonstrate Polaris’ spot detection capabilities on held-
out experimentally generated images. Visual inspection showed
that our model generalized to out-of-distribution, spot-like data
generated by various spatial transcriptomics assays, such as
ISS7 and split-probe multiplexed FISH (split-FISH) images11 (Figure S2). Additionally, we used held-out images to quantify the
agreement between Polaris and the classical methods used to
create our consensus training data. Agreement between sets
of detected spots was determined with a mutual nearest neigh-
bors matching method (Figure S3). We observed higher agreement between Polaris and the classical methods than exists
among the classical methods themselves. This analysis demonstrates that Polaris' learning of consensus labels generalizes to images held out from the training dataset (Figure S4).
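The mutual nearest neighbors matching used for this comparison can be sketched as follows (an illustrative version, not the deepcell-spots metrics code; the 2-pixel match radius is an assumption):

```python
import numpy as np

def mutual_nearest_matches(a, b, max_dist=2.0):
    """Match two sets of spot coordinates by mutual nearest neighbors.

    a, b : (n, 2) arrays of (y, x) spot coordinates from two detectors
    Returns an (m, 2) array of index pairs that chose each other and lie
    within the match radius.
    """
    # Pairwise Euclidean distances between the two sets of detections.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    nearest_in_b = d.argmin(axis=1)  # for each spot in a
    nearest_in_a = d.argmin(axis=0)  # for each spot in b
    # Keep only pairs that are each other's nearest neighbor.
    i = np.arange(len(a))
    mutual = (nearest_in_a[nearest_in_b] == i) & (d[i, nearest_in_b] <= max_dist)
    return np.stack([i[mutual], nearest_in_b[mutual]], axis=1)
```

Agreement between two detectors can then be summarized as, e.g., the fraction of detections that found a mutual match.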
The ambiguity of ground-truth annotations for experimental
data presents challenges for quantitatively benchmarking spot
detection methods. To evaluate the accuracy of our approach,
we followed prior work by simulating spot images, which have
unambiguous ground-truth spot locations.29,30 When accurate
simulation of experimental data is possible, simulations remove
the need for unambiguous ground-truth annotations for bench-
marking. We note that our spot simulations add signal on top
of autofluorescence images. Because we control the image gen-
eration, we can explore model performance as a function of im-
age difficulty by tuning parameters, such as the spot density and
signal-to-noise ratio. Benchmarking on simulated data demon-
strated that our method outperforms models trained with either
simulated data or data labeled with a single classical algorithm.
We found that this performance gap held across the tested range
of spot intensity and density (Figures S6A and S6B). We
concluded that the consensus annotations more accurately cap-
ture the ground-truth locations of spots in training images than
any single classical algorithm and that there is significant value
to training with experimentally generated images. We also found
that Polaris’ spot detection model outperforms other recently
published spot detection methods when evaluated on these
simulated data, demonstrating greater robustness to varying spot density and signal-to-noise ratios (Figures S6C and S6D).
The combination of benchmarking on simulated data, visual in-
spection, and analysis of inter-algorithm agreement led us to
conclude that Polaris can accurately perform spot detection on
a diverse array of challenging single-molecule images.
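A spot simulation of this style can be sketched as follows (our own illustration with assumed parameter values; as noted above, the actual benchmarking added signal on top of experimental autofluorescence images rather than the flat background used here):

```python
import numpy as np

def simulate_spot_image(shape=(64, 64), n_spots=20, intensity=5.0,
                        sigma=1.2, noise=1.0, seed=0):
    """Simulate a fluorescent spot image with known ground-truth locations.

    Tuning n_spots controls spot density; intensity/noise set the
    signal-to-noise ratio, so image difficulty can be swept systematically.
    """
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0, [shape[0], shape[1]], size=(n_spots, 2))
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    img = np.zeros(shape)
    for y, x in coords:
        # Each transcript renders as an isotropic 2D Gaussian PSF.
        img += intensity * np.exp(-((yy - y) ** 2 + (xx - x) ** 2)
                                  / (2 * sigma ** 2))
    img += rng.normal(0, noise, shape)  # background/camera noise
    return img, coords
```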
Polaris packages this model into an analysis pipeline imple-
mented in Python for multiplexed spatial transcriptomics data.
Polaris integrates multiple analysis steps to yield coordinate spot
locations with assigned gene identities (Figure 2A). First, it utilizes
classical computer vision methods for image alignment and per-
forms spot detection on images across all staining rounds. Then,
cell segmentation is performed with models from DeepCell
Figure 2. Polaris produces single-cell, spatial gene expression maps for multiplex spatial transcriptomics images
(A and B) Analysis steps for Polaris for singleplex (red) and multiplex (blue) spatial transcriptomics imaging data. Sequential steps are linked with an arrow, and associated methods and data types are linked with a solid line. Deep-learning models perform spot detection and cell segmentation, whereas a probabilistic graphical model infers gene identities.
(C) A probabilistic graphical model for inferring gene identities from spot detections. This model consists of a mixture of K relaxed Bernoulli distributions, parameterized by their probability, α, and their temperature, λ, for generating observed data, x, with a sample size of n spots. Shaded vertices represent observed random variables, empty vertices represent latent random variables, edges signify conditional dependency, rectangles (‘‘plates’’) represent independent replication, and small solid dots represent deterministic parameters.
(D) Spatial organization of marker gene locations in a mouse ileum tissue sample. Each spot corresponds to a decoded transcript for a cell type marker gene. Whole-cell segmentation was performed with Mesmer.16
(E–G) Locations of decoded genes in an example Goblet cell, enterocyte, and B cell, respectively.
software library.16 Polaris' spot detection model predicts the pixel-wise spot probability for each imaging round. For multiplexed spatial transcriptomics images, Polaris considers a codebook of up to thousands of barcodes that define the rounds and colors of fluorescent staining for each gene. To assign gene identities for barcoded spatial transcriptomics images, we fit a graphical model of a mixture of relaxed Bernoulli distributions to the pixel-wise probability values with variational inference31,32 (Figures 2B and 2C). This model estimates the characteristic relaxed Bernoulli distributions of pixel values for ‘‘spots’’ and ‘‘background’’ in each imaging round, which may vary due to factors such as fluorophore identity, staining efficiency, or image normalization. These distributions are used in combination with the experimental codebook to estimate the probability of each barcode identity and ultimately assign spots to a gene identity or background.
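A heavily simplified version of this decoding step can be sketched as follows (our own illustration: it scores thresholded per-round probabilities with plain Bernoulli likelihoods under assumed on/off detection rates, whereas Polaris fits relaxed Bernoulli distributions by variational inference):

```python
import numpy as np

def decode_barcodes(spot_probs, codebook, p_on=0.9, p_off=0.05):
    """Assign each detected spot a gene identity from a binary codebook.

    spot_probs : (n_spots, n_rounds) per-round pixel-wise spot probabilities
    codebook   : (n_genes, n_rounds) binary barcodes (rounds/colors per gene)
    p_on/p_off : assumed detection rates for 'on' and 'off' barcode rounds
    """
    obs = (spot_probs > 0.5).astype(float)  # threshold each round's readout
    # Log-likelihood contribution of each round if that round is 'on' vs. 'off'.
    log_on = np.where(obs == 1, np.log(p_on), np.log(1 - p_on))
    log_off = np.where(obs == 1, np.log(p_off), np.log(1 - p_off))
    # Total log-likelihood of every (spot, gene) pair; the codebook selects
    # which rounds contribute the on vs. off term.
    ll = log_on @ codebook.T + log_off @ (1 - codebook).T
    return ll.argmax(axis=1)  # most likely gene index per spot
```

A real decoder would also model dropout explicitly and allow a "background" assignment when no barcode fits well.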
As with our benchmarking of spot detection methods, we used
simulated data to benchmark the performance of our barcode
assignment method quantitatively. Here, simulated data allowed
us to explore our method’s dependency on spot dropout. This
event can occur due to labeling failure, image quality, or failure
in spot detection. Regardless of origin, the presence of dropout
imposes a robustness constraint on the gene decoding method-
ology because decoding schemes robust to dropout would bet-
ter tolerate labeling and spot detection model failures. Our
benchmarking of spot decoding with simulated data demon-
strates that decoding with a generative model based on the
relaxed Bernoulli distribution was more robust to dropout than
other benchmarked methods (Figure S7).
We demonstrated Polaris’ performance on a variety of previ-
ously published data: a MERFISH experiment in a mouse ileum
tissue sample27 (Figures 2D–2F), a MERFISH experiment in a mouse kidney tissue sample33 (Figure S8A), a seqFISH experiment in cultured macrophages (Figure S8B), and an ISS experiment of a pooled CRISPR library in HeLa cells34 (Figure S10). We
found that Polaris detected marker genes from expected cell
types—even in areas with high cell density and heterogeneous
cell morphologies in tissue samples. These results highlight the
power of spatial transcriptomics methods to quantify gene
expression while retaining multicellular and subcellular spatial
organization.
For both tissue MERFISH datasets, we found that Polaris’
output gene expression counts have similar correlations with
bulk sequencing data as the original analyses (r = 0.796 and
r = 0.683; r = 0.537 and r = 0.565), and for both datasets, the
two analysis outputs were highly correlated (r = 0.936; r =
0.911) (Figures S9A–S9D). For the cell culture seqFISH dataset,
Polaris’ output has a similar correlation with bulk sequencing
data as the output of the original analysis tool (r = 0.809 and
r = 0.694), and the two outputs are highly correlated (r = 0.910)
(Figures S9E and S9F). For the ISS dataset, the barcode counts
quantified with Polaris and the original analysis34 are highly correlated (r = 0.946), with Polaris consistently yielding higher counts (Figure S10). Spatial transcriptomics methods often
encounter overdispersion in measuring gene expression counts,
potentially limiting the efficacy of comparing these counts using
a linear regression model.33,35 Despite this limitation, these results demonstrate that Polaris can generalize across sample
types, imaging platforms, and spatial transcriptomics assays
without manual parameter tuning.
DISCUSSION
We sought to create a key computational primitive for spot
detection and an integrated, open-source pipeline for image-
based spatial transcriptomics. Our weakly supervised deep-
learning model for spot detection provides a universal spot
detection method for image-based spatial transcriptomics
data. Our training data generation methodology effectively
tackles a fundamental data engineering challenge in generating
annotations for supervised spot detection methods, surpassing
the performance achieved by using simulated data or a single
classical method. Polaris packages this model and others into
a unified pipeline that takes users from raw data to interpretable
spatial gene expression maps with single-cell resolution. We
believe that Polaris will help standardize the computational
aspect of image-based spatial transcriptomics, reduce the
amount of time required to go from raw data to insights, and facilitate scaling analyses to larger datasets. Polaris' outputs are
compatible with downstream bioinformatics tools, such as
Squidpy36 and Seurat.37 Polaris is available for academic use through the DeepCell software library (https://github.com/vanvalenlab/deepcell-spots) and as a Python package distributed on PyPI (https://pypi.org/project/DeepCell-Spots/). A singleplex deployment of the pipeline is available through the DeepCell web portal (https://deepcell.org).
STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:
- KEY RESOURCES TABLE
- RESOURCE AVAILABILITY
  - Lead contact
  - Materials availability
  - Data and code availability
- EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS
  - Cell lines and cell culture
- METHOD DETAILS
  - Generation of seqFISH spot training data
  - Multiplexed seqFISH in cultured macrophages
- QUANTIFICATION AND STATISTICAL ANALYSIS
  - Construction of spot training data
  - Model benchmarking with simulated data
  - Spot detection model architecture
  - Multiplex FISH analysis pipeline
  - Spatial Genomics seqFISH image analysis
SUPPLEMENTAL INFORMATION
Supplemental information can be found online at https://doi.org/10.1016/j.cels.2024.04.006.
ACKNOWLEDGMENTS
We thank Lior Pachter, Barbara Englehardt, Sami Farhi, Ross Barnowski, and
the other members of the Van Valen lab for useful feedback and interesting dis-
cussions. We thank Nico Pierson and Jonathan White for contributing data and
providing early annotations. The HeLa cell line was used in this research. Hen-
rietta Lacks and the HeLa cell line established from her tumor cells without her
knowledge or consent in 1951 have made significant contributions to scientific
progress and advances in human health. We are grateful to Henrietta Lacks,
now deceased, and her surviving family members for their contributions to
biomedical research. This work was supported by awards from the Shurl
and Kay Curci Foundation (to D.V.V.), the Rita Allen Foundation (to D.V.V.), the
Susan E Riley Foundation (to D.V.V.), the Pew-Stewart Cancer Scholars pro-
gram (to D.V.V.), the Gordon and Betty Moore Foundation (to D.V.V.), the
Schmidt Academy for Software Engineering (to T.D.), the Michael J. Fox Foundation through the Aligning Science Across Parkinson's consortium (to D.V.V.),
the Heritage Medical Research Institute (to D.V.V.), the NIH New Innovator pro-
gram (DP2-GM149556) (to D.V.V.), and an HHMI Freeman Hrabowski Scholar
award (to D.V.V.).
AUTHOR CONTRIBUTIONS
E.L., N.R., and D.V.V. conceived the project. E.L., N.R., and D.V.V. conceived the weakly supervised deep-learning method for spot detection. E.L. and E.P. created the seqFISH training data for the spot detection model. L.O. contributed to the seqFISH protocol used to create training data. E.L. developed software for training data annotation. E.L. curated and annotated the training data for the spot detection model. N.R. and E.L. developed the spot detection model training software. E.L. trained the models. E.L. and N.R. developed the metrics software for
the spot detection model. X.W., E.L., Y.Y., and D.V.V. conceived the combinato-
rial barcode assignment method. E.L. and X.W. developed the barcode assign-
ment software, with input from Y.Y. and D.V.V. E.L. developed the multiplex im-
age analysis pipeline. E.L. and X.W. benchmarked the multiplex image analysis
pipeline. E.L. and T.D. developed the cloud deployment. R.J.X. and J.R.M.
collected and analyzed MERFISH data. E.L. and E.P. created the macrophage
seqFISH dataset. W.G. and D.V.V. oversaw software engineering for Polaris.
Y.Y. and D.V.V. oversaw the algorithm development for the project. E.L. and
D.V.V. wrote the manuscript, with input from all authors. D.V.V. supervised the
project.
DECLARATION OF INTERESTS
D.V.V. is a co-founder of Barrier Biosciences and holds equity in the company.
D.V.V., E.L., and N.R. filed a patent for weakly supervised deep learning for
spot detection. J.R.M. is co-founder and scientific advisor to Vizgen and holds
equity in the company. J.R.M. is an inventor of patents related to MERFISH
filed on his behalf by Harvard University and Boston Children’s Hospital.
REFERENCES
1. Asp, M., Bergenstråhle, J., and Lundeberg, J. (2020). Spatially Resolved Transcriptomes-Next Generation Tools for Tissue Exploration. BioEssays 42, e1900221. https://doi.org/10.1002/bies.201900221.
2. Zhang, L., Chen, D., Song, D., Liu, X., Zhang, Y., Xu, X., and Wang, X. (2022). Clinical and translational values of spatial transcriptomics. Signal Transduct. Target. Ther. 7, 111. https://doi.org/10.1038/s41392-022-00960-w.
3. Ståhl, P.L., Salmén, F., Vickovic, S., Lundmark, A., Navarro, J.F., Magnusson, J., Giacomello, S., Asp, M., Westholm, J.O., Huss, M., et al. (2016). Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353, 78–82. https://doi.org/10.1126/science.aaf2403.
4. Vickovic, S., Eraslan, G., Salmén, F., Klughammer, J., Stenbeck, L., Schapiro, D., Äijö, T., Bonneau, R., Bergenstråhle, L., Navarro, J.F., et al. (2019). High-definition spatial transcriptomics for in situ tissue profiling. Nat. Methods 16, 987–990. https://doi.org/10.1038/s41592-019-0548-y.
5. Rodriques, S.G., Stickels, R.R., Goeva, A., Martin, C.A., Murray, E., Vanderburg, C.R., Welch, J., Chen, L.M., Chen, F., and Macosko, E.Z. (2019). Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467. https://doi.org/10.1126/science.aaw1219.
6. Stickels, R.R., Murray, E., Kumar, P., Li, J., Marshall, J.L., Di Bella, D.J., Arlotta, P., Macosko, E.Z., and Chen, F. (2021). Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. 39, 313–319. https://doi.org/10.1038/s41587-020-0739-1.
7. Ke, R., Mignardi, M., Pacureanu, A., Svedlund, J., Botling, J., Wählby, C., and Nilsson, M. (2013). In situ sequencing for RNA analysis in preserved tissue and cells. Nat. Methods 10, 857–860. https://doi.org/10.1038/nmeth.2563.
8. Chen, K.H., Boettiger, A.N., Moffitt, J.R., Wang, S., and Zhuang, X. (2015). RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090. https://doi.org/10.1126/science.aaa6090.
9. Codeluppi, S., Borm, L.E., Zeisel, A., La Manno, G., van Lunteren, J.A., Svensson, C.I., and Linnarsson, S. (2018). Spatial organization of the somatosensory cortex revealed by osmFISH. Nat. Methods 15, 932–935. https://doi.org/10.1038/s41592-018-0175-z.
10. Eng, C.L., Lawson, M., Zhu, Q., Dries, R., Koulena, N., Takei, Y., Yun, J., Cronin, C., Karp, C., Yuan, G.-C., and Cai, L. (2019). Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH+. Nature 568, 235–239. https://doi.org/10.1038/s41586-019-1049-y.
11. Goh, J.J.L., Chou, N., Seow, W.Y., Ha, N., Cheng, C.P.P., Chang, Y.C., Zhao, Z.W., and Chen, K.H. (2020). Highly specific multiplexed RNA imaging in tissues with split-FISH. Nat. Methods 17, 689–693. https://doi.org/10.1038/s41592-020-0858-0.
12. Axelrod, S., Cai, M., Carr, A., Freeman, J., Ganguli, D., Kiggins, J., Long, B., Tung, T., and Yamauchi, K. (2021). Starfish: Scalable Pipelines for Image-Based Transcriptomics. JOSS 6, 2440. https://doi.org/10.21105/joss.02440.
13. Cisar, C., Keener, N., Ruffalo, M., and Paten, B. (2023). A unified pipeline for FISH spatial transcriptomics. Cell Genomics 3, 100384. https://doi.org/10.1016/j.xgen.2023.100384.
14. Van Valen, D.A., Kudo, T., Lane, K.M., Macklin, D.N., Quach, N.T., DeFelice, M.M., Maayan, I., Tanouchi, Y., Ashley, E.A., and Covert, M.W. (2016). Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments. PLoS Comput. Biol. 12, e1005177. https://doi.org/10.1371/journal.pcbi.1005177.
15. Stringer, C., Wang, T., Michaelos, M., and Pachitariu, M. (2021). Cellpose: A generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106. https://doi.org/10.1038/s41592-020-01018-x.
16. Greenwald, N.F., Miller, G., Moen, E., Kong, A., Kagel, A., Dougherty, T., Fullaway, C.C., McIntosh, B.J., Leow, K.X., Schwartz, M.S., et al. (2022). Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. 40, 555–565. https://doi.org/10.1038/s41587-021-01094-0.
17. Pachitariu, M., and Stringer, C. (2022). Cellpose 2.0: how to train your own model. Nat. Methods 19, 1634–1641. https://doi.org/10.1038/s41592-022-01663-4.
18. Mabaso, M.A., Withey, D.J., and Twala, B. (2018). Spot detection methods in fluorescence microscopy imaging: A review. Image Anal. Stereol. 37, 173–190. https://doi.org/10.5566/ias.1690.
19. van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager, N., Gouillart, E., and Yu, T.; scikit-image contributors (2014). scikit-image: image processing in Python. PeerJ 2, e453. https://doi.org/10.7717/peerj.453.
20. Allan, D.B., Caswell, T., Keim, N.C., van der Wel, C.M., and Verweij, R.W. (2021). soft-matter/trackpy: Trackpy v0.5.0. Zenodo. https://doi.org/10.5281/zenodo.4682814.
21. Gudla, P.R., Nakayama, K., Pegoraro, G., and Misteli, T. (2017). SpotLearn: Convolutional Neural Network for Detection of Fluorescence In Situ Hybridization (FISH) Signals in High-Throughput Imaging Approaches. Cold Spring Harb. Symp. Quant. Biol. 82, 57–70. https://doi.org/10.1101/sqb.2017.82.033761.
22. Eichenberger, B.T., Zhan, Y., Rempfler, M., Giorgetti, L., and Chao, J.A. (2021). deepBlink: threshold-independent detection and localization of diffraction-limited spots. Nucleic Acids Res. 49, 7292–7297. https://doi.org/10.1093/nar/gkab546.
23. Wollmann, T., and Rohr, K. (2021). Deep Consensus Network: Aggregating predictions to improve object detection in microscopy images. Med. Image Anal. 70, 102019. https://doi.org/10.1016/j.media.2021.102019.
Cell Systems 15, 475–482, May 15, 2024
24. Ratner, A., Bach, S.H., Ehrenberg, H., Fries, J., Wu, S., and Ré, C. (2020). Snorkel: rapid training data creation with weak supervision. VLDB J. 29, 709–730. https://doi.org/10.1007/s00778-019-00552-1.
25. Moon, T.K. (1996). The expectation-maximization algorithm. IEEE Signal Process. Mag. 13, 47–60. https://doi.org/10.1109/79.543975.
26. Moffitt, J.R., Hao, J., Wang, G., Chen, K.H., Babcock, H.P., and Zhuang, X. (2016). High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization. Proc. Natl. Acad. Sci. USA 113, 11046–11051. https://doi.org/10.1073/pnas.1612826113.
27. Petukhov, V., Xu, R.J., Soldatov, R.A., Cadinu, P., Khodosevich, K., Moffitt, J.R., and Kharchenko, P.V. (2022). Cell segmentation in imaging-based spatial transcriptomics. Nat. Biotechnol. 40, 345–354. https://doi.org/10.1038/s41587-021-01044-w.
28. Boersma, S., Rabouw, H.H., Bruurs, L.J.M., Pavlović, T., van Vliet, A.L.W., Beumer, J., Clevers, H., van Kuppeveld, F.J.M., and Tanenbaum, M.E. (2020). Translation and replication dynamics of single RNA viruses. Cell 183, 1930–1945.e23. https://doi.org/10.1016/j.cell.2020.10.019.
29. Smal, I., Loog, M., Niessen, W., and Meijering, E. (2010). Quantitative comparison of spot detection methods in fluorescence microscopy. IEEE Trans. Med. Imaging 29, 282–301. https://doi.org/10.1109/TMI.2009.2025127.
30. Ruusuvuori, P., Äijö, T., Chowdhury, S., Garmendia-Torres, C., Selinummi, J., Birbaumer, M., Dudley, A.M., Pelkmans, L., and Yli-Harja, O. (2010). Evaluation of methods for detection of fluorescence labeled subcellular objects in microscope images. BMC Bioinformatics 11, 248. https://doi.org/10.1186/1471-2105-11-248.
31. Hoffman, M.D., Blei, D.M., Wang, C., and Paisley, J. (2013). Stochastic Variational Inference. Preprint at arXiv 14, 1303–1347. https://doi.org/10.48550/ARXIV.1206.7051.
32. Gataric, M., Park, J.S., Li, T., Vaskivskyi, V., Svedlund, J., Strell, C., Roberts, K., Nilsson, M., Yates, L.R., Bayraktar, O., and Gerstung, M. (2021). PoSTcode: Probabilistic image-based spatial transcriptomics decoder. Preprint at bioRxiv. https://doi.org/10.1101/2021.10.12.464086.
33. Liu, J., Tran, V., Vemuri, V.N.P., Byrne, A., Borja, M., Kim, Y.J., Agarwal, S., Wang, R., Awayan, K., Murti, A., et al. (2023). Concordance of MERFISH spatial transcriptomics with bulk and single-cell RNA sequencing. Life Sci. Alliance 6, e202201701. https://doi.org/10.26508/lsa.202201701.
34. Feldman, D., Singh, A., Schmid-Burgk, J.L., Carlson, R.J., Mezger, A., Garrity, A.J., Zhang, F., and Blainey, P.C. (2019). Optical pooled screens in human cells. Cell 179, 787–799.e17. https://doi.org/10.1016/j.cell.2019.09.016.
35. Zhao, P., Zhu, J., Ma, Y., and Zhou, X. (2022). Modeling zero inflation is not necessary for spatial transcriptomics. Genome Biol. 23, 118. https://doi.org/10.1186/s13059-022-02684-0.
36. Palla, G., Spitzer, H., Klein, M., Fischer, D., Schaar, A.C., Kuemmerle, L.B., Rybakov, S., Ibarra, I.L., Holmberg, O., Virshup, I., et al. (2022). Squidpy: a scalable framework for spatial omics analysis. Nat. Methods 19, 171–178. https://doi.org/10.1038/s41592-021-01358-2.
37. Satija, R., Farrell, J.A., Gennert, D., Schier, A.F., and Regev, A. (2015). Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502. https://doi.org/10.1038/nbt.3192.
38. Beliveau, B.J., Kishi, J.Y., Nir, G., Sasaki, H.M., Saka, S.K., Nguyen, S.C., Wu, C.T., and Yin, P. (2018). OligoMiner provides a rapid, flexible environment for the design of genome-scale oligonucleotide in situ hybridization probes. Proc. Natl. Acad. Sci. USA 115, E2183–E2192. https://doi.org/10.1073/pnas.1714530115.
39. Lionnet, T., Czaplinski, K., Darzacq, X., Shav-Tal, Y., Wells, A.L., Chao, J.A., Park, H.Y., de Turris, V., Lopez-Jones, M., and Singer, R.H. (2011). A transgenic mouse for in vivo detection of endogenous labeled mRNA. Nat. Methods 8, 165–170. https://doi.org/10.1038/nmeth.1551.
40. Thompson, R.E., Larson, D.R., and Webb, W.W. (2002). Precise nanometer localization analysis for individual fluorescent probes. Biophys. J. 82, 2775–2783. https://doi.org/10.1016/S0006-3495(02)75618-X.
41. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., et al. (2016). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. Preprint at arXiv. https://doi.org/10.48550/ARXIV.1603.04467.
STAR+METHODS

RESOURCE AVAILABILITY

Lead contact
Further information and requests should be directed to and will be fulfilled by the Lead Contact, David Van Valen (vanvalen@caltech.edu).

Materials availability
This study did not generate new materials.

KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER

Chemicals, peptides, and recombinant proteins
Dextran sulfate | Calbiochem | 3710-50GM
Saline-sodium citrate | IBI Scientific | IB72010
Formamide | Bio Basic | FB0211
Triton-X | Sigma-Aldrich | 10789704001
Ethylene carbonate | Sigma-Aldrich | E26258
Tris-HCl | RPI | T60050
(±)-6-Hydroxy-2,5,7,8-tetramethylchromane-2-carboxylic acid | Sigma-Aldrich | 238813
Catalase | Sigma-Aldrich | C3155
Glucose oxidase | Sigma-Aldrich | G2133
D-glucose | Sigma-Aldrich | G7528
Fibronectin Bovine Protein | Fisher Scientific | 33010018
Formaldehyde | Thermo Fisher Scientific | 28908
Phorbol 12-myristate 13-acetate | Sigma-Aldrich | P8139
Lipopolysaccharide | Sigma-Aldrich | L4524

Critical commercial assays
Gene Positioning System | Spatial Genomics | Custom order

Deposited data
SpotNet | This paper | https://deepcell.readthedocs.io/en/master/data-gallery

Experimental models: Cell lines
HeLa | ATCC | Cat# CCL-2; RRID: CVCL_0030
THP-1 | ATCC | Cat# TIB-202; RRID: CVCL_0006

Oligonucleotides
Full list of ssDNA probes is available in Table S1 | IDT | Custom order

Software and algorithms
Python | https://www.python.org/ | N/A
OligoMiner package | Beliveau et al. (ref. 38) | https://github.com/beliveau-lab/OligoMiner
skimage package | van der Walt et al. (ref. 19) | https://scikit-image.org/
trackpy package | Allan et al. (ref. 20) | https://pypi.org/project/trackpy/
TensorFlow package | Abadi et al. (ref. 41) | https://www.tensorflow.org/
Matlab | MathWorks | https://www.mathworks.com
Airlocalize package | Lionnet et al. (ref. 39) | https://github.com/timotheelionnet/AIRLOCALIZE
Mesmer, DeepCell library | Greenwald et al. (ref. 16) | https://github.com/vanvalenlab/deepcell-tf/tree/master
Polaris | This paper | https://doi.org/10.5281/zenodo.10823305
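Among the packages listed above, skimage and trackpy provide the classical spot-detection baselines. As a point of reference for readers unfamiliar with these baselines, the sketch below runs Laplacian-of-Gaussian detection with scikit-image on a synthetic image; the image, spot positions, and threshold are illustrative assumptions, not data or parameters from this study.

```python
import numpy as np
from skimage.feature import blob_log

# Synthetic 64x64 image: low-amplitude Gaussian noise plus two
# diffraction-limited-like spots (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
image = rng.normal(0.0, 0.01, size=(64, 64))
yy, xx = np.mgrid[0:64, 0:64]
for cy, cx in [(20, 20), (45, 40)]:
    image += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 1.5**2))

# Scale-normalized LoG detection; min/max sigma should bracket the
# expected spot radius, and the threshold is tuned per dataset.
blobs = blob_log(image, min_sigma=1, max_sigma=3, num_sigma=5, threshold=0.1)
# Each row of `blobs` is (y, x, sigma) for one detected spot.
print(blobs)
```

The sensitivity of classical detectors to the `threshold` choice is the parameter-tuning burden that learned detectors such as Polaris aim to remove.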