Research Article | Biomedical Optics Express, Vol. 15, No. 10, 1 Oct 2024
Efficient, gigapixel-scale, aberration-free whole slide scanner using angular ptychographic imaging with closed-form solution

Shi Zhao, Haowen Zhou, Siyu (Steven) Lin, Ruizhi Cao, and Changhuei Yang*
Department of Electrical Engineering, California Institute of Technology, Pasadena, California 91125, USA
These authors contributed equally to this work
*chyang@caltech.edu
Abstract:
Whole slide imaging provides a wide field-of-view (FOV) across cross-sections of biopsy or surgical samples, significantly facilitating pathological analysis and clinical diagnosis. Such high-quality images, which enable detailed visualization of cellular and tissue structures, are essential for effective patient care and treatment planning. To obtain such high-quality images for pathology applications, there is a need for scanners with high spatial-bandwidth products that are free from aberrations and do not require z-scanning. Here we report a whole slide imaging system based on angular ptychographic imaging with a closed-form solution (WSI-APIC), which offers efficient, tens-of-gigapixels, large-FOV, aberration-free imaging. WSI-APIC utilizes oblique incoherent illumination for initial high-level segmentation, thereby bypassing unnecessary scanning of the background regions and enhancing image acquisition efficiency. A GPU-accelerated APIC algorithm analytically reconstructs phase images with effective digital aberration correction and improved optical resolution. Moreover, an auto-stitching technique based on the scale-invariant feature transform ensures the seamless concatenation of whole slide phase images. In our experiment, WSI-APIC achieved an optical resolution of 772 nm using a 10×/0.25 NA objective lens and captured 80-gigapixel aberration-free phase images for a standard 76.2 mm × 25.4 mm microscope slide.
© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
https://doi.org/10.1364/BOE.538148
Received 30 Jul 2024; revised 27 Aug 2024; accepted 27 Aug 2024; published 6 Sep 2024
1. Introduction
Whole slide imaging (WSI) digitizes tissue sections on microscope slides into high-resolution
images that can be stored, analyzed, and shared, thereby advancing the shift from traditional
pathology to digital diagnosis [1,2]. Recently, the advent of digital image analysis using deep
learning has transformed digital pathology in numerous clinical and biological applications [3–5].
WSI plays a significant role in revealing both local and global spatial tissue and cell interactions
[6,7], thereby providing crucial insights for clinical diagnosis.
These data-centric techniques heavily rely on the quality of digital images and the availability
of a large amount of data [2,8]. Current WSI platforms face a significant technical challenge
in meeting the growing demand for high-quality and high-resolution images within the digital
pathology community. Aberrations, including defocus, can cause image blur or distortion,
resulting in information loss or misrepresentations in downstream analysis [9,10]. Potential
solutions include the use of z-scanning and autofocusing systems to address defocus aberrations
and reduce information loss during imaging [11–14]. However, no autofocusing metric can
be robustly generalized to all types of samples [15]. Additionally, other aberration terms of
the optical system, such as field-dependent aberrations, cannot be corrected efficiently and
conveniently.
Aside from these technical challenges, the use of deep learning for digital pathology analysis has also increased the demand for large WSI data sets, so that the deep learning systems can
learn to read through the variations associated with individual slides. High quality and consistent
WSI imaging technology can provide an alternative pathway to address this large data demand
– high consistency in WSI images can cut down on extraneous variations and allow the deep
learning model to focus and train on truly relevant pathological features with much smaller
data sets. In the WSI pipeline, from sample preparations to image processing, the histological
staining procedure introduces the most variation, leading to significant discrepancies in digital
images [16]. To date, no staining protocol can effectively address these stain variations. A
promising approach to overcome this issue is to eliminate the staining process from the pipeline
and provide the structure and morphology of the samples with phase images. This is because
cellular structures are generally associated with refractive index distributions [17] that are directly
captured in phase images.
To achieve high-quality and high-resolution complex field imaging and address the aforemen-
tioned challenges, several advanced techniques can be considered, including digital holographic
microscopy [18–20], transport-of-intensity [21–24], coherent diffractive imaging [25,26], and
ptychographic imaging techniques [27–30]. These techniques are capable of providing amplitude-
phase information, which reflects changes in absorption and refractive index in tissue sections.
However, digital holographic microscopy necessitates a reference beam for interferometry and
requires prior knowledge of the system’s aberration for corrections [19]. Transport-of-intensity
methods use multiple images near the focal plane along the axial direction to retrieve phase
information. Unfortunately, the reliance on specific focal plane positions and boundary conditions limits their applicability in current WSI systems. Coherent diffractive imaging recovers phase
images from several axial diffraction patterns obtained from a coherent source; however, it suffers
from a reduction in optical resolution by a factor of two compared to conventional brightfield
microscopy.
Ptychographic imaging holds promise for digital pathology applications due to its ability to
perform high-resolution imaging by taking measurements with lateral overlaps and employing
numerical optimizations to compensate for aberrations and retrieve phase information. A compact
and efficient ptychographic imaging modality has been proposed and developed with phase
imaging and aberration correction capabilities [31]. Another promising technique is Fourier
ptychographic microscopy (FPM) [32–35]. FPM utilizes computational methods to achieve
high spatial-bandwidth-product imaging. It collects a series of low-resolution images under
oblique illuminations and uses an iterative algorithm to reconstruct high-resolution images
while maintaining a large field-of-view (FOV) with a small numerical aperture (NA) objective
lens. This technique can also determine and numerically correct aberrations, thereby directly
addressing defocus and system aberrations computationally [36]. However, both ptychographic
imaging and FPM are prone to failure under severe system aberrations, which often occur at the
peripheral regions of each FOV in WSI [37]. Moreover, the iterative reconstruction algorithm in FPM suffers from parameter-tuning challenges due to its nonconvex optimization nature, making it less robust for large-scale automated reconstructions across distinct pathological applications.
Angular ptychographic imaging with a closed-form method (APIC) addresses robustness
concerns in current ptychographic imaging reconstruction [38]. APIC utilizes the Kramers–Kronig
relations for complex field reconstruction [39–41] and employs analytical techniques to correct
aberrations while extending high spatial frequency content through darkfield measurements.
Using NA-matching and darkfield measurements, APIC achieves high-resolution, aberration-free
complex field retrieval with a low-magnification, large-FOV objective. This approach demonstrates exceptional robustness in aberration correction.
In this study, we introduced WSI-APIC (whole slide imaging based on APIC), a whole
slide scanner that provides efficient, large-FOV, aberration-free, complex field imaging at the
gigapixel-scale. This study aimed to adapt the implementation of WSI-APIC to digital pathology WSI applications through engineering design optimizations and algorithmic accelerations. The following sections detail our system pipeline and the numerical algorithms employed. Section 3 presents the validation and evaluation of our technique through experiments and analyses. Finally, we summarize the performance of our system and discuss potential improvements for future high-throughput digital pathology applications.

Fig. 1. General pipeline of the WSI-APIC system. (a) The sample-locating system auto-segments sample areas and generates a scanning mask for each slide. (b) The WSI-APIC microscope, equipped with customized LED illumination, scans the sample according to the scanning mask and numerically reconstructs high-resolution, aberration-free images of the complex field. (c) An auto-stitching system integrates the reconstructed images to produce the whole slide image. NA stands for numerical aperture.
2. Methods
A general pipeline of the WSI-APIC system is illustrated in Fig. 1. The system operates in
three steps. (a) A sample-locating system employs oblique incoherent illumination from a light
emitting diode (LED) lamp to perform initial high-level auto-segmentation of the sample area.
This process generates a scanning mask that helps avoid unnecessary scanning of background
regions. (b) The WSI-APIC system scans the sample areas of the slide according to the predefined
scanning mask. For each FOV, a series of images is captured under different illumination angles
provided by the LEDs. Reconstruction of these images is then performed using a GPU-accelerated
APIC algorithm. (c) An auto-stitching system automatically combines the reconstructed images
to produce a whole slide image.
2.1. Sample-locating system
In our study, the sample-locating system comprised an LED lamp, a sample holder, and a lens-
camera imaging component. This configuration enabled the pre-identification of approximate
sample locations, facilitating smart scanning during whole slide imaging. In this study, we
implemented a simple lens imaging system with oblique incoherent illumination from the LED
lamp, as shown in Fig. 1(a1). This setup generated a whole slide brightfield image, clearly
depicting the profile of the sample. It can be adapted to both stained and unstained histological
samples. For stained samples, the natural absorption of tissue areas provided image contrast to
distinguish between the background glass slide and sample content. For unstained samples, the
slight refractive index differences between tissue sections and the immersion medium resulted
in scattered light under oblique illumination. This approach, similar to oblique illumination
microscopy [42,43], provided image contrast that aided in differentiating the sample from the
background.
For our study, an edge-detection algorithm based on peak identification was initially applied to
identify the boundaries of the coverslip and slide. This was followed by cropping to eliminate
regions outside the coverslip, ensuring that the segmentation was focused solely on the areas
of interest within the slide (Fig. 1(a2)). Furthermore, this process ensured that the top-right corner of the cropped image, as well as that of the resulting sample mask, aligned precisely with the top-right corner of the coverslip, thereby establishing a reliable reference point for our subsequent
scanning. Subsequently, the cropped image, which conformed to the boundaries of the coverslip,
was processed using a pretrained segment anything model (SAM) [44] with zero-shot learning
capabilities. Although SAM had not been specifically trained on microscopic slide images,
it generated several moderately accurate masks. The imperfections in these masks might be attributed to spatial intensity variations within tissue structures. To obtain a precise sample mask, we further developed a mask-selection algorithm. First, we filtered out masks with low
accuracy ratings and those covering small segmented areas. Given the prior knowledge that
intensity variations within the sample were relatively small compared to the sample-background
interface, we calculated the mean and variance of intensities for all SAM-generated masks.
Based on these metrics, we automatically identified and selected the masks corresponding to the
sample region, merging them into a single mask representing the entire sample area. Since this
process involves no training, only inference, it takes less than 20 seconds for an input size of 1885 × 944 pixels, which is negligible compared to the time required for subsequent scanning and reconstruction.
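For concreteness, the following Python sketch shows one way such a mask-selection step could be implemented. The dictionary fields ('segmentation', 'predicted_iou') follow the output format of SAM's automatic mask generator, and the thresholds are illustrative assumptions rather than the values used in WSI-APIC.

```python
import numpy as np

def select_sample_mask(image, masks, min_score=0.8, min_area_frac=0.01,
                       var_ratio=0.5):
    """Merge SAM-generated masks that plausibly cover the tissue sample.

    `masks` is assumed to follow the output format of SAM's automatic mask
    generator: a list of dicts with a boolean 'segmentation' array and a
    'predicted_iou' quality score. All thresholds are illustrative.
    """
    gray = image.mean(axis=2) if image.ndim == 3 else image
    h, w = gray.shape

    # 1) Discard low-confidence masks and masks covering tiny areas.
    candidates = [m for m in masks
                  if m["predicted_iou"] >= min_score
                  and m["segmentation"].sum() >= min_area_frac * h * w]

    # 2) Keep masks whose interior intensity variation is small compared with
    #    the variation over the whole cropped slide (sample vs. background),
    #    then 3) merge the selected masks into a single sample mask.
    global_var = gray.var()
    sample_mask = np.zeros((h, w), dtype=bool)
    for m in candidates:
        seg = m["segmentation"]
        if gray[seg].var() < var_ratio * global_var:
            sample_mask |= seg
    return sample_mask
```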
The processed sample mask was prepared for the scanner. Based on the FOV of our camera, we
calculated the number of scans required to cover the entire sample area in the lateral directions.
Subsequently, the sample mask was downsampled to create a low-resolution scanning mask that
aligns with the number of scanning positions needed (Fig. 1(a4)). This scanning mask avoided
excessive scanning during microscopic imaging, thereby optimizing the time required for image
acquisition and processing.
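As a minimal sketch of this downsampling step (the per-FOV block size and the rule that a tile is scanned whenever it contains any sample pixel are our assumptions), the scanning mask could be generated as follows:

```python
import numpy as np

def build_scanning_mask(sample_mask, fov_px):
    """Convert a full-resolution boolean sample mask into one flag per FOV.

    sample_mask: 2D boolean array from the sample-locating step
    fov_px:      (rows, cols) of sample-mask pixels covered by one camera FOV
    """
    h, w = sample_mask.shape
    n_rows = -(-h // fov_px[0])          # ceiling division
    n_cols = -(-w // fov_px[1])

    scan = np.zeros((n_rows, n_cols), dtype=bool)
    for r in range(n_rows):
        for c in range(n_cols):
            block = sample_mask[r * fov_px[0]:(r + 1) * fov_px[0],
                                c * fov_px[1]:(c + 1) * fov_px[1]]
            scan[r, c] = block.any()     # scan a tile if it contains sample
    return scan
```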
2.2. WSI-APIC setup and forward model
In our study, the WSI-APIC facilitated the analytical reconstruction of complex fields and enabled
robust aberration estimation and correction with a simple setup and minimal modifications to
a conventional brightfield microscope. As illustrated in Fig. 1(b), a programmable LED disk
was positioned in front of the sample to provide quasi-plane-wave illumination from various
oblique angles. The sample was then imaged using a standard $4f$ microscopy system including an objective lens, a tube lens, and a camera. The WSI-APIC system employed two types of LED illumination. First, LEDs with illumination angles matching the NA of the objective lens were
sequentially activated to produce NA-matching measurements. Second, LEDs with illumination
angles exceeding the objective lens’s receiving angle were lit for darkfield measurements, which
required a longer exposure time. All the measurements were subsequently used for reconstruction.
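For illustration, the split between NA-matching and darkfield LEDs can be expressed through each LED's illumination NA. The planar-disk geometry and tolerance in the sketch below are assumptions, not calibrated system parameters.

```python
import numpy as np

def classify_leds(led_xy, led_height, objective_na, tol=0.02):
    """Classify LEDs on a planar disk into NA-matching and darkfield groups.

    led_xy:       (n, 2) lateral LED positions relative to the optical axis
    led_height:   distance from the LED plane to the sample (same units)
    objective_na: numerical aperture of the objective lens
    """
    led_xy = np.asarray(led_xy, dtype=float)
    r = np.hypot(led_xy[:, 0], led_xy[:, 1])
    illum_na = np.sin(np.arctan2(r, led_height))   # illumination NA per LED

    na_matching = np.where(np.abs(illum_na - objective_na) <= tol)[0]
    darkfield = np.where(illum_na > objective_na + tol)[0]
    return na_matching, darkfield
```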
To effectively demonstrate the imaging system, we modeled the forward process from
illumination to image acquisition. A 2D thin sample was illuminated by a plane wave emitted by
the $i$-th LED ($i = 1, 2, \ldots, n$), corresponding to a transverse k-vector $\mathbf{k}_i$. The modulated sample spectrum $\hat{S}_i$ at the camera plane is given by
$$\hat{S}_i(\mathbf{k}) = \hat{O}(\mathbf{k} - \mathbf{k}_i)\,H(\mathbf{k}) = \hat{O}(\mathbf{k} - \mathbf{k}_i)\,\mathrm{Circ}_{NA}(\mathbf{k})\,e^{j\varphi(\mathbf{k})}, \qquad (1)$$
where $\hat{O}$ represents the sample's spectrum, $\mathbf{k}$ denotes the transverse spatial frequency coordinates, and $H$ is the coherent transfer function (CTF) of the imaging system. The CTF is a circular function $\mathrm{Circ}_{NA}$ with an NA-dependent radius and a phase term $\varphi$ depicting the system aberrations. Due to the Fourier transform property, changing the illumination angles shifts the CTF laterally, sampling different regions of the sample's spectrum. Finally, the $i$-th intensity measurement $I_i(\mathbf{x})$ obtained from the camera is
$$I_i(\mathbf{x}) = \left|\left[\mathcal{F}^{-1}\big(\hat{S}_i\big)\right](\mathbf{x})\right|^2 = |S_i(\mathbf{x})|^2, \qquad (2)$$
where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform operator.
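A compact numerical prototype of this forward model, following Eqs. (1) and (2), is sketched below; the grid, pixel size, wavelength, and aberration phase are placeholders chosen for illustration.

```python
import numpy as np

def simulate_measurement(obj, k_illum, na, wavelength, pixel_size, pupil_phase):
    """Simulate one intensity image following Eqs. (1) and (2).

    obj:         complex 2D sample transmission (n x n)
    k_illum:     (kx, ky) transverse spatial frequency of the i-th LED [1/um]
    na:          numerical aperture of the objective lens
    wavelength:  illumination wavelength [um]
    pixel_size:  sample-plane pixel size [um]
    pupil_phase: 2D aberration phase phi(k) sampled on the unshifted FFT grid
    """
    n = obj.shape[0]
    k = np.fft.fftfreq(n, d=pixel_size)                 # spatial frequencies [1/um]
    kx, ky = np.meshgrid(k, k, indexing="xy")

    # Oblique illumination: a tilted plane wave shifts the sample spectrum,
    # so the pupil sees O(k - k_i) as in Eq. (1).
    x = np.arange(n) * pixel_size
    xx, yy = np.meshgrid(x, x, indexing="xy")
    illum = np.exp(1j * 2 * np.pi * (k_illum[0] * xx + k_illum[1] * yy))
    spec = np.fft.fft2(obj * illum)

    # Coherent transfer function: circular NA support with an aberration phase.
    ctf = (np.hypot(kx, ky) <= na / wavelength) * np.exp(1j * pupil_phase)

    field = np.fft.ifft2(spec * ctf)                    # S_i(x)
    return np.abs(field) ** 2                           # I_i(x), Eq. (2)
```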
2.3. WSI-APIC reconstruction
The graphical schematic of the WSI-APIC reconstruction algorithm is shown in Fig. 2. The
algorithm comprises two main steps: brightfield reconstruction of the NA-matching measurements using the Kramers–Kronig relations [39–41], incorporating aberration correction, and spectrum extension using the darkfield measurements. Aberrations in the system were estimated during the
brightfield reconstruction phase and corrected for both NA-matching and darkfield spectra.
Based on the forward model of the WSI-APIC system, the spectrum of the sample under
a given NA-matching illumination angle at the camera plane was constrained by a circular
function support, with the zero-frequency (DC) component located at the edge of this support.
To facilitate the application of the directional Hilbert transform in brightfield reconstruction, it
was advantageous to shift the DC component to the center of the spatial frequency plane. This
shift simplified the reconstruction process.
According to the properties of the Fourier transform, shifting by $\mathbf{k}_i$ in Fourier space is equivalent to multiplying by a phase ramp $e^{-j2\pi \mathbf{k}_i \cdot \mathbf{x}}$ in the spatial domain. We introduced an important relation between the sample's spectrum at the camera plane, $\hat{S}_i(\mathbf{k})$, and the shifted spectrum $\hat{\tilde{S}}_i(\mathbf{k}) = \hat{S}_i(\mathbf{k} + \mathbf{k}_i)$:
$$\left|\mathcal{F}^{-1}\big(\hat{\tilde{S}}_i(\mathbf{k})\big)\right|^2 = \left|\mathcal{F}^{-1}\big(\hat{S}_i(\mathbf{k} + \mathbf{k}_i)\big)\right|^2 = \left|S_i(\mathbf{x})\,e^{-j2\pi \mathbf{k}_i \cdot \mathbf{x}}\right|^2 = |S_i(\mathbf{x})|^2 = I_i(\mathbf{x}). \qquad (3)$$
The intensity measurement $I_i(\mathbf{x})$ is invariant to the addition of a phase ramp. Hence, reconstructing $\tilde{S}_i(\mathbf{x}) = \mathcal{F}^{-1}\big(\hat{\tilde{S}}_i(\mathbf{k})\big)$ or $S_i(\mathbf{x}) = \mathcal{F}^{-1}\big(\hat{S}_i(\mathbf{k})\big)$ is equivalent when using $I_i(\mathbf{x})$.
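This invariance is straightforward to verify numerically; the toy example below uses a random complex field and an integer-bin spectrum shift as a stand-in for a general $\mathbf{k}_i$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128
field = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S_hat = np.fft.fft2(field)                       # spectrum at the camera plane

# Shift the spectrum by an integer number of frequency bins (a discrete k_i).
S_hat_shifted = np.roll(S_hat, shift=(5, -3), axis=(0, 1))

# The shift only multiplies the spatial field by a unit-modulus phase ramp,
# so the intensity of Eq. (3) is unchanged.
I_orig = np.abs(np.fft.ifft2(S_hat)) ** 2
I_shift = np.abs(np.fft.ifft2(S_hat_shifted)) ** 2
assert np.allclose(I_orig, I_shift)
```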
The first step in the WSI-APIC reconstruction process involved applying the Kramers–Kronig
relations, which analytically linked the real and imaginary parts of a complex function. These
relations could be used to reconstruct the complex field using measurements obtained under
illumination angles that exactly match the maximal acceptance angle of the objective lens
(NA-matching measurements).

Fig. 2. Reconstruction pipeline of the WSI-APIC system. It contains two major steps: brightfield reconstruction and darkfield reconstruction. In the brightfield reconstruction step, the Kramers–Kronig relations were employed to recover the corresponding spectra. Phase differences between overlapping spectra were then utilized to identify and correct the aberrations of the imaging system. The reconstructed spectra were corrected for aberrations and stitched together. In the darkfield reconstruction step, the known spectrum from the $i$-th darkfield measurement was used to isolate the cross-correlation component from the autocorrelation terms. By solving a linear equation involving the isolated cross-correlation, the unknown spectrum was analytically determined. We applied GPU acceleration to the WSI-APIC reconstruction algorithm; the green arrows indicate GPU-accelerated processes.

We constructed an auxiliary function by taking the logarithm of the shifted complex field $\tilde{S}_i(\mathbf{x})$ in a point-wise manner and adding a constant phase term:
$$\log\!\big[\tilde{S}_i(\mathbf{x})\,e^{-j\theta_i}\big] = \log\!\big[|\tilde{S}_i(\mathbf{x})|\big] + j\left\{\arg\!\big[\tilde{S}_i(\mathbf{x})\big] - \theta_i\right\}, \qquad (4)$$
where $\theta_i = \varphi(\mathbf{k}_i)$ is a constant phase offset defined by the pupil phase at $\mathbf{k} = \mathbf{k}_i$. As the intensity $I_i(\mathbf{x})$ is measured, the real part of the auxiliary function is simple to obtain. The imaginary part is computed using the directional Hilbert transform in conjunction with the intensity measurements:
$$G(\mathbf{k}) = \big[\mathcal{F}\big(\log\!\big[\tilde{S}_i(\mathbf{x})\,e^{-j\theta_i}\big]\big)\big](\mathbf{k}) = \begin{cases} \big[\mathcal{F}(\log I_i)\big](\mathbf{k}), & \mathbf{k}\cdot\mathbf{k}_i < 0 \\ 0.5\,\big[\mathcal{F}(\log I_i)\big](\mathbf{k}), & \mathbf{k}\cdot\mathbf{k}_i = 0 \\ 0, & \mathbf{k}\cdot\mathbf{k}_i > 0 \end{cases} \qquad (5)$$
where $\mathcal{F}$ denotes the Fourier transform operator. Then, the desired field with a constant phase offset can be restored using the inverse Fourier transform and the exponential function:
$$\tilde{S}_i(\mathbf{x})\,e^{-j\theta_i} = \exp\!\big[\mathcal{F}^{-1}\big(G(\mathbf{k})\big)\big]. \qquad (6)$$
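A minimal NumPy sketch of this brightfield step, implementing the half-plane selection of Eq. (5) and the exponentiation of Eq. (6), is given below. It assumes a strictly positive intensity image and a known illumination direction, and it omits the subsequent aberration correction.

```python
import numpy as np

def kk_reconstruct(intensity, k_illum_dir):
    """Recover the shifted field (up to the constant phase theta_i) from one
    NA-matching intensity image via Eqs. (5) and (6).

    intensity:    measured I_i(x), strictly positive 2D array
    k_illum_dir:  unit vector giving the direction of k_i in frequency space

    Minimal sketch: it assumes the shifted spectrum is confined to the
    half plane k . k_i <= 0, as required by the Kramers-Kronig argument.
    """
    n = intensity.shape[0]
    f = np.fft.fftfreq(n)
    kx, ky = np.meshgrid(f, f, indexing="xy")
    dot = kx * k_illum_dir[0] + ky * k_illum_dir[1]

    # Half-plane selection mask implementing the cases of Eq. (5).
    mask = np.where(dot < 0, 1.0, np.where(dot == 0, 0.5, 0.0))

    G = mask * np.fft.fft2(np.log(intensity))   # spectrum of log[S~_i e^{-j theta_i}]
    return np.exp(np.fft.ifft2(G))              # Eq. (6)
```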
The reconstructed complex field from the WSI-APIC system includes aberration terms. To
handle these aberrations, we worked in the spatial frequency domain and utilized the phases of
multiple reconstructed spectra. The spectrum of the constructed complex field is given by:
$$\hat{\tilde{S}}_i(\mathbf{k})\,e^{-j\theta_i} = \mathcal{F}\big(\tilde{S}_i(\mathbf{x})\,e^{-j\theta_i}\big) = \hat{A}(\mathbf{k})\,e^{j\hat{\alpha}(\mathbf{k})}\,\mathrm{Circ}_{NA}(\mathbf{k} + \mathbf{k}_i)\,e^{j\varphi(\mathbf{k} + \mathbf{k}_i) - j\theta_i}, \qquad (7)$$