ACM/IMS Journal of Data Science
https://doi.org/10.1145/3648506
Research Article
Physics-Informed Neural Operator for Learning Partial Differential Equations
ZONGYI LI∗, HONGKAI ZHENG∗, NIKOLA KOVACHKI, DAVID JIN, HAOXUAN CHEN, BURIGEDE LIU, KAMYAR AZIZZADENESHELI, and ANIMA ANANDKUMAR, Computing and Mathematical Sciences, California Institute of Technology, Pasadena, USA
In this paper, we propose physics-informed neural operators (PINO) that combine training data and physics constraints to learn the solution operator of a given family of parametric Partial Differential Equations (PDE). PINO is the first hybrid approach incorporating data and PDE constraints at different resolutions to learn the operator. Specifically, in PINO, we combine coarse-resolution training data with PDE constraints imposed at a higher resolution. The resulting PINO model can accurately approximate the ground-truth solution operator for many popular PDE families and shows no degradation in accuracy even under zero-shot super-resolution, i.e., being able to predict beyond the resolution of training data. PINO uses the Fourier neural operator (FNO) framework that is guaranteed to be a universal approximator for any continuous operator and discretization convergent in the limit of mesh refinement. By adding PDE constraints to FNO at a higher resolution, we obtain a high-fidelity reconstruction of the ground-truth operator. Moreover, PINO succeeds in settings where no training data is available and only PDE constraints are imposed, while previous approaches, such as the Physics-Informed Neural Network (PINN), fail due to optimization challenges, e.g., in multi-scale dynamic systems such as Kolmogorov flows.
CCS Concepts: • Computing methodologies → Machine learning.

Additional Key Words and Phrases: Neural Operators, Physics-Informed Learning, Partial Differential Equations
1 INTRODUCTION
Machine learning methods have recently shown promise in solving partial differential equations (PDEs) [1–5]. A recent breakthrough is the paradigm of operator learning for solving PDEs. Unlike standard neural networks that learn using inputs and outputs of fixed dimensions, neural operators learn operators, which are mappings between function spaces [1–3]. The class of neural operators is guaranteed to be a universal approximator for any continuous operator [1] and hence has the capacity to approximate any operator, including any solution operator of a given family of parametric PDEs. Note that the solution operator is the mapping from the input function (initial and boundary conditions) to the output solution function. Previous works show that neural operators can capture complex multi-scale dynamic processes and are significantly faster than numerical solvers [6–12].
Neural operators are proven to be discretization convergent [1], meaning they converge to a continuum operator in the limit of mesh refinement. Consequently, they can be evaluated at any data discretization or resolution at inference time without the need for retraining. For example, neural operators such as the Fourier neural operator (FNO) can extrapolate to frequencies that are not seen during
∗Both authors contributed equally to this research.
Authors' address: Zongyi Li; Hongkai Zheng; Nikola Kovachki; David Jin; Haoxuan Chen; Burigede Liu; Kamyar Azizzadenesheli; Anima Anandkumar, Computing and Mathematical Sciences, California Institute of Technology, Pasadena, USA.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM 2831-3194/2024/2-ART
https://doi.org/10.1145/3648506
Fig. 1. PINO uses both training data and the PDE loss function and perfectly extrapolates to unseen frequencies in Kolmogorov flows. FNO uses only training data and has no further information on higher frequencies, but still follows the general trend of the ground-truth spectrum. In contrast, a trained UNet with trilinear interpolation (NN+Interpolation) shows severe distortions at higher frequencies. Details in Section 4.2.
training in Kolmogorov flows, as shown in Figure 1, while standard approaches such as training a UNet and adding trilinear interpolation lead to significantly worse results at higher resolutions.
Even though FNO follows the general trend of the Kolmogorov flow in Figure 1, it cannot perfectly match it in the regime of super-resolution, i.e., beyond the frequencies seen during training. More generally, neural operators cannot perfectly approximate the ground-truth operator when only coarse-resolution training data is provided. This is a fundamental limitation of data-driven operator learning methods, which depend on the availability of training data that can come either from existing numerical solvers or from direct observations of the physical phenomena. In many scenarios, such data can be expensive to generate, unavailable, or available only as low-resolution observations [13]. This limits the ability of neural operators to learn high-fidelity models. Moreover, generalization of the learned neural operators to unseen scenarios and conditions that differ from the training data is challenging.
1.1 Our Approach and Contributions
In this paper, we remedy the above shortcomings of data-driven operator learning methods through the framework of physics-informed neural operators (PINO). Here, we take a hybrid approach of combining training data (when available) with a PDE loss function at a higher resolution. This allows us to approximate the solution operator of many PDE families nearly perfectly. While there have been many physics-informed approaches proposed previously (discussed in Section 1.2), ours is the first to incorporate PDE constraints at a higher resolution as a remedy for low-resolution training data. We show that this results in high-fidelity solution operator approximations. As shown in Figure 1, PINO extrapolates to unseen frequencies in Kolmogorov flows. Thus, we show that the PINO model learned from such multi-resolution hybrid loss functions has almost no degradation in accuracy
Fig. 2. PINO trains a neural operator with both training data and a PDE loss function. The figure shows the neural operator architecture with the lifting point-wise operator that receives the input function $a$ and outputs a function $v_0$ with a larger co-dimension. This operation is followed by $L$ blocks that compute linear integral operators followed by non-linearities, the last of which outputs the function $v_L$. The pointwise projection operator projects $v_L$ to the output function $u$. Both $v_L$ and $u$ are functions, and all their derivatives ($Dv_L$, $Du$) can be computed in their exact forms at any query point $x$.
even on high-resolution test instances when only low-resolution training data is available. Further, our PINO approach also overcomes the optimization challenges in approaches such as the Physics-Informed Neural Network (PINN) [14] that are purely based on PDE loss functions and do not use training data; thus, PINO can solve more challenging problems such as time-dependent PDEs.
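To make the multi-resolution hybrid loss concrete, the following is a minimal PyTorch-style sketch (not the paper's implementation) of combining a data misfit on coarse-resolution pairs with a PDE residual imposed on a finer grid; `model`, `pde_residual`, and the loss weights are illustrative placeholders.

```python
import torch

def pino_loss(model, a_coarse, u_coarse, a_fine, pde_residual,
              w_data=1.0, w_pde=1.0):
    """Hybrid PINO-style loss: data misfit at coarse resolution plus
    an equation residual imposed at a higher resolution.

    a_coarse, u_coarse : coarse-grid input/solution pairs, e.g. (B, s, s)
    a_fine             : the same inputs sampled on a finer grid (B, S, S)
    pde_residual       : callable returning the PDE residual field
    """
    # Data term: compare predictions with the available coarse data.
    u_pred_coarse = model(a_coarse)
    data_loss = torch.mean((u_pred_coarse - u_coarse) ** 2)

    # Physics term: query the same operator on the fine grid (neural
    # operators are discretization convergent, so no architectural
    # change is needed) and penalize equation violation there.
    # No fine-resolution labels are required.
    u_pred_fine = model(a_fine)
    pde_loss = torch.mean(pde_residual(a_fine, u_pred_fine) ** 2)

    return w_data * data_loss + w_pde * pde_loss
```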
PINO utilizes both the data and equation loss functions (whichever are available) for operator learning. To further improve accuracy at test time, we fine-tune the learned operator on the given PDE instance using only the equation loss function. This allows us to provide a nearly-zero error for the given PDE instance at all resolutions. A schematic of PINO is shown in Figure 2, where the neural operator architecture is based on the Fourier neural operator (FNO) [2]. The derivatives needed for the equation loss in PINO are computed explicitly through the operator layers in function spaces. In particular, we efficiently compute the explicit gradients on function space through Fourier-space computations. In contrast, previous auto-differentiation methods must compute the derivatives at sampling locations.
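As an illustration of derivatives computed through Fourier-space operations rather than pointwise auto-differentiation, here is a minimal 1-D sketch: for a periodic function sampled on a uniform grid, multiplying the Fourier coefficients by $ik$ yields the spectrally exact derivative. The helper name and grid are illustrative only.

```python
import torch

def fourier_derivative_1d(u, length=2 * torch.pi):
    """Spectral derivative of a periodic function on a uniform 1-D grid:
    differentiate by multiplying with i*k in Fourier space."""
    n = u.size(-1)
    u_hat = torch.fft.rfft(u)
    k = 2 * torch.pi / length * torch.arange(u_hat.size(-1))
    return torch.fft.irfft(1j * k * u_hat, n=n)

# Sanity check on u(x) = sin(x): the derivative should be cos(x).
x = torch.linspace(0, 2 * torch.pi, 129)[:-1]  # uniform periodic grid
du = fourier_derivative_1d(torch.sin(x))
print(torch.max(torch.abs(du - torch.cos(x))))  # ~1e-6 (float32 roundoff)
```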
The PDE loss function added to PINO vastly improves generalization and physical validity in operator learning compared to purely data-driven methods. PINO requires little to no training data and generalizes better compared to the data-driven FNO [15], especially on high-resolution test instances. On average, the relative error is 7% lower on both transient and Kolmogorov flows, while matching the speedup of the data-trained FNO architecture (400x) compared to the GPU-based pseudo-spectral solver [16]. Further, the PINO model on the Navier-Stokes equation can be easily transferred to different Reynolds numbers ranging from 100 to 500 using instance-wise fine-tuning.
We also use PINO for solving inverse problems by either: (1) learning the forward solution operator and using gradient-based optimization to obtain the inverse solution, or (2) learning the inverse solution operator directly. Imposing the PDE loss guarantees the inverse solution is physically valid in both approaches. We find that of these two approaches, the latter is more accurate for recovering the coefficient function in Darcy flow. We show this approach is 3000x faster than conventional solvers using accelerated Markov Chain Monte Carlo (MCMC) [17].
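A minimal sketch of approach (1), gradient-based inversion through a frozen forward operator; `forward_op` stands for a pre-trained PINO model and `pde_residual` for the residual of the governing equation, both placeholders rather than the paper's exact code.

```python
import torch

def invert_with_operator(forward_op, u_obs, a_init, pde_residual,
                         steps=500, lr=1e-2, w_pde=1.0):
    """Recover the input function (e.g., a Darcy coefficient field) by
    gradient descent through a frozen, pre-trained forward solution
    operator, with a PDE residual as physics regularizer."""
    a = a_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([a], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        u_pred = forward_op(a)
        misfit = torch.mean((u_pred - u_obs) ** 2)          # match data
        physics = torch.mean(pde_residual(a, u_pred) ** 2)  # stay physical
        loss = misfit + w_pde * physics
        loss.backward()
        opt.step()
    return a.detach()
```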
1.2 Related Work
Learning solutions to PDEs has been proposed under two paradigms: (i) data-driven learning and (ii) physics-informed optimization. The former utilizes data from existing solvers or experiments, while the latter is purely based on PDE constraints. An example of data-driven methods is the class of neural operators for learning the solution operator of a given family of parametric PDEs. An example of the physics-based approach is the Physics-Informed Neural Network (PINN) for optimizing the PDE constraints to obtain the solution function of a given PDE instance. Both of these approaches have shortcomings. Neural operators require data, and when data is limited or not available, they are unable to learn the solution operator faithfully. PINN, on the other hand, does not require data but is prone to failure, especially on multi-scale dynamic systems, due to optimization challenges. Previous work by [18] has shown promise in learning a discretized solution map with a variational loss; in this work, we generalize it to operator learning.
Neural operators learn the solution operator of a family of PDEs, defined by the map from the input (initial and boundary conditions) to the output (solution functions). In this case, usually, a dataset of input-output pairs from an existing solver or real-world observation is given. There are two main aspects to consider: (a) models: the design of models for learning highly complicated PDE solution operators, and (b) data: minimizing data requirements and improving generalization. Recent advances in operator learning replace traditional convolutional neural networks and U-Nets from computer vision with operator-based models tailored to PDEs with greatly improved model expressiveness [4, 15, 19–21]. Specifically, the neural operator generalizes the neural network to the operator setting where the input and output spaces are infinite-dimensional. The framework has successfully approximated solution operators for highly non-linear problems such as turbulent flows [2, 3]. However, the data challenges remain. In particular, (1) training data from an existing solver or an experimental setup is costly to obtain, (2) models struggle to generalize away from the training distribution, and (3) constructing the most efficient approximation for given data remains elusive. Moreover, it is also evident that in many real-world applications, observational data is often available only at low resolutions [13], limiting the model's ability to generalize.
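For context on such operator-based models, the following is a simplified 1-D sketch of the kind of spectral layer used in Fourier-style neural operators: channel mixing on a truncated set of Fourier modes, which acts on functions and is independent of the sampling resolution. The class name, shapes, and initialization are illustrative, not a reference implementation.

```python
import torch

class SpectralConv1d(torch.nn.Module):
    """One Fourier-layer sketch: learn a complex weight per retained
    Fourier mode, so the layer operates on functions and applies
    unchanged at any grid resolution."""
    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / channels
        self.weight = torch.nn.Parameter(
            scale * torch.randn(channels, channels, modes,
                                dtype=torch.cfloat))

    def forward(self, v):              # v: (batch, channels, n_grid)
        v_hat = torch.fft.rfft(v)      # go to Fourier space
        out_hat = torch.zeros_like(v_hat)
        # Mix channels mode-by-mode on the lowest `modes` frequencies;
        # higher frequencies are truncated.
        out_hat[..., :self.modes] = torch.einsum(
            "bim,iom->bom", v_hat[..., :self.modes], self.weight)
        return torch.fft.irfft(out_hat, n=v.size(-1))
```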
Alternatives to data-driven approaches for solving PDEs are physics-based approaches that require no training data. A popular framework known as the Physics-Informed Neural Network (PINN) [14] uses optimization to find the solution function of a given PDE instance. PINN uses a neural network as the ansatz of the solution function and optimizes a loss function to minimize the violation of the given equation, taking advantage of auto-differentiation to compute exact derivatives. PINN overcomes the need to choose a discretization grid that most numerical solvers require, e.g., finite difference methods (FDM) and finite element methods (FEM). It has shown promise in solving PDEs for a wide range of applications, including higher-dimensional problems [22–25]. Recently, researchers have developed many variations of PINN with promising results for solving inverse problems and partially observed tasks [26–28].
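For contrast with the operator setting, here is a minimal PINN-style sketch for a single instance: an MLP ansatz $u_\theta(x)$ whose equation residual is computed with auto-differentiation at randomly sampled collocation points. The concrete equation (1-D Poisson, $-u'' = f$ with zero boundary values) and network size are chosen only for illustration.

```python
import torch

# MLP ansatz for the solution of a single PDE instance.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1))

f = lambda x: torch.pi ** 2 * torch.sin(torch.pi * x)  # so u(x)=sin(pi x)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    opt.zero_grad()
    x = torch.rand(256, 1, requires_grad=True)         # collocation points
    u = net(x)
    # Exact derivatives at the sample points via auto-differentiation.
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = -d2u - f(x)                             # -u'' = f on (0, 1)
    xb = torch.tensor([[0.0], [1.0]])
    loss = (residual ** 2).mean() + (net(xb) ** 2).mean()  # u(0)=u(1)=0
    loss.backward()
    opt.step()
```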
However, PINN fails in many multi-scale dynamic PDE systems [29–31] due to two main reasons, viz., (1) the challenging optimization landscape of the PDE constraints [32] and its sensitivity to hyper-parameter selection [33], and (2) the difficulty of propagating information from the initial or boundary conditions to unseen parts of the interior or to future times [34]. Moreover, PINN only learns the solution function of a single PDE instance and cannot generalize to other instances without re-optimization. Previous work on the physics-informed DeepONet, which imposes PDE losses on DeepONet [35], overcomes this limitation and can learn across multiple PDE instances. While the PDE loss can be computed at any query points, the input is limited to a fixed grid in the standard DeepONet [1], and its architecture is a linear method of approximation [36]. Our work overcomes all previously mentioned limitations. Further, a unique feature that PINO enjoys over other hybrid learning methods [27, 37, 38] is its ability to incorporate data and PDE loss functions at different resolutions. This has not been attempted before,
Fig. 3. Solving one specific instance versus learning the entire solution operator. Left: numerical solvers and PINNs focus on solving one specific instance. Right: neural operators learn the solution operator for a family of equations.
and none of the previous works focus on extrapolation to higher resolutions, beyond what is seen in training
data.
2 PRELIMINARIES AND PROBLEM SETTINGS
In this section, we first define the stationary and dynamic PDE systems that we consider. We give an overview of the physics-informed setting and the operator-learning setting. Finally, we define the Fourier neural operator as a specific model for operator learning.
2.1 Problem settings
We consider two natural classes of PDEs. In the first, we consider the stationary system

$$\mathcal{P}(u, a) = 0, \quad \text{in } D \subset \mathbb{R}^d, \qquad u = g, \quad \text{in } \partial D, \tag{1}$$

where $D$ is a bounded domain, $a \in \mathcal{A} \subseteq \mathcal{V}$ is a PDE coefficient/parameter, $u \in \mathcal{U}$ is the unknown, and $\mathcal{P} : \mathcal{U} \times \mathcal{A} \to \mathcal{F}$ is a possibly non-linear partial differential operator, with $(\mathcal{U}, \mathcal{V}, \mathcal{F})$ a triplet of Banach spaces. Usually, the function $g$ is a fixed boundary condition but can also potentially enter as a parameter. This formulation gives rise to the solution operator $\mathcal{G}^\dagger : \mathcal{A} \to \mathcal{U}$ defined by $a \mapsto u$. A prototypical example is the second-order elliptic equation $\mathcal{P}(u, a) = -\nabla \cdot (a \nabla u) + f$.
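As a concrete instance of the stationary setting, here is a sketch (under simplifying assumptions: periodic boundary, uniform grid) of the residual of the elliptic operator above, $-\nabla \cdot (a \nabla u) + f$, evaluated with spectral derivatives; such a residual field is what an equation loss would penalize.

```python
import torch

def elliptic_residual(a, u, f, length=1.0):
    """Residual of the stationary system (1) with
    P(u, a) = -div(a grad u) + f, on a periodic [0, L]^2 grid,
    using spectral (Fourier) derivatives. Periodicity and the
    uniform grid are simplifying assumptions for this sketch."""
    n = u.size(-1)
    k = 2 * torch.pi / length * torch.fft.fftfreq(n, d=1.0 / n)
    kx, ky = torch.meshgrid(k, k, indexing="ij")

    def dx(w):  # spectral d/dx
        return torch.fft.ifft2(1j * kx * torch.fft.fft2(w)).real

    def dy(w):  # spectral d/dy
        return torch.fft.ifft2(1j * ky * torch.fft.fft2(w)).real

    flux_x, flux_y = a * dx(u), a * dy(u)        # a * grad(u)
    return -(dx(flux_x) + dy(flux_y)) + f        # -div(a grad u) + f
```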
In the second setting, we consider the dynamical system

$$\frac{du}{dt} = \mathcal{R}(u), \quad \text{in } D \times (0, \infty), \qquad u = g, \quad \text{in } \partial D \times (0, \infty), \qquad u = a, \quad \text{in } \bar{D} \times \{0\}, \tag{2}$$

where $a = u(0) \in \mathcal{A} \subseteq \mathcal{V}$ is the initial condition, $u(t) \in \mathcal{U}$ for $t > 0$ is the unknown, and $\mathcal{R}$ is a possibly non-linear partial differential operator, with $\mathcal{U}$ and $\mathcal{V}$ Banach spaces. As before, we take $g$ to be a known boundary condition. We assume that $u$ exists and is bounded for all time and for every $u_0 \in \mathcal{U}$. This formulation gives rise to the solution operator $\mathcal{G}^\dagger : \mathcal{A} \to C\big((0, \infty); \mathcal{U}\big)$ defined by $a \mapsto u$.
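For the dynamic setting (2), an equation loss penalizes $du/dt - \mathcal{R}(u)$ along a predicted trajectory. As an illustration (not the paper's exact setup), the following sketch computes this residual for 1-D viscous Burgers, $\mathcal{R}(u) = -u u_x + \nu u_{xx}$, with spectral spatial derivatives and a central finite difference in time.

```python
import torch

def burgers_residual(u, nu, dt, length=2 * torch.pi):
    """Residual of equation (2) for 1-D viscous Burgers,
    R(u) = -u u_x + nu * u_xx, with u of shape (timesteps, n_grid)
    on a periodic grid. Spatial derivatives are spectral; the time
    derivative uses central finite differences along the trajectory."""
    n = u.size(-1)
    k = 2 * torch.pi / length * torch.fft.fftfreq(n, d=1.0 / n)
    u_hat = torch.fft.fft(u)
    u_x = torch.fft.ifft(1j * k * u_hat).real
    u_xx = torch.fft.ifft(-(k ** 2) * u_hat).real
    u_t = (u[2:] - u[:-2]) / (2 * dt)   # central difference in time
    # du/dt - R(u), evaluated at the interior time steps.
    return u_t + (u * u_x - nu * u_xx)[1:-1]
```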