of 30
U-FNO - an enhanced Fourier neural operator
based-deep learning model for multiphase flow
Gege Wen
a
, Zongyi Li
b
, Kamyar Azizzadenesheli
c
, Anima Anandkumar
b
,
Sally M. Benson
a
a
Energy Resources Engineering, Stanford University, 367 Panama
St, Stanford, 94305, CA, USA
b
Computing and Mathematical Sciences, California Institute of Technology, 1200 E.
California Blvd., MC 305-16, Pasadena, 91125, CA, USA
c
Department of Computer Science, Purdue University, 305 N University St, West
Lafayette, 47907, IN, USA
Abstract
Numerical simulation of multiphase flow in porous media is essential for
many geoscience applications. Data-driven machine learning methods pro-
vide faster alternatives to traditional simulators by training neural network
models with numerical simulation data mappings. Here we present U-FNO, a
novel neural network architecture for solving multiphase flow problems with
superior speed, accuracy, and data efficiency. U-FNO is designed based on
the newly proposed Fourier neural operator (FNO) that learns an infinite
dimensional integral kernel in the Fourier space, which has shown excellent
performance for single-phase flows. Here we extend the FNO-based archi-
tecture to a CO
2
-water multiphase problem, and proposes the U-FNO ar-
chitecture to enhance the prediction accuracy in multiphase flow systems.
Through a systematic comparison among a CNN benchmark and three types
of FNO variations, we show that the U-FNO architecture has the advantages
of both the traditional CNN and original FNO, providing significantly more
accurate and efficient performance than previous architectures. The trained
U-FNO predicts gas saturation and pressure buildup with a 6
×
10
4
times
speed-up compared to traditional numerical simulators while maintaining
similar accuracy. The trained models can act as a general-purpose simulator
alternative for 2D-radial CO
2
injection problems with wide ranges of perme-
ability and porosity heterogeneity, anisotropy, reservoir conditions, injection
configurations, flow rates, and multiphase flow properties.
Preprint submitted to Elsevier
September 21, 2021
arXiv:2109.03697v2 [physics.geo-ph] 17 Sep 2021
Keywords:
Multiphase flow, Fourier neural operator, Convolutional neural
network, Carbon capture and storage, Deep learning
1. Introduction
Multiphase flow in porous media is important for many geoscience ap-
plications, including contaminant transport [1], carbon capture and storage
(CCS) [2], hydrogen storage [3], oil and gas extraction [4], and nuclear waste
storage [5]. Due to the multi-physics, non-linear, and multi-scale nature of
these processes, numerical simulation is the primary approach used to solve
mass and energy conservation equations for these applications [6]. These
numerical simulations are often very time consuming and computationally
intensive since they require fine spatial and temporal discretization to accu-
rately capture the flow processes [7, 8]. Meanwhile, the inherent uncertainty
in property distributions of heterogeneous porous medium necessitates prob-
abilistic assessments and inverse modeling to aid engineering decisions [9, 10].
Both of these procedures require large numbers of forward numerical simu-
lation runs and are often prohibitively expensive [11].
A number of machine learning-based methods have been proposed over
the past few years to provide faster alternatives to numerical simulation.
Most existing machine learning-based methods can be categorized into the
following two categories: (1) data-driven finite-dimensional operators that
learn Euclidean space mappings from numerical simulation data [12, 13, 14,
15, 16, 17], and (2) physics-informed/ physics-constrained/neural finite dif-
ference learning methods that parameterize the solution functions with a neu-
ral network [18, 19, 20]. The first type, finite-dimensional operators, is often
implemented with convolutional neural networks (CNN). These CNN-based
models have been successful in providing fast and accurate predictions for
high-dimensional and complex multiphase flow problems [16, 21, 17, 22, 23].
However, CNN-based methods are prone to overfitting, therefore requiring
large numerical simulation data sets that can be unmanageable as the prob-
lem dimension grows. Also, the results produced by these models are tied
to the specific spatial and temporal meshes used in the numerical simulation
data set. The second approach using neural finite difference methods requires
separate trainings for any new instance of the parameters or coefficients [18]
(
e.g.
, new permeability map or injection rate). Therefore, these methods
require as much computational effort as traditional numerical solvers, if not
2
more. Furthermore, these methods often struggle with multiphase flow prob-
lems with shock fronts, which are common for many classes of geoscience
problems [24, 25].
Recently, a novel approach, the
neural operator
, has been proposed that
directly learns the infinite-dimensional-mapping from any functional para-
metric dependence to the solution [26, 27, 28, 29]. Unlike neural finite dif-
ference methods, neural operators are data-driven therefore require training
only once. Meanwhile, neural operators are mesh-independent, so they can be
trained and evaluated on different grids. Due to the cost of evaluating global
neural integral operators, previously proposed neural operators have not yet
been able to achieve the desirable degree of computational efficiency [30].
However, one type of neural operator, the Fourier neural operator (FNO),
alleviates this issue through the implementation of a Fast Fourier Trans-
form [30]. The FNO has shown excellent performance on single-phase flow
problems with great generalization ability, and is significantly more data ef-
ficient than CNN-based methods [30].
Here we extend the FNO-based architecture to multiphase flow prob-
lems. We find that while FNO’s testing accuracy is generally higher than
CNN-based models, the training accuracy is sometimes lower due to the reg-
ularization effect of the FNO architecture. To improve upon this, we present
an enhanced Fourier neural operator, named U-FNO, that combines the ad-
vantages of FNO-based and CNN-based models to provide results that are
both highly accurate and data efficient. Through the implementation of the
newly proposed U-Fourier layer, we show that the U-FNO model architec-
ture produces superior performance over both the original FNO [30] and a
state-of-the-art CNN benchmark [17]. We apply the U-FNO architecture to
the highly complex CO
2
-and-water multiphase flow problem in the context
of CO
2
geological storage to predict dynamic pressure buildup and gas sat-
uration. The trained U-FNO models provide an alternative to numerical
simulation for 2D-radial CO
2
injection problems with wide ranges of perme-
ability and porosity heterogeneity, anisotropy, reservoir conditions, injection
configurations, flow rates, and multiphase flow properties.
2. Methods
The goal of a neural operator is to learn an infinite-dimensional-space
mapping from a finite collection of input-output observations. To formulate
the problem, we define the domain
D
R
d
be a bounded and open set;
3
A
be the input function space;
Z
be the output function space.
A
and
Z
are separable Banach spaces of functions defined on
D
that takes values in
R
d
a
and
R
d
z
respectively.
G
:
A → Z
is a non-linear map that satisfy
the governing PDEs. Suppose we have
a
j
that are drawn from probability
measure
μ
in
A
, then
z
j
=
G
(
a
j
). We aim to build an operator
G
θ
that
learns an approximation of
G
by minimizing the following problem using a
cost function
C
.
min
θ
E
a
μ
[
C
(
G
θ
(
a
)
,
G
(
a
))]
(1)
Since
a
j
∈ A
and
z
j
∈ Z
are both functions, we use
n
-point discretiza-
tion
D
j
=
{
x
1
,...,x
n
} ⊂
D
to numerically represent
a
(
x
)
j
|
D
j
R
n
×
d
a
and
z
(
x
)
j
|
D
j
R
n
×
d
z
. We demonstrate in this section that the proposed U-
FNO architecture learns the infinite-dimensional-space mapping
G
θ
from a
finite collections of
a
(
x
)
j
and
z
(
x
)
j
pairs. A table of notation is included in
Appendix A.
2.1. U-FNO architecture
The U-FNO architecture contains the following three steps:
1. Lift input observation
a
(
x
) to a higher dimensional space
v
l
0
(
x
) =
P
(
a
(
x
)) through a fully connected neural network transformation
P
.
2. Apply iterative Fourier layers followed by iterative U-Fourier layers:
v
l
0
7→
...
7→
v
l
L
7→
v
m
0
7→
...
7→
v
m
M
where
v
l
j
for
j
= 0
,
1
,...,L
and
v
m
k
for
k
= 0
,
1
,...,M
1 are sequences of functions taking values in
R
c
for channel dimension
c
.
3. Project
v
m
M
back to the original space
z
(
x
) =
Q
(
v
m
M
(
x
)) using a fully
connected neural network transformation
Q
.
Figure 1A provides a schematic of the U-FNO architecture. Within each
newly proposed U-Fourier layer, we have
v
m
k
+1
(
x
) :=
σ
(
(
K
v
m
k
)
(
x
) +
(
U
v
m
k
)
(
x
) +
W
(
v
m
k
(
x
))
)
,
x
D
(2)
where
K
is a kernel integral transformation parameterized by a neural net-
work,
U
is a U-Net CNN operator, and
W
is a linear operator, which are all
learnable.
σ
is a non-linear activation function. Refer to Li et al. [30] for the
formulation of the original Fourier layer.
4
Figure 1: A. U-FNO model architecture.
a
(
x
) is the input,
P
and
Q
are fully connected
neural networks, and
z
(
x
) is the output. B. Inside the Fourier layer,
F
denotes the Fourier
transform,
R
is the parameterization in Fourier space,
F
1
is the inverse Fourier transform,
W
is a linear bias term, and
σ
is the activation function. C. Inside the U-FNO layer,
U
denotes a two step U-Net, the other notations have identical meaning as in the Fourier
layer.
2.2. Integral kernel operator in the Fourier space
The integral kernel operator in Equation 2 is defined by
(
K
(
v
l
)
)
(
x
) =
D
κ
(
x,y
)
v
l
(
y
)d
v
l
(
y
)
,
x
D.
(3)
To efficiently parameterize kernel
κ
, the FNO method considers the rep-
resentation
v
l
(and also
v
m
) in the Fourier space and utilizes Fast Fourier
Transform (FFT) [30]. By letting
κ
(
x,y
) =
κ
(
x
y
) in Equation 3 and
applying the convolution theorem, we can obtain
(
K
(
v
l
)
)
(
x
) =
F
1
(
F
(
κ
)
·F
(
v
l
)
)
(
x
)
,
x
D
(4)
where
F
denotes a Fourier transform of a function
f
:
D
R
c
and
F
1
is
its inverse. Now, we can parameterize
κ
directly by its Fourier coefficients:
(
K
(
v
l
)
)
(
x
) =
F
1
(
R
·F
(
v
l
)
)
(
x
)
,
x
D.
(5)
5
where
R
is the Fourier transform of a periodic function
κ
. Since we assume
that
κ
is periodic, we can apply a Fourier series expansion and work in the
discrete modes of Fourier transform.
We first truncate the Fourier series at a maximum number of modes
k
max
,
and then parameterize
R
directly as a complex valued (
k
max
×
c
×
c
)-tensor
with the truncated Fourier coefficients. As a result, multiplication by the
learnable weight tensor
R
is
(
R
·F
(
v
l
)
)
k,i
=
c
j
=1
R
k,i,j
(
F
(
v
l
))
k,j
,
k
= 1
,...,k
max
, i
= 1
,...,c.
(6)
By replacing the
F
by the FFT and implementing
R
using a direct linear
parameterization, we have obtained the Fourier operator as illustrated in
Figure 1B and C with nearly linear complexity.
2.3. Characteristics of the U-Fourier layer
In contrast to the original Fourier layer in FNO [30], the U-FNO architec-
ture proposed here appends a U-Net path in each U-Fourier layer. The U-Net
processes local convolution to enrich the representation power of the U-FNO
in higher frequencies information. The number of Fourier and U-Fourier lay-
ers,
L
and
M
, are hyperparameters that can be optimized for the specific
problem. For the multi-phase flow problem considered here, we found that
the architecture with half Fourier layers and half U-Fourier layers achieves
the best performance, compared to architectures with all Fourier layers or
all U-Fourier layers.
Note that although the Fourier neural operator is an infinite-dimensional-
operator, when we append the U-Net block, we sacrifice the flexibility of
training and testing the model at different discretizations. We made this
choice because the CO
2
-water multiphase flow problem is very sensitive to
numerical dispersion and numerical dissolution, which are both tied to a spe-
cific grid resolution. When training and testing at different grid dimensions,
the numerical noise is often transformed in a nonphysical way. As a result,
for this problem, we prioritize achieving higher training and testing accuracy,
which the U-FNO provides.
6
3. Problem setting
3.1. Governing equation
We consider a multi-phase flow problem with CO
2
and water in the con-
text of geological storage of CO
2
. The CO
2
and water are immiscible but have
mutual solubility. The general forms of mass accumulations for component
η
=
CO
2
or
water
are written as [31]:
(
φ
p
S
p
ρ
p
X
CO
2
p
)
∂t
=
−∇·
[
F
CO
2
|
adv
+
F
CO
2
|
dif
]
+
q
CO
2
(7)
(
φ
p
S
p
ρ
p
X
water
p
)
∂t
=
−∇·
[
F
water
|
adv
+
F
water
|
dif
]
.
(8)
Here
p
denotes the phase of
w
(wetting) or
n
(non-wetting). In the siliciclastic
rocks present at most geological storage sites, water is the wetting phase [32].
However, due to the mutual solubility of water and CO
2
, there is a small
amount of CO
2
in the water phase and a small amount of water in the CO
2
phase. Here
φ
is the porosity,
S
p
is the saturation of phase
p
, and
X
η
p
is the
mass fraction of component
η
in phase
p
.
For both components, the advective mass flux
F
η
|
adv
is obtained by sum-
ming over phases
p
,
F
η
|
adv
=
p
X
η
F
p
=
p
X
η
(
k
k
r,p
ρ
p
μ
p
(
P
p
ρ
p
g
)
)
(9)
where each individual phase flux
F
p
is governed by the multiphase flow exten-
sion of Darcy’s law.
k
denotes the absolute permeability,
k
r,p
is the relative
permeability of phase
p
that non-linearly depends on
S
p
,
μ
p
is the viscosity
of phase
p
that depends on
P
p
, and
g
is the gravitational acceleration.
Due to the effect of capillarity, the fluid pressure
P
p
of each phase is
P
n
=
P
w
+
P
c
(10)
P
w
=
P
w
(11)
where the capillary pressure
P
c
is a non-linear function of
S
p
. Additionally,
porosity
φ
, density
ρ
p
, and the solubility of
CO
2
in Equation 7 and Equation 8
are also non-linear functions that depend on
P
p
.
7
To simplify the problem setting, our simulation does not explicitly in-
clude molecular diffusion and hydrodynamic dispersion. However some un-
avoidable numerical dispersion resulting from approximating spatial gradi-
ents using the two-point upstream algorithm [33] is intrinsic to the numerical
simulations used for the neural network training.
3.2. Numerical simulation setting
We use the numerical simulator ECLIPSE (e300) to develop the multi-
phase flow data set for CO
2
geological storage. ECLIPSE is a full physics
simulator that uses the finite difference method with upstream weighting
for spatial discretization and the adaptive IMplicit method for temporal dis-
cretization [33]. We inject super-critical CO
2
at a constant rate into a radially
symmetrical system
x
(
r,z
) through a vertical injection well with a radius of
0.1 m. The well can be perforated over the entire thickness of the reservoir
or limited to a selected depth interval. We simulate CO
2
injection for 30
years at a constant rate ranging from 0.2 to 2 Mt/year. The thickness of
the reservoir ranges from 12 to 200 m with no-flow boundaries on the top
and bottom. We use a vertical cell dimension of 2.08 m to capture the ver-
tical heterogeneity of the reservoir. The radius of the reservoir is 100,000 m.
The outer boundary is closed, but is sufficiently distant from the injection
well that it behaves like an infinite acting reservoir. Two hundred gradually
coarsened grid cells are used in the radial direction. Grid sensitivity studies
show that this grid is sufficiently refined to capture the CO
2
plume migration
and pressure buildup, while remaining computationally tractable [8]. Simu-
lated values of the gas saturation (
SG
) and pressure buildup (
dP
) fields at
24 gradually coarsening time snapshots are used for training the neural nets.
Refer to Appendix B for detailed spatial and temporal discretizations.
3.3. Variable sampling scheme
We sample two types of variables for each numerical simulation case: field
variables and scalar variables. As shown in Figure 2, field variables include
the horizontal permeability map (
k
x
), vertical permeability map (
k
y
), poros-
ity map (
φ
), and injection perforation map (
perf
). The reservoir thickness
b
is randomly sampled in each simulation case and controls the reservoir
dimension in each of the following field variables:
k
x
: The Stanford Geostatistical Modeling Software (SGeMS) [34] is
used to generate the heterogeneous
k
x
maps. SGeMS produces perme-
ability map according to required input parameters such as correlation
8
Figure 2: Example of mapping between A. input to B. output gas saturation and C.
pressure buildup. A. Field and scalar channels for each case. Note that the scalar variables
are broadcasted into a channel at the same dimension as the field channels. B. Gas
saturation evolution for 6 out of 24 time snapshots. C. Pressure buildup evolution for 6
out of 24 time snapshots.
lengths in the vertical and radial directions, medium appearances (Ap-
pendix C), as well as permeability mean and standard deviation. A
wide variety of permeability maps representing different depositional
environments are included in the data set and the permeability value
ranges widely from 10 Darcy to 0.001 mD. Appendix C summarizes
statistical parameters that characterize the permeability maps.
k
y
: The vertical permeability map is calculated by multiplying the
k
x
map by the anisotropy map. To generate the anisotropy map, values of
k
x
are binned into
n
aniso
materials, each of which is assigned a randomly
sampled anisotropy ratio. Note that the anisotropy ratio is uncorrelated
with the magnitude of the radial permeability. This procedure roughly
mimics a facies-based approach for assigning anisotropy values.
φ
: Previous studies show that porosity and permeability are loosely
correlated with each other [35]. Therefore, to calculate porosity we
first use the fitting relationship presented in Pape et al [35] and then
perturb these values with a random Gaussian noise

with mean value
of zero and standard deviation of 0.005.
9
Table 1: Summary of input variable’s type, sampling range, distribution, and unit. All
input sampling are independent with the exception of porosity and vertical permeability
map. *: refer to Appendix C for a detailed statistical parameter summary for generating
heterogeneous
k
x
map.
variable type
sampling parameter
notation
distribution
unit
field
horizontal permeability field
k
x
heterogeneous*
-
# of anisotropic materials
n
aniso
X
∼U{
1
,
6
}
-
material anisotropy ratio
k
x
/k
y
X
∼U
[1
,
150]
-
porosity (perturbation)
φ

∼N
(0
,
0
.
005)
-
reservoir thickness
b
X
∼U
[12
,
200]
m
perforation thickness
b
perf
X
∼U
[12
,b
]
m
perforation location
-
randomly placed
-
scalar
injection rate
Q
X
∼U
[0
.
2
,
2]
MT/y
initial pressure
P
init
X
∼U
[100
,
300]
bar
iso-thermal reservior temperature
T
X
∼U
[35
,
170]
C
irreducible water saturation
S
wi
X
∼U
[0
.
1
,
0
.
3]
-
van Genuchten scaling factor
λ
X
∼U
[0
.
3
,
0
.
7]
-
perf
: The injection interval thickness
b
perf
is randomly sampled within
the range from 12 m to the specific reservoir thickness
b
of that case.
We placed the perforation interval on the injection well, by randomly
sampling the depth of the perforation top from 0 m to (
b
b
perf
) m.
Visualizations of the above field variable are shown in Appendix C. Table 1
summarizes the parameter sampling ranges and distributions. The sampling
parameters are independent of each other with the exception of porosity and
permeability.
Scalar variables include the initial reservoir pressure at the top of the
reservoir (
P
init
), reservoir temperature (
T
), injection rate (
Q
), capillary pres-
sure scaling factor (
λ
) [36], and irreducible water saturation (
S
wi
). The pa-
rameter sampling range and distributions are summarized in Table 1. While
the scalar variables
P
init
and
T
and determined independently, cases that
yield unrealistic combinations of these variables are excluded. These field
and scalar input variables create a very high-dimensional input space, which
is very challenging for traditional CNN-based models. Nevertheless, the U-
FNO model architecture handles the high-dimensional input space with ex-
cellent data efficiency.
10
3.4. Data configuration
Each of the field variables in Figure 2A is represented by a channel in
the data input. Since we use a gradually coarsening radial grid for the nu-
merical simulations, a logarithm conversion in the radial direction is applied
in training to project the field variables onto a uniform grid that can be
represented by a (96
,
200) matrix. Notice that reservoir thickness is also a
variable and 96 cells represents a 200m thick reservoir. When the reservoir is
thinner than 200 m, we use zero-padding to denote cells that are outside of
the actual reservoir. For the scalar variables, the values are simply broadcast
into a matrix with dimension of (96
,
200).
In addition to the input variables, we also supply the spatial grid infor-
mation to the training by using one channel to denote radial cell dimensions
and another channel to denote vertical cell dimensions. The temporal grid
information is supplied into the network as an additional dimension. The in-
put to each data sample is constructed by concatenating the field variables,
scalar variables, spatial grids, and temporal grid together.
For the gas saturation and pressure buildup outputs as shown in Figure 2B
and C, we use the same logarithm conversion to project the outputs onto a
uniform grid. We then concatenate the outputs for different time snapshots
to obtain a spatial-temporal 3D volume. The pressure buildup is normalized
into zero-mean and unit-variance distribution. For gas saturation, we do
not normalize the data because the saturation values always range from 0
to 1. The dimensions of the input and outputs are shown for in each model
architecture (Appendices D to G).
The data set contains 5,000 input-to-output mappings. We use a 9/1 split
to segregate the data set into 4,500 samples for training and 500 samples for
testing.
3.5. Loss function design and training
We use a relative
lp
-loss to train the deep learning models. The
lp
-loss is
applied to both the original output (
y
) and the first derivative of the output
in the
r
-direction (
d
y
/
d
r
), and is written as:
L
(
y,
ˆ
y
) =
||
y
ˆ
y
||
p
||
y
||
p
+
β
||
d
y
/
d
r
ˆ
d
y
/
d
r
||
p
||
d
y
/
d
r
||
p
,
(12)
where ˆ
y
is the predicted output,
ˆ
d
y
/
d
r
is the first derivative of the predicted
output,
p
is the order of norm, and
β
is a hyper-parameter. This relative
11
loss has a regularization effect and is particularly effective when the data
have large variances on the norms. Our experiments show that, compared
to an
MSE
-loss, a relative loss significantly improves the performance for
both gas saturation and pressure buildup. The second term in Equation 12
greatly improves quality of predictions for gas saturation at the leading edge
of the plume. Similarly this term improves prediction of the sharp pressure
buildup around the injection well. We use the
l
2-loss for gas saturation and
pressure buildup since it provides faster convergence than the
l
1-loss.
As described in Section 3, our data set contains reservoirs with various
thicknesses and the cells outside of the reservoir are padded with zeros for
both input and output. To accommodate for the variable reservoir thick-
nesses, during training, we construct an active cell mask for each data sam-
ple and only calculate the loss within the mask. Our experiments show that
this loss calculation scheme achieves better performance than calculating the
whole field because of the better gradient distribution efficiency.
During training, the initial learning rate is 0.001 and the learning rate
gradually decreases with a constant step and reduction rate. These hyper-
parameters are optimized for the gas saturation and pressure buildup model
separately. The training stops when the loss no longer decreases.
4. Results
This section compares 4 types of model architectures: original FNO pro-
posed in Li et al. [30], the newly proposed U-FNO in this paper, a conv-FNO
that uses a
conv3d
in the place of the U-Net, and the state-of-the-art bench-
mark CNN used in Wen et al. [17]. All models are trained on the proposed
loss function (Equation 12) and directly output the 3D (96
×
200
×
24) gas
saturation and pressure field in space and time. Detailed parameters for each
model are summarized in Appendices D to G.
4.1. Gas saturation
Figure 3A and Table 2 demonstrates that the best performance for both
the training and testing data set is achieved with the U-FNO model. Specif-
ically, the testing set performance represents the true predictability of the
model for unseen data. The low relative loss clearly indicates the superior
performance of the proposed U-FNO.
Interestingly, we notice that although the original FNO has a higher train-
ing relative loss than the CNN benchmark, the testing relative loss by the
12
Figure 3: Training and testing relative loss evolution vs. epoch for U-FNO, FNO, conv-
FNO and CNN benchmark for A. gas saturation and B. pressure buildup.
original FNO is lower than that of the CNN benchmark. This indicates that
FNO has excellent generalization ability and achieves better performance
than the CNN even though FNO has a higher training relative loss. Nev-
ertheless, the original FNO has the highest relative loss in the training set
due to the inherent regularization effect by using a finite set of truncated
Fourier basis. The Conv-FNO and U-FNO architecture is therefore designed
to enhance the expressiveness by processing the higher frequency information
that are not picked up by the Fourier basis. We can observe from Figure 3A
that the training loss is significantly improved even by simply adding a plain
conv3d
in the Conv-FNO case. When the FNO layer is combined with a
U-Net in the U-FNO case, the model takes the advantages of both architec-
tures and consistently produces the lowest relative loss throughout the entire
training (Figure 3A).
Table 2: Average training and testing data set R
2
scores of 500 random samples for
4 models.
R
2
scores measures the similarity between the prediction and the numerical
simulation output; an
R
2
of 1 indicates an exact match.
Model
Gas saturation (
SG
)
Pressure buildup (
dP
)
Train
Test
Train
Test
mean
std
mean
std
mean
std
mean
std
CNN
0.994
0.007
0.986
0.018
0.990
0.029
0.988
0.023
FNO
0.990
0.013
0.986
0.017
0.991
0.014
0.990
0.017
Conv-FNO
0.993
0.011
0.990
0.013
0.992
0.018
0.991
0.018
U-FNO
0.997 0.005
0.994 0.009
0.994 0.014
0.993 0.015
In addition to considering the average performance over the entire train-
ing and testing sets, we compare model predictions for four different cases
with varying degrees of complexity in Figure 4. For each case, Figure 4 shows
13