of 12
Contents
lists
available
at
ScienceDirect
Comput. Methods Appl. Mech. Engrg.
journal
homepage:
www.elsevier.com/locate/cma
Direct
data-driven
algorithms
for
multiscale
mechanics
E.
Prume
a
,
,
C.
Gierden
b
,
M.
Ortiz
c
,
d
,
S.
Reese
a
,
e
a
Institute
of
Applied
Mechanics,
RWTH
Aachen
University,
Mies-van-der-Rohe-Str.
1,
D-52074
Aachen,
Germany
b
Institute
of
Mechanics
of
Materials,
Ruhr
University
Bochum,
Universitätsstrasse
150,
D-44801
Bochum,
Germany
c
Hausdorff
Center
for
Mathematics,
Universität
Bonn,
Endenicher
Allee
60,
53115
Bonn,
Germany
d
Division
of
Engineering
and
Applied
Science,
California
Institute
of
Technology,
1200
E.
California
Blvd.,
Pasadena,
CA
91125,
USA
e
University
of
Siegen,
Adolf-Reichwein-Str.
2a,
D-57076
Siegen,
Germany
A
R
T
I
C
L
E
I
N
F
O
Keywords:
Data-Driven
Computational
mechanics
Multiscale
mechanics
Algorithms
A
B
S
T
R
A
C
T
We
propose
a
randomized
data-driven
solver
for
multiscale
mechanics
problems
which
improves
accuracy
by
escaping
local
minima
and
reducing
dependency
on
metric
parameters,
while
requiring
minimal
changes
relative
to
non-randomized
solvers.
We
additionally
develop
an
adaptive
data-generation
scheme
to
enrich
data
sets
in
an
effective
manner.
This
enrichment
is
achieved
by
utilizing
material
tangent
information
and
an
error-weighted
k-means
clustering
algorithm.
The
proposed
algorithms
are
assessed
by
means
of
three-dimensional
test
cases
with
data
from
a
representative
volume
element
model.
1.
Introduction
In
this
work,
we
consider
two-scale
problems
where
the
finer
scale
is
characterized
by
a
representative
volume
element
(RVE)
that
captures
the
details
of
the
material
microstructure
and
serves
as
a
local
material
description
of
a
macroscopic,
mechanical
boundary
value
problem
(BVP).
This
type
of
material
modeling
is
referred
to
as
upscaling
[
1
,
2
].
In
the
FE
2
method
[
3
],
the
RVE
is
evaluated
at
every
material
point,
time
step
and
possibly
multiple
iterations,
which
can
be
exceedingly
costly.
To
reduce
the
number
of
RVE
evaluations
in
an
inelastic
setting,
[
4
]
developed
a
k-means
clustering
FE
2
approach
where
material
points
of
similar
deformation
behavior
are
grouped
and
jointly
evaluated.
Another
popular
approach
is
to
perform
a
series
of
evaluations
(or
‘‘virtual
experiments’’)
with
an
RVE
model
and
calibrate
a
phenomenological
material
model
to
the
resulting
data
set.
More
recently,
the
material
modeling
is
often
performed
by
specialized
supervised
neural
network
architectures
in
order
to
automatize
and
standardize
the
modeling
procedure
for
elastic
[
5
,
6
]
and
inelastic
[
7
,
8
]
material
behavior,
see
for
instance
[
9
,
10
]
for
a
review.
A
number
of
methods
have
been
developed
in
order
to
reduce
the
number
of
RVE
evaluations
and
improve
the
accuracy
of
the
trained
material
model
[
11
,
12
].
The
advantage
of
the
supervised
approach
is
the
fast
model
evaluation,
while
the
disadvantage
is
that
additional
modeling
uncertainties
are
introduced
by
biased
modeling
assumptions,
including
which
and
how
many
virtual
experiments
are
performed,
how
validation
and
verification
are
performed
and
what
features
and
functional
basis
are
chosen
for
the
model.
Thus,
when
performing
a
predictive
computation,
i.
e.,
a
previously
unseen
material
state,
with
a
calibrated
model,
it
is
often
not
clear
whether
the
local
response
of
the
material
model
needs
to
be
validated
by
an
RVE
evaluation
or
may
be
assumed
to
be
reliable.
In
addition,
the
cost
of
training
the
neural
network
architectures,
while
often
not
accounted
for,
can
be
very
high,
especially
with
large
data
sets
and
high-dimensional
data.
In
order
to
eschew
these
shortcomings
of
supervised
models,
in
direct,
or
unsupervised
,
data-driven
methods
a
pre-existing
material
data
set
is
directly
used
in
the
solution
of
the
boundary-value
problem
[
13
].
In
this
manner,
only
actual
unbiased
RVE
data
are
used
Corresponding
author.
E-mail
address:
erik.prume@ifam.rwth-aachen.de
(E.
Prume)
.
https://doi.org/10.1016/j.cma.2024.117525
Received
30
June
2024;
Received
in
revised
form
31
October
2024;
Accepted
31
October
2024
Computer
Methods
in
Applied
Mechanics
and
Engineering
433
(2025)
117525
Available
online
12
November
2024
0045-7825/©
2024
The
Authors.
Published
by
Elsevier
B.V.
This
is
an
open
access
article
under
the
CC
BY
license
(
http://creativecommons.org/licenses/by/4.0/
).
E.
Prume
et
al.
for
purposes
of
material
characterization.
Evidently,
the
accuracy
of
the
solution
depends
on
the
degree
of
coverage
supplied
by
the
material
data
set,
which
can
be
efficiently
monitored
using
a
distance
function
[
13
15
].
This
quantitative
metric
can
be
used
to
identify
regions
of
insufficient
data
coverage
and
to
develop
strategies
to
acquire
new
data
in
a
systematic
manner.
The
direct
data-driven
method
has
been
extended
to
large
deformations
[
16
,
17
],
inelastic
material
behavior
[
18
,
19
]
and
uncertainty
quantification
[
20
22
].
To
improve
the
accuracy
with
sparse
data
sets,
[
23
,
24
]
include
material
tangent
information
in
the
solving
process.
Furthermore,
the
method
has
been
applied
to
other
field
equations
than
mechanics,
such
as
magnetism
[
25
]
or
electro-mechanical
coupled
problems
[
26
].
In
[
27
],
the
direct
data-driven
method
has
been
reformulated
in
order
to
identify
stress
data
from
a
sequence
of
strain
field
measurements.
Similarly,
[
28
]
proposes
an
automated
discovery
scheme
for
plastic
material
laws
using
only
strain
data.
While
this
paper
aims
to
enable
efficient
multiscale
computations
with
synthetic
microstructures,
the
direct
data-driven
method
has
been
also
used
with
real
data
measurements,
based
on
the
identification
of
two-dimensional
stress
fields
[
29
31
]
using
Digital
Image
Correlation
(DIC)
measurements.
Particularly
challenging,
but
not
impossible,
is
obtaining
strain
data
in
the
three-dimensional
case
using
Digital
Volume
Correlation
(DVC)
measurements
[
32
],
which
in
turn
can
be
used
to
identify
the
corresponding
stress
data.
Furthermore,
domain
specific
measurement
techniques
are
used
to
obtain
real
3D
data,
see
for
example
[
33
]
which
uses
elastography
data
for
a
viscoelastic
biomechanical
problem.
This
paper
has
two
novel,
significant
improvements
over
previous
work.
First,
it
tackles
the
long-standing
problem
of
finding
the
global
optimizer
of
the
double
minimization
problem
[
13
]
which
was
also
reported
in
[
34
36
].
This
problem
is
in
particular
relevant
for
rather
sparse
data
sets
with
three-dimensional,
nonlinear
material
behavior.
We
propose
a
simple
variation
of
the
original
fixed-point
algorithm
that
effectively
helps
to
escape
local
minima
and
reduces
the
dependence
on
the
choice
of
metric
coefficients.
Compared
to
previous
works,
the
particular
advantages
of
the
proposed
approach
are
that
it
is
easy
to
implement
and
does
not
increase
the
computational
effort,
yet
is
effective.
Second,
we
contribute
an
incremental
improvement
for
the
systematic
data
acquisition
problem
in
the
multiscale
context
compared
to
previous
work
[
15
,
37
].
In
particular,
we
show
how
to
exploit
the
material
tangent
information
and
the
intrinsic
error
measure
of
the
direct
data-driven
approach
using
a
weighted
k-means
algorithm.
The
result
is
an
effective
and
consistent
data
set
enrichment
scheme
for
nonlinear
and
path-dependent,
force
and
displacement
driven
problems,
thus
minimizing
the
number
of
costly
RVE
evaluations
in
an
unsupervised
manner.
The
paper
is
structured
accordingly:
in
Section
2
,
we
briefly
review
the
theory
of
direct
data-driven
mechanics;
in
Section
3
the
improved
solver
is
formulated
and
tested
on
a
numerical
example;
in
Section
4
the
adaptive
data
generation
scheme
is
developed
and
evaluated
with
the
aid
of
a
numerical
example.
Finally,
in
Section
5
we
summarize
our
findings
and
provide
an
outlook
for
further
work.
2.
Preliminaries:
Direct
data-driven
mechanics
In
this
section,
we
provide
a
compact
review
of
the
direct
data-driven
computing
paradigm
as
proposed
by
[
13
]
and
the
notation
used
in
the
following
sections.
We
consider
discretized
mechanical
systems
of
material
points,
each
characterized
by
a
strain
state
R
,
a
stress
state
R
and
a
volume
R
.
Hereby,
the
index
indicates
the
specific
material
point
and
is
the
dimension
of
stress
and
strain,
that
is
=
6
in
the
considered
three-dimensional,
geometrical
linear
case.
In
addition,
we
define
the
weighted
Euclidean
norms
2
∶=
C
and
2
∶=
C
−1
in
order
to
account
for
different
units
of
stresses
and
strains.
The
matrix
C
contains
metric
coefficients
whose
influence
is
studied
in
Section
3
.
The
idea
of
the
direct
data-driven
approach
is
to
formulate
the
solution
of
the
boundary
value
problem
explicitly
in
the
given
experimental
stress–strain
data,
which
is
given
in
the
form
of
local
data
set(s)
=
{(
,
)}
=1
(1)
with
material
data
points.
The
direct
data-driven
solution
is
then
defined
as
that
assignment
of
data
points
to
each
material
point
which
fulfills
the
physical
constraints
of
force
equilibrium
and
kinematical
compatibility
simultaneously
in
an
optimal
sense.
We
describe
the
kinematical
compatibility
by
the
set
∶=
{
(
1
,
...
,
)
R
푚푑
=
푢,
=
1
,
...
,
}
(2)
of
all
strain
fields
which
are
derived
from
displacement
fields
R
by
the
geometrical
discretized
gradient
operator
R
×
.
Hereby,
is
the
number
of
degrees
of
freedom
of
the
overall
system
including
boundary
conditions.
Similarly,
the
force
equilibrium
is
described
by
the
set
∶=
{
(
1
,
...
,
)
R
푚푑
=1
=
}
(3)
of
all
stress
fields
which
fulfill
the
discretized
force
equilibrium
given
a
force
field
R
.
Similarly
to
the
local
norms,
we
define
2
∶=
=1
2
and
2
∶=
=1
2
as
the
global
norms
accounting
for
the
different
material
point
volumes
.
Formally,
we
denote
the
set
of
all
possible
assignments
of
data
points
to
material
points
by
=
1
×
2
×
...
×
(4)
and
one
such
assignment
by
(
,
)
=
((
=1
,
...
,
=
)
,
(
=1
,
...
,
=
))
.
We
note
that
Eq. (
2
) and (
3
) constrain
the
strain
and
stress
fields
independently
of
each
other
and
the
coupling
between
the
two
fields
is
only
established
by
the
material
data
set.
Computer
Methods
in
Applied
Mechanics
and
Engineering
433
(2025)
117525
2
E.
Prume
et
al.
Using
these
notations,
the
direct
data-driven
problem
can
be
formulated
by
min
(
,휎
)∈
(
min
2
+
min
2
)
(5)
with
an
optimal
data
assignment
(
,
)
and
its
corresponding,
closest
admissible
strain
and
stress
fields
(
휀,
)
as
the
solution.
For
the
computation
of
the
closest
compatible
strain
field
and
the
closest
equilibrium
stress
field
we
introduce
the
two
projection
operations
(
)
=
ar
g
min
2
and
(
)
=
ar
g
min
2
which
consists
in
solving
two
linear,
decoupled
systems
of
equations
[
r
ed
C
r
ed
]
r
ed
=
r
ed
C
(6a)
[
r
ed
C
r
ed
]
r
ed
=
r
ed
r
ed
.
(6b)
Here
r
ed
is
the
assembled,
discretized
gradient
operator,
or
B-operator
after
removing
the
degrees
of
freedom
associated
to
Dirichlet
boundary
conditions,
=
diag(
0
,
1
,
...
,
)
the
material
point
volumes,
C
=
diag(
C
1
,
C
2
,
...
,
C
)
the
metric
coefficients
and
the
applied
external
force
vector.
From
the
two
solutions
of
Eq. (
6
) the
projection
operations
follow
as
(
)
=
(7)
(
)
=
+
C
.
(8)
Note
that
denotes
the
assembled
B-operator
including
the
degrees
of
freedom
accounting
for
boundary
conditions.
Also,
contains
r
ed
as
well
as
the
applied
Dirichlet
boundary
conditions
and
contains
r
ed
as
well
as
zeros
at
the
constraint
degrees
of
freedom.
For
the
two
projection
operations
involved,
i.e
.
the
computation
of
the
closest
compatible
strain
field
(
)
=
ar
g
min
2
and
the
closest
equilibrium
stress
field
(
)
=
ar
g
min
2
only
two
linear,
decoupled
system
of
equations
have
to
be
solve.
It
is
important
to
realize
that
the
number
of
possible
data
assignments
|
|
=
=1
is
of
combinatorial
complexity
and
a
solution
to
problem (
5
) is
not
directly
computable.
Therefore,
several
heuristic
solver
schemes
has
been
proposed.
The
fixed-point
solver
proposed
by
[
13
]
which
we
call
in
the
following
the
‘‘original
fixed-point
solver’’
is
summarized
in
Alg.
1
.
Algorithm
1
Original
Fixed-Point
Solver
Require:
Constraint
set
=
×
.
Require:
Local
data
set(s)
=
{(
,
)}
=1
.
Require:
Convergence
tolerance
TOL
Initialize:
(
,
)
randomly
Initialize:
0
0
,
while
|
0
|
<
TOL
do
Set:
(
)
Set:
(
)
Set:
0
Set:
1
=1
(
+
)
for
=
1
...
do
Set:
(
,
)
ar
g
min
(
,휎
)∈
(
2
+
2
)
end
for
end
while
return
(
,
)
,
(
휀,
)
3.
Randomized
fixed-point
solver:
An
alternative
solver
scheme
3.1.
Motivation
The
first
contribution
is
an
enhancement
of
the
original
solver
scheme
summarized
in
Alg.
1
,
which
is
beneficial
for
all
kind
of
data-driven
problems
within
the
present
framework.
As
also
reported
in
[
34
,
36
],
the
original
solver
fails
to
find
the
global
minimizer
of
the
optimization
problem,
in
particular
in
the
case
of
sparse
data
sets.
In
addition,
the
accuracy
depends
on
the
metric
parameter
C
,
which
has
to
be
chosen
in
advance.
To
improve
the
accuracy,
[
34
]
proposes
a
mixed-integer
global
optimization
technique,
which
becomes
however
computationally
inefficient
for
larger
systems.
More
closely
to
the
original
algorithm,
[
36
,
38
]
proposes
to
adapt
the
metric
coefficient
online
in
an
adaptive
way
in
order
to
improve
convergence
properties,
which
requires
however
to
assemble
the
stiffness
matrix
and
build
the
data
search
structures
multiple
times
during
the
online
computation
and
additional
local
fitting
procedures.
Computer
Methods
in
Applied
Mechanics
and
Engineering
433
(2025)
117525
3
E.
Prume
et
al.
3.2.
Proposed
method:
Randomized
fixed-point
solver
We
propose
a
variation
of
the
original
algorithm
that
overcomes
the
aforementioned
issues
while
requiring
only
minimal
changes.
The
basic
idea
of
the
algorithm
is
to
perform
the
data
search
separated
for
stresses
and
strains
in
each
material
point
in
a
randomized
manner.
In
contrast
to
the
original
algorithm,
which
assigns
to
each
material
point
the
data
point
that
minimizes
simultaneously
the
distance
of
stresses
and
strains,
our
proposed
algorithm
searches
only
for
the
data
which
is
closest
with
respect
to
either
stresses
or
strains.
If
a
given
material
point
executes
a
strain
search
(and
not
a
stress
search),
it
is
decided
randomly
with
probability
=
1∕2
at
each
iteration.
Since
we
cannot
expect
that
the
error
will
decrease
in
every
iteration,
it
is
necessary
to
adapt
the
convergence
criteria.
The
algorithm
will
be
terminated
if
the
optimal
solution
cannot
be
further
improved
after
a
certain
number
of
iterations.
The
algorithm
is
summarized
in
Alg.
2
.
Algorithm
2
Randomized
Fixed-Point
Solver
Require:
Constraint
set
=
×
.
Require:
Local
data
set(s)
=
{(
,
)}
=1
.
Require:
Maximum
number
of
iterations
with
increasing
error
t
r
ial
Initialize:
(
,
)
randomly
Initialize:
min
min
=
0
,
=
0
while
<
min
+
t
r
ial
do
Set:
(
)
Set:
(
)
Set:
1
=1
(
+
)
if
<
min
then
Set:
min
Set:
min
end
if
Set:
+
1
for
=
1
...
do
Set:
Ber
(1∕2)
#Bernoulli
random
variable
if
r
=
0
then
Set:
(
,
)
ar
g
min
(
,휎
)∈
2
else
Set:
(
,
)
ar
g
min
(
,휎
)∈
2
end
if
end
for
end
while
return
(
,
)
,
(
휀,
)
3.3.
Numerical
example:
Stamp
3.3.1.
Setup
We
compare
the
proposed
randomized
fixed-point
solver
with
the
original
fixed-point
solver
by
means
of
a
numerical
example.
The
considered
boundary
value
problem
is
shown
in
Fig.
1
.
The
data
set
in
this
section
is
set
to
the
reference
solution
using
a
three-dimensional
von-Mises
plasticity
material
model
with
linear
isotropic
hardening
containing
34,606
data
points.
The
material
parameters
are
chosen
as
follows:
Young’s
modulus
=
1000
.
0
[MPa],
Poisson’s
ratio
=
0
.
3
,
hardening
modulus
=
100
.
0
[MPa]
and
initial
yield
strength
0
=
25
.
0
[MPa].
The
shear
loading
̄푢
(
)
=
(10
max
,
0
,
0)
is
applied
in
10
load
steps.
The
finite
element
mesh
consists
of
3146
linear
tetrahedral
elements
in
three
dimensional
space.
3.3.2.
Results
First,
we
investigate
the
influence
of
the
metric
coefficients
C
=
,
where
we
set
the
metric
tensor
proportional
to
the
identity
matrix.
We
consider
a
single
time
step
at
=
0
.
5
푚푎푥
.
The
error
is
defined
by
the
mean
distance
of
the
converged
solution
(
,
)
to
the
reference
solution
(
r
ef
,
r
ef
)
as
r
ef
=
1
=1
(
푒,
r
ef
+
푒,
r
ef
)
using
a
fixed
reference
metric
C
푒,
r
ef
=
1000
so
that
the
error
measurement
is
independent
of
the
metric
parameter
and
thus
comparable
for
different
choices
of
.
Since
the
reference
solution
is
included
in
the
data
set,
we
ideally
expect
any
data-driven
solver
scheme
to
find
a
solution
which
satisfies
equilibrium
and
compatibility
up
to
numerical
precision
of
the
data.
The
results
shown
in
Fig.
2
(A)
depicts
the
dependency
of
the
original
solver
scheme
on
the
metric
coefficient
.
Additionally,
the
original
algorithm
fails
to
find
the
global
minimizer
even
for
an
optimal
choice
of
the
metric.
In
contrast,
remarkably,
the
proposed
randomized
solver
scheme
computes
independently
of
the
chosen
metric
an
almost
optimal
solution.
Performing
30
independent
runs,
the
original
algorithm
achieves
in
the
best
case
for
=
1000
an
error
of
0.55
while
the
randomized
solve
achieves
on
average
over
all
choices
of
an
error
of
2
.
2
10
−5
with
a
standard
deviation
of
2
.
0
10
−4
.
These
measures
are
heavily
influenced
by
the
outliers
which
does
not
find
the
exact
solution:
423
out
of
450
(15
different
metrics
times
30
samples
each)
computations
have
an
error
below
10
16
meaning
that
the
exact
reference
solution
was
found
in
these
Computer
Methods
in
Applied
Mechanics
and
Engineering
433
(2025)
117525
4
E.
Prume
et
al.
Fig.
1.
Stamp
boundary
value
problem.
Fig.
2.
(A)
Influence
of
the
metric
parameter
on
the
original
and
randomized
algorithm.
Shaded
areas
denote
the
maximum
deviations
from
the
mean
of
30
independent
runs.
(B)
Error
in
constraints
during
the
iterations
for
both
algorithms
for
=
1000
.
(C)
Blue:
Local
data
set,
orange:
optimal
data
point
assignment
of
each
material
point.
cases.
Thus,
the
randomized
solver
finds
independently
of
the
metric
coefficient
in
most
runs
the
optimal
solution.
In
Fig.
2
(B)
the
evolution
of
the
error
during
the
iterations
is
shown
for
both
algorithms
for
=
1000
.
While
the
original
algorithm
takes
in
the
first
iterations
large
steps
towards
the
minimum,
it
cannot
effectively
escape
local
minima.
The
randomized
algorithm
starts
slower
and
does
not
necessarily
decrease
the
error
in
each
iteration,
however,
it
has
the
capability
to
compute
almost
optimal
solutions.
Fig.
2
(C)
visualizes
the
local
data
set
and
the
computed
solution
of
optimal
data
assignments
of
each
material
point.
The
randomized
solver
comes
with
no
computational
overhead.
The
computation
time
per
iteration
for
the
original
algorithm
is
on
average
0.034
[s]
and
0.025
[s]
for
the
randomized
algorithm.
One
explanation
for
this
is
that
the
data
search
in
the
Computer
Methods
in
Applied
Mechanics
and
Engineering
433
(2025)
117525
5