Journal Pre-proof
A shared structure for emotion experiences from narratives, videos and everyday life
Yanting Han, The COVID-Dynamic Team, Ralph Adolphs
PII: S2589-0042(24)01603-1
DOI: https://doi.org/10.1016/j.isci.2024.110378
Reference: ISCI 110378
To appear in: iScience
Received Date: 7 December 2023
Revised Date: 3 May 2024
Accepted Date: 24 June 2024

Please cite this article as: Han, Y., The COVID-Dynamic Team, Adolphs, R., A shared structure for emotion experiences from narratives, videos and everyday life, iScience (2024), doi: https://doi.org/10.1016/j.isci.2024.110378.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition
of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of
record. This version will undergo additional copyediting, typesetting and review before it is published
in its final form, but we are providing this version to give early visibility of the article. Please note that,
during the production process, errors may be discovered which could affect the content, and all legal
disclaimers that apply to the journal pertain.
© 2024 Published by Elsevier Inc.
A shared structure for emotion experiences from narratives, videos and everyday life

Yanting Han 1,3,†, The COVID-Dynamic Team* and Ralph Adolphs 1,2

1 Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125, USA
2 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
3 Lead contact
* See Acknowledgement for contributions.
† Corresponding author: Yanting Han
Email: yhhan@caltech.edu
Summary
Our knowledge of the diversity and psychological organization of emotion experiences is based primarily on studies that used a single type of stimulus with an often limited set of rating scales and analyses. Here we take a comprehensive, data-driven approach. We surveyed 1,000+ participants on a diverse set of ratings of emotion experiences in response to a validated set of ca. 150 text narratives, a validated set of ca. 1,000 videos, and over 10,000 personal experiences sampled longitudinally in everyday life, permitting a unique comparison. All three types of emotion experiences were characterized by similar dimensional spaces that included valence and arousal, as well as dimensions related to generalizability. Emotion experiences were distributed along continuous gradients, with no clear clusters even for the so-called basic emotions. Individual differences in personality traits were associated with differences in everyday emotion experiences, but not with emotions evoked by narratives or videos.
Introduction
Emotions are ubiquitous [1] and their subjective experience is a highly salient aspect of our lives [2]. Emotions interact with many other psychological processes, such as memory and decision-making, and are a key feature of abnormal functioning in psychiatric disorders [3,4]. Despite their patent importance, the scientific understanding of emotions has been modest so far. In good part this arises from heterogeneity and disagreements in the literature. One likely reason for the disparate findings is that different studies in fact study quite different phenomena (such as concepts, experiences, or effects on cognition), use different analytic approaches and, notably, different stimuli for inducing emotions in the first place. As we review below, each stimulus type in isolation has limitations. We therefore sought to compare emotion experiences evoked, in the same set of participants, across three very different stimulus types: narratives, videos, and everyday life.
One core debate regarding the structure of emotion experience can be roughly summarized as dimensional vs. categorical. The dimensional perspective typically proposes “core affect,” which consists of two dimensions: valence and arousal. Additional components can then be added to account for the full richness and diversity of human emotion experience [5-7]. In contrast, classical basic emotion theory argues that emotions are best described by only a few basic and more categorical emotions, such as happiness, surprise, fear, anger, disgust, and sadness [8,9]. Recent revised versions of the theory propose a more comprehensive taxonomy, perhaps encompassing around 20 discrete kinds of emotions [10]. At its core, classical basic emotion theory predicts unique diagnostic patterns of subjective experience, bodily activation, and neural activity for each basic emotion category, with consistency and specificity [11]. The varied evidence for each of these different views comes from studies that have individually used different types of stimuli.
There is a large literature on the self-reported subjective experience of emotion, elicited by a variety of stimuli [12]. Across the many studies that have used the simplest lexical stimuli, single emotional words, valence and arousal have been identified most consistently as the two dimensions that characterize emotions [6]. Disagreements on the number and interpretation of dimensions can in many cases be largely attributed to the use of different scales and words [6,13-16]. However, it seems unlikely that reading a single word would induce a potent emotion; more plausibly, participants default to providing what the word is supposed to represent, that is, the word's conceptual meaning. It can therefore be argued that studies using single words measured the conceptual semantic space of the emotional words, rather than people's actual emotion experiences. In contrast, narratives provide a more vivid description of an emotional situation and are likely more specific and effective at eliciting emotion experiences [17]. In the present study, we used one such set that had been used in a prior study to characterize the patterns of evoked brain activations [18].
More potent inducers of emotion experiences than either words or narratives are audiovisual stimuli. Videos, such as episodes from films, are commonly used to elicit emotions, but are often limited to a small number of prototypical emotion examples belonging to a few categories [19,20]. Moreover, participants are typically asked to rate just the intended emotion categories rather than a comprehensive set of scales [21-24]. The results thus test primarily for the successful elicitation of a priori defined target emotions (as defined by the experimenter), but cannot provide strong evidence one way or the other about the categorical nature of emotion experiences [25], since the stimuli lack the number and diversity required. However, there are important exceptions. For instance, Cowen and Keltner recently developed a new set of short videos targeting 34 pre-defined emotion categories [10], a substantial improvement over previous sets in terms of the diversity and quantity of stimuli (there are over 2,000 videos in their database). The present study used a selection of these same videos as its second set of stimuli.
Although various kinds of stimuli have been shown to be effective at inducing emotions in the lab, sampling people's subjective experiences in daily life is of course far less constrained and comes with greater ecological validity [26]. Prior studies have examined the prevalence of certain emotions in daily life by asking participants to indicate the presence of emotion categories, and found that, in general, positive emotions are more pronounced than negative ones, at least in terms of how they are reported [1,27]. In addition, early studies of real-world emotions consistently identified valence and arousal as the two dominant features of momentary affect [28-30]. We attempted to extend and improve on prior work by assessing the experiences of a larger and more diverse sample of participants longitudinally, using a more comprehensive set of scales, and during a particularly turbulent time (the COVID pandemic and its associated events).
The studies thus far do not provide a consensus on the structure of subjective emotion experience [31-33]. Importantly, the constraints of specific studies are likely to interact in producing divergent findings that have continued to fuel debates in the literature. For instance, discrete emotions have been reported to account for variance that is not captured by the standard valence and arousal scales, but in studies of response times to single words [34], which are a very different dependent measure in a very different task context than self-report of subjective feelings. Of course, all stimuli can be rated on discrete emotions, and such databases (including the ones we used for narratives [18] and videos [10], as well as a number of others for single words in a range of different languages [35-39]) provide complementary characterizations of stimuli on discrete emotion labels beyond standard affective dimensions such as valence and arousal. In fact, we included both discrete emotions and dimensional features in the present study and asked whether such emotion labels, and the concepts they denote, when applied specifically to emotion experience induced by potent stimuli, reveal a more continuous dimensional space or discrete clusters.
It is important to note that instances of emotion categories can be integrated into a dimensional framework when they are represented as points in a dimensional space [40]. A dimensional approach does not at all preclude the discovery of categories; rather, it offers a quantitative framework to test for evidence of categories: are they objective categories represented as discrete clusters with firm boundaries, or are they best understood as more conceptual or conventional categories that lack a data-driven basis [41]? We used precisely this approach in our study, quantifying emotion experiences in a dimensional space provided by our rating scales, and then testing whether the results showed any evidence of clusters that might constitute discrete categories.
To cast a broad net over the features that might characterize emotion experiences, we drew from a larger set of rating words from three main sources: standard emotion terms (e.g., the names of the so-called basic emotions, such as happiness and sadness [9]), affective features (such as valence and self-relevance) [10,15,18], and biologically inspired attributes (such as generalizability and persistence) [42]. We were particularly interested to see whether the latter, biologically inspired, features might reveal additional dimensions of variability. Importantly, we wanted to ensure that our rating terms captured as diverse a range of judgments as possible (while being non-redundant and clear in their meaning). To this end, we began with a broadly sampled set of around 70 terms and eventually reduced these to 28 rating scale labels, which we verified to be representative of the initial set. We also verified that these 28 scales were in fact relatively high-dimensional in terms of their semantic meaning, and thus did not artificially constrain the dimensionality of the emotion experiences that they were used to rate (see STAR methods for details).
We used three types of emotion-eliciting stimuli: validated narratives [18] and video clips [10] from prior studies (sampled so as to maximize their diversity; see STAR methods), as well as a rich array of real-life experiences sampled longitudinally during the COVID pandemic [43]. We note that the three stimulus types differ in their degree of apparent selection bias. The narratives were constructed from scratch to target 20 emotion categories and can therefore be seen as the most stereotypical stimuli for each pre-defined emotion category. The videos were curated from the internet with 34 emotion categories in mind, again leading to a set of relatively stereotypical examples. The real-life experiences, however, aimed to capture the full, unbiased range of variation observed in everyday life. We do note here that, as with any study, our results are of course still limited both by the sampling of stimuli and by the choice of rating terms.
We started by quantifying the pairwise correlations between rating scales and asked whether the similarity structure was shared across the different types of inducing stimuli, using representational similarity analysis (RSA). Given the correlations across rating scales, we further used exploratory factor analysis (EFA) to derive a small number of interpretable dimensions that capture most of the variance in the original high-dimensional space. We then probed the distribution of emotion experiences using dimensionality embedding and clustering techniques to visualize and identify clusters of emotions. Finally, enabled by the rich set of psychological background assessments in our participants, we provide a preliminary exploration of how individuals might differ in their emotion experiences evoked by our three stimulus types as a function of demographic and personality factors. Importantly, our aim to be as comprehensive as possible was motivated by a strongly data-driven approach. We did not set out to test any specific hypotheses or emotion theory; instead, we aimed to let the diversity of stimuli, ratings, and analyses speak for themselves.
Results
Similar representational structures across stimulus domains.
We assessed within-subject consistency (with Pearson correlations) and between-subject consensus (with split-half reliability) across scales (Figure 2; see also Fig. S3 for full details). The pattern across scales was robust across experiment sessions and across evaluation metrics. Readability, as evaluated using grade levels, did not correlate significantly with scale quality, whether the latter was evaluated using median test-retest reliability (r = -0.19, p = 0.324) or median split-half reliability (r = -0.21, p = 0.295).
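To illustrate the split-half consensus metric, the sketch below (Python, assuming a hypothetical stimuli-by-raters matrix for a single rating scale; not the exact implementation used here) repeatedly splits raters in half, correlates the two halves' mean rating profiles, and applies the Spearman-Brown correction:

    import numpy as np
    from scipy.stats import pearsonr

    def split_half_reliability(ratings, n_splits=100, seed=0):
        # ratings: (n_stimuli, n_raters) array for one scale (hypothetical layout).
        rng = np.random.default_rng(seed)
        n_raters = ratings.shape[1]
        estimates = []
        for _ in range(n_splits):
            perm = rng.permutation(n_raters)
            half_a = ratings[:, perm[:n_raters // 2]].mean(axis=1)
            half_b = ratings[:, perm[n_raters // 2:]].mean(axis=1)
            r, _ = pearsonr(half_a, half_b)
            estimates.append(2 * r / (1 + r))  # Spearman-Brown correction
        return float(np.mean(estimates))

    # Simulated example: 150 narratives rated by 40 raters on one scale
    sim = np.random.default_rng(1).normal(size=(150, 40)) + np.linspace(0, 1, 150)[:, None]
    print(split_half_reliability(sim))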
After excluding five scales due to their low reliability (see STAR methods), we derived pairwise Pearson correlation matrices across the remaining scales for emotions evoked by narratives, emotions evoked by videos, and real-life emotions, respectively (Figure 3). We observed strong correlations across scales, suggesting that dimensionality could be reduced to represent the psychological space more efficiently. Most consistently, across all of the stimulus domains, two correlated groups of scales emerged: those scales whose higher ratings indicate emotions to be more negatively valenced (afraid, worried, physical disgust, angry, moral disgust) and those scales whose higher ratings indicate emotions to be more positively valenced (valence, happy, safety, fairness). Note that most scales were either not committed to any particular valence or were bivalent, thus capturing both negative and positive valence.
Figure 3 further suggested that emotion experiences across stimulus types shared a similar broad correlation structure. To make a formal comparison across all three stimulus types, we used only the 18 rating scales that were shared across all three stimulus domains, confirming the visual impression that the representational structure was highly consistent across stimulus types (Fig. S4; second-order rs = 0.953, 0.923 and 0.909 for narratives and videos, narratives and real-life, and videos and real-life, respectively; all ps < 0.0001).
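As an illustration of this representational similarity analysis, a minimal sketch (assuming hypothetical stimulus-by-scale rating matrices for two domains; the use of Pearson for the second-order correlation is an assumption) computes each domain's scale-by-scale correlation matrix and then correlates their vectorized upper triangles:

    import numpy as np
    from scipy.stats import pearsonr

    def second_order_similarity(ratings_a, ratings_b):
        # ratings_*: (n_stimuli, n_scales) mean ratings; scales in matching order.
        ra = np.corrcoef(ratings_a, rowvar=False)   # scale-by-scale correlations, domain A
        rb = np.corrcoef(ratings_b, rowvar=False)   # scale-by-scale correlations, domain B
        iu = np.triu_indices_from(ra, k=1)          # unique off-diagonal entries
        return pearsonr(ra[iu], rb[iu])             # second-order r and p

    # Hypothetical example: 149 narratives and 996 videos rated on 18 shared scales
    rng = np.random.default_rng(0)
    r, p = second_order_similarity(rng.normal(size=(149, 18)), rng.normal(size=(996, 18)))
    print(r, p)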
Low-dimensional spaces underlie emotion experience.
To represent the psychological space using a smaller number of underlying factors, we performed exploratory factor analysis. The number of factors (dimensions) to retain is well known to be indeterminate, and a number of metrics are commonly used in the literature: Very Simple Structure, Empirical BIC, Velicer's MAP, parallel analysis, the acceleration factor, and the optimal coordinate metric, all of which we used. However, these criteria do not always agree, since they are based on different assumptions, and we therefore prioritized an entirely data-driven approach: empirical cross-validation, in which we applied exploratory factor analysis to half of the data and confirmatory factor analysis to the other half. To ensure the robustness of our results, we also systematically decimated both the number of stimuli and the number of scales, testing for the stability of the results when the analysis was re-done on a randomly sampled subset of the data. Finally, we considered the interpretability of the final results (see STAR methods for a complete description of our analyses).
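The flavor of this cross-validation can be sketched as follows (Python with scikit-learn; a simplified stand-in that scores held-out log-likelihood rather than fitting a separate confirmatory model, and uses simulated data in place of the actual ratings):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.model_selection import cross_val_score

    # Hypothetical ratings matrix: stimuli (or events) x rating scales, z-scored.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 23))

    # Score 1..6-factor models by mean held-out log-likelihood (5-fold CV);
    # the preferred model is the smallest one after which gains become marginal.
    for k in range(1, 7):
        ll = cross_val_score(FactorAnalysis(n_components=k), X, cv=5).mean()
        print(f"{k} factor(s): held-out log-likelihood = {ll:.3f}")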
For emotions evoked by narratives, results from the six commonly used statistical tests suggested retaining either 2 or 3 factors. The data-driven cross-validation procedure (Fig. S5a) showed that two factors were most appropriate: there was a significant improvement in explained variance from EFA, as well as in model fit from CFA, as the number of factors increased from 1 to 2, but adding further factors subsequently showed only marginal improvement. We next assessed the robustness of the 2-factor and 3-factor solutions and found both to be robust with regard to the number of stimuli and the number of scales (Fig. S6a). We thus decided to retain 3 factors for completeness.
For emotions evoked by videos, results from the six commonly used statistical tests again did not converge on a single number, but suggested retaining 1, 4, or 6 factors. The cross-validation procedure suggested that 2 to 4 factors all seemed reasonable (Fig. S5b). We further assessed the robustness of the 3-factor and 4-factor solutions (Fig. S6b). Both were robust with regard to the number of stimuli, but the 4-factor solution was not robust with regard to the number of scales (the “safety” factor was unstable). We thus decided to retain 3 factors.
For real-life emotions, results from the six commonly used statistical tests did not converge on a single number, but suggested retaining 1, 2, 6, or 8 factors. The cross-validation procedure suggested that 2 to 5 factors all seemed reasonable (Fig. S5c). We further assessed the robustness of the 3-factor and 4-factor solutions (Fig. S6c). Both were robust with regard to the number of stimuli, but the 3-factor solution was not robust with regard to the number of scales (the “negative affect” factor was unstable). We thus decided to retain 4 factors.
We then used exploratory factor analysis to extract three, three, and four factors for the narrative-evoked, video-evoked, and real-life emotions, respectively, using the minimal residual method; solutions were rotated with oblimin for interpretability (see Figure 4 for factor loadings). For emotions evoked by narratives, the three factors, interpreted as “valence”, “arousal” and “generalizability”, explained 48%, 21% and 13% of the common variance in the data, respectively (82% in total; 84% in total if four factors were extracted). For emotions evoked by videos, the three factors, interpreted again as “valence”, “arousal” and “generalizability”, explained 41%, 24% and 15% of the common variance, respectively (81% in total; 83% in total if four factors were extracted). Finally, for real-life emotions, the four factors, interpreted as “valence”, “negative affect”, “arousal”, and “common”, explained 25%, 13%, 10%, and 5% of the common variance, respectively (54% in total; 58% in total if five factors were extracted). These quantitative results support a low dimensionality of about 3 in our data, as adding further dimensions provides only incremental improvements in explained variance.
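A minimal sketch of this extraction step (assuming the third-party factor_analyzer Python package as a stand-in for the tooling actually used, and a hypothetical stimuli-by-scales matrix) would be:

    import numpy as np
    from factor_analyzer import FactorAnalyzer  # assumed package, not necessarily that used here

    # Hypothetical ratings matrix: stimuli (or events) x rating scales.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(996, 23))

    # Minimum-residual extraction with an oblique (oblimin) rotation,
    # mirroring the minres + oblimin choices described in the text.
    fa = FactorAnalyzer(n_factors=3, method="minres", rotation="oblimin")
    fa.fit(X)

    loadings = fa.loadings_                                  # scales x factors
    variance, proportion, cumulative = fa.get_factor_variance()
    print(np.round(loadings, 2))
    print("proportion of variance explained per factor:", np.round(proportion, 3))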
Note that the “valence” and “arousal” factors were identified across all stimulus domains, in line with previous literature [6]. The “generalizability” factor is, to the best of our knowledge, novel. The “common” factor describes how common and persistent emotions are, and is probably related to the “generalizability” factor, which we did not properly assess for real-life emotions (the specific scales with the highest loadings on the generalizability factor were not included for assessing real-life emotions, as we deemed them inapplicable there).
A specific set of scales (disgust, fear, surprise, and anger) loaded strongly onto the “negative affect” factor that was seen uniquely for the real-life emotions. One interpretation is that this factor was specific to experiences during the COVID pandemic (and all the other stressors associated with it), during which we sampled real-life emotions. The temporal pattern of the factor across the longitudinal waves of data collection (Fig. S7) revealed a generally decreasing trend (in contrast to the severity of the pandemic as measured by deaths and cases), which may indicate that people adapted to the pandemic over time as restrictions relaxed and mask use increased. In addition, there was a notable peak around wave 7, which was the data collection closest (June 6-7, 2020) to George Floyd's death. We observed differences on this factor with respect to gender, geographic region, and political party affiliation: females, people residing in the West and Northeast, and Democrats scored higher than males, those residing in the Midwest and South, and Republicans (Fig. S8).
Noting the consistency of the overall correlation structure and the semantic similarity of the factors across stimulus types, we directly tested the idea of shared latent factors across stimulus domains (see Fig. S9 for the determination of the number of factors that would best explain the shared structure). Using the correlation matrices across the 18 shared scales, we extracted two factors from each of the three types of data and quantified the relatedness of the factors by calculating factor congruence (Figure 4d). A two-dimensional structure of emotion experience was indeed consistent across stimulus types, as indicated by high levels of factor congruence.
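Factor congruence can be illustrated with Tucker's congruence coefficient; the sketch below (using hypothetical loading matrices over the 18 shared scales, not the actual loadings) compares every pair of factors across two domains:

    import numpy as np

    def tucker_congruence(load_a, load_b):
        # load_a, load_b: (n_scales, n_factors) loading matrices with matching scale order.
        # Returns an (n_factors_a, n_factors_b) matrix; values near 1 indicate
        # essentially identical factors, values near 0 unrelated ones.
        num = load_a.T @ load_b
        denom = np.sqrt(np.outer((load_a ** 2).sum(axis=0), (load_b ** 2).sum(axis=0)))
        return num / denom

    # Hypothetical example: 18 shared scales, two factors per domain
    rng = np.random.default_rng(0)
    narr_loadings = rng.normal(size=(18, 2))
    video_loadings = narr_loadings + 0.1 * rng.normal(size=(18, 2))  # nearly identical factors
    print(np.round(tucker_congruence(narr_loadings, video_loadings), 3))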
To address possible concerns that our results regarding the number and nature of affective dimensions might depend on the use of exploratory factor analysis, we further analyzed our data using several dimensionality reduction techniques commonly used in prior work (see STAR methods for details) [10,44,45]. All methods (principal components analysis (PCA), autoencoders with cross-validation, and principal preserved components analysis (PPCA)) suggested keeping a small number of dimensions on the basis of the cumulative proportion of variance explained (Fig. S16). We further note that the nature of the first few dimensions was highly consistent across these different analytic methods.
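The PCA check, for example, amounts to inspecting cumulative variance explained as components are added; a sketch with simulated data standing in for the actual ratings:

    import numpy as np
    from sklearn.decomposition import PCA

    # Hypothetical stimuli x scales ratings matrix, with columns z-scored.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(996, 23))
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    pca = PCA().fit(X)
    cumvar = np.cumsum(pca.explained_variance_ratio_)
    for k, cv in enumerate(cumvar[:6], start=1):
        print(f"{k} components: {cv:.1%} cumulative variance explained")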