Heterogeneity of Gain-Loss Attitudes and
Expectations-Based Reference Points
Lorenz Goette∗
University of Bonn
National University of Singapore
Thomas Graeber†
Harvard University
Alexandre Kellogg‡
UC San Diego
Charles Sprenger§
UC San Diego
This Version: May 20, 2020
Abstract
This project examines the role of heterogeneity in gain-loss attitudes for identi-
fying models of expectations-based reference dependence (Kőszegi and Rabin, 2006,
2007) (KR). Different gain-loss attitudes lead to different signs for KR comparative
statics. Failure to account for the known heterogeneity in gain-loss attitudes is a cen-
tral confounding factor challenging prior tests of the KR model conducted under the
assumption of universal loss aversion. We document heterogeneous treatment effects
over gain-loss types in both an initial experiment and an exact replication. Rec-
ognizing heterogeneity over types allows us to both recover the KR model’s central
predictions, and account for inconsistency across prior empirical tests.
JEL classification: D81, D84, D12, D03
Keywords: Reference-Dependent Preferences, Rational Expectations, Personal Equilibrium,
Endowment Effect, Expectations-Based Reference Points
∗ University of Bonn, Institute for Applied Microeconomics, Adenauerallee 24 - 42, 53113 Bonn,
Germany; lorenz.goette@uni-bonn.de. National University of Singapore, Department of Economics,
1 Arts Link, Singapore 117570.
† Harvard University, Department of Economics, Littauer Center, 1805 Cambridge St, MA 02138;
graeber@fas.harvard.edu
‡ Department of Economics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA
92093; alexkellogg@ucsd.edu
§ Department of Economics and Rady School of Management, University of California, San Diego, 9500
Gilman Drive, La Jolla, CA 92093; csprenger@ucsd.edu
1 Introduction
Models of reference-dependent preferences are regarded as a major advance in behavioral
economics, rationalizing a range of observations at odds with the canonical model of ex-
pected utility over final wealth (Kahneman et al., 1990; Camerer et al., 1997; Odean,
1998; Rabin, 2000). Critical to such applications is the formulation of the reference point
around which gains and losses are encoded. A recent literature has examined characteriza-
tions of the reference point based on rational expectations of potential outcomes (Kőszegi
and Rabin, 2006, 2007) (henceforth KR).¹ These expectations-based formulations have the
promise to be readily and broadly applicable, closing the model with a foundation to which
economic tools are already well adapted.
Despite the promise of the KR formulation of the reference point, tests of the theory have
yielded mixed results (see, e.g., Smith, 2019; Ericson and Fuster, 2011; Heffetz and List,
2014; Cerulli-Harms et al., 2019; Abeler et al., 2011; Gneezy et al., 2017). While early ex-
perimental applications in exchange behavior and effort provision showed treatment effects
in line with KR comparative statics, subsequent replications and extensions have shown
more limited or contradictory effects. A plausible interpretation from this literature is that
the KR model of expectations-based reference points lacks a strong empirical foundation.
This manuscript begins by recognizing that these experimental tests of the KR model have
actually been tests of a joint hypothesis. Explicitly, they are posed as tests of the hypothesis
that reference points are derived from expectations. Implicitly, all prior empirical tests
of the KR model have been conducted under a specific assumption for the model’s key
behavioral parameter of gain-loss attitudes: that individuals are universally loss-averse
— weighting losses more than commensurately-sized gains. The conducted tests have
thus investigated the joint hypothesis that reference points are based in expectations and
that all subjects are loss-averse. The mixed empirical evidence noted above can be read
as rejecting expectations-based reference points, or as rejecting the notion that gain-loss
attitudes are universal. We explore the latter interpretation and provide the first evidence
of expectations-based reference dependence after splitting the joint hypothesis into its
component parts.
¹ Our analysis will focus on the formulations of KR. An earlier literature also provided formulations of
reference dependence grounded in rational expectations, but without the equilibrium concepts we analyze
(Bell, 1985; Loomes and Sugden, 1986).
There are two reasons why the “non-universal gain-loss attitudes” interpretation of the prior
evidence is plausible. First, heterogeneity in gain-loss attitudes has a critical influence
on what can be inferred from prior empirical tests of the KR model. KR comparative
statics used as the basis of prior experiments change sign when individuals are gain-loving
— weighting gains more than commensurately-sized losses — rather than loss-averse. If
loss aversion is not a universal characteristic, prior empirical tests have unintentionally
aggregated these different signed effects. Given the nature of this aggregation, which we
discuss in detail in section 2.2, the average treatment effect will not coincide with the
treatment effect of the average preference, and may be of opposite sign to that of the
average preference. Even when the signs of the average treatment effect and that of the
average preference coincide, powering a test for the average treatment effect is challenging.
Our findings indicate that well-powered experiments for detecting average KR treatment
effects require sample sizes around an order of magnitude larger than current practice.
Second, heterogeneity of gain-loss attitudes is a documented phenomenon, becoming more
widely appreciated in recent years. Distributional assessments identify sizable minorities
of gain-loving subjects (see, e.g., Burks et al., 2009; Sprenger, 2015; Erev et al., 2008;
Harinck et al., 2007; Nicolau, 2012; Sokol-Hessner et al., 2009; Chapman et al., 2017, 2018).
Reviewing eight previous experimental studies, Chapman et al. (2018) report a weighted
average of 22 percent gain-loving subjects.² A natural psychology for this documented
gain-loving preference is that individuals enjoy potential gains enough to tolerate potential
losses. When given an opportunity to exchange an endowed object or purchase a negative
expected value gamble, such an individual may exhibit an ‘anti-endowment effect’ or risk
tolerance, rather than the standard behavioral patterns.
² Seven of the eight papers noted elicit gain-loss attitudes through lottery choice. The exception is the
prior version of this manuscript.
One may be tempted to dismiss the minority of gain-loving subjects as reflecting decision
errors or natural variability of response. Such a view is challenged by prior work indi-
cating that individual differences in measured gain-loss attitudes from lottery choices are
predictive of anomalies in labor supply and exchange decisions (see, e.g., Fehr and Goette,
2007; Gächter et al., 2007; Dean and Ortoleva, 2015).³ Our project proceeds under an
opposing view: that gain-loving behavior is indicative of true preferences, and represents
a key dimension of heterogeneity in gain-loss attitudes heretofore ignored by tests of the
KR model.
We implement an experimental study of exchange behavior in an initial sample of 607
subjects and a pre-registered replication sample with a further 417 subjects.⁴ Our experiment
has two stages. In Stage 1, subjects are randomly endowed with one of two objects.
Though no choices are made, subjects are asked to provide a series of preference statements
for both objects given this endowment. Specifically, we ask subjects for their hypothetical
choice between the two objects; how much they ‘want’ each object on a nine-point scale;
and how much they ‘like’ each object on a nine-point scale.
The Stage 1 preference statements allow us to form a taxonomy of gain-loss types, constructed
from a structural model of the preference statements for the endowed and alternative
object.⁵ Our structural model fits a distribution of gain-loss attitudes to the
distribution of choices using standard mixed logit methods, and yields an expected value
of the key gain-loss parameter for every individual given their Stage 1 statements. We provide
complementary, reduced-form evidence to ensure our findings are not driven by our
structural assumptions. We also explore an alternate interpretation of Stage 1 behavior:
that heterogeneity in intrinsic values rather than gain-loss attitudes drives choice. Both
structural estimates yield point predictions for our critical Stage 2 tests of the KR model.
³ Additionally, the view that gain-loving behavior is mere noise generates a null prediction for the central
tests documented in this manuscript.
⁴ In Appendix Section A we provide analysis laid out in our initial draft and pre-registered replication.
This analysis differs from our current presentation of the results as we have adopted a mixed logit
methodology for identifying gain-loss attitudes rather than our prior standard logit methods. Using these
prior methods, the results from our initial study and replication are closely in line. Section 4.3.2 provides
additional discussion and Table 6, which replicates the results using our current methodology, as well.
⁵ An alternative design would attempt to measure gain-loss attitudes either through small stakes risk
aversion or some other choice. Such tests would face a number of challenges, requiring both additional
assumptions (e.g., about the correlation between intrinsic utilities and gain-loss parameters across contexts)
and additional experimental choices. Recognizing both the polluting potential of such choices and the
challenge of modeling the full body of experimental behavior through the lens of the KR model (Sprenger,
2015), we believe there is substantial value in our method. In section 4.2, we show predictive power for
our measure of gain-loss attitudes and exchange behavior in a standard exchange paradigm. Subjects we
classify as loss-averse exhibit an endowment effect in their actual choices, and subjects we classify as
gain-loving exhibit an anti-endowment effect. Of course, failure to correctly categorize gain-loss types should
lead to a lack of predictive validity in Stage 2 of the experiment, working against these and other identified
results.
In Stage 2, subjects are endowed with one of two objects, both completely different from
those used in Stage 1, and are randomly assigned to one of two conditions. One group
of subjects is assigned to Condition B, a baseline endowment effect condition, where they
decide whether they would like to exchange their object for the alternative. Another group
of subjects is assigned to Condition F, where they decide whether to exchange their object
under a probabilistic forced exchange mechanism akin to Cerulli-Harms et al. (2019). With
probability 0.5, regardless of their decision, exchange will be forced.
Under the KR model, loss-averse subjects should be more willing to exchange when prob-
abilistically forced in Condition F than in Condition B. Intuitively, exchanging eliminates
the potential loss associated with attempting to retain the endowed object in Condition F
and being forced to exchange anyway. A loss-averse subject who is unwilling to exchange
in Condition B may be willing to do so in Condition F. In contrast, gain-loving subjects
should exhibit the opposite pattern, growing less willing to exchange in Condition F relative
to Condition B. Intuitively, not exchanging in Condition F creates potential gains and
losses associated with probabilistic forced exchange, and for a gain-loving subject the former
outweigh the latter.
A gain-loving subject who is willing to exchange in Condition B may be unwilling to do so
in Condition F. We examine heterogeneous treatment effects in Stage 2 over the gain-loss
attitudes measured and estimated in Stage 1. Stage 2 behavior is also analyzed under
the interpretation that Stage 1 preference statements are driven by heterogeneity in utility
rather than gain-loss attitudes.
We document three key findings. First, on average subjects do appear to prefer their ran-
domly endowed object in Stage 1, indicating an endowment effect in preference statements.
Fifty-seven percent of subjects state they would choose their randomly endowed object,
and two-thirds provide weakly higher wanting and liking ratings for their endowed object.
These preference statements are highly correlated with each other: all pairwise correla-
tions exceed 0.7. Despite the regularity of an endowment effect in preference statements,
choices also exhibit marked heterogeneity. Roughly 25 percent of subjects’ hypothetical
choice, wanting rating, and liking rating all indicate an anti-endowment effect. Our primary
structural model interprets the distribution of choices as driven by heterogeneous
gain-loss attitudes. This model identifies a distribution of the gain-loss parameter, $\lambda$, with
loss aversion, $\lambda > 1$, on average but substantial heterogeneity. Indeed, the fitted distribution
estimates 38 percent of individuals as gain-loving, $\lambda < 1$. The alternative structural
model, attributing the same distribution of behavior to heterogeneity in intrinsic utili-
ties with homogeneous gain-loss attitudes, indicates universal loss aversion and substantial
heterogeneity in intrinsic utilities. Interestingly, the model with heterogeneous gain-loss
attitudes has superior penalized likelihood, indicating better in-sample fit.
Second, in Stage 2 a substantial endowment effect exists in Condition B. Thirty-eight
percent of subjects choose to exchange their randomly endowed object, which deviates
significantly from the neoclassical benchmark of 50 percent exchange. On average, prob-
abilistic forced exchange has a null effect, reducing exchange by 0.4 percentage points in
Condition F. The aggregate null effect in our sample of over 1000 observations contradicts
the predictions of the KR model under the standard assumption of universal loss aversion.
Third, Stage 2 behavior differs substantially by previously measured gain-loss attitudes.
Loss-averse subjects are markedly less willing to exchange than gain-loving subjects in
Condition B. This intuitive relationship between $\lambda$ and behavior in a standard exchange
paradigm validates our methodology for identifying gain-loss attitudes. In Condition F
the relationship between gain-loss attitudes and exchange behavior reverses, leading to a
heterogeneous treatment effect over the previously measured types. Loss-averse subjects
are more willing to exchange in Condition F relative to Condition B, while gain-loving
subjects are less so. The sign and magnitude of our heterogeneous treatment effects are
closely in line with the predictions of the KR model. In contrast, interpreting Stage 1
behavior as driven by heterogeneous utilities rather than heterogeneous gain-loss attitudes
delivers zero predictive power.⁶
The findings described above also hold for alternate methodologies for identifying gain-
loss attitudes. In particular, we provide a reduced-form exercise which infers gain-loss
attitudes based upon residualized Stage 1 behavior. There again, individual differences in
gain-loss attitudes closely relate to differences in Stage 2 treatment effects. Additionally,
the findings provided for our joint data set replicate in both of our samples. Appendix A
details exactly our pre-registered analysis, which featured centrally in prior drafts of this
manuscript, and shows close correspondence in the heterogeneous treatment effects in our
initial and replication studies.⁷ Table 6 in the main text also shows such reproducibility.
Our identified heterogeneity of gain-loss attitudes, reflected in heterogeneous treatment
effects over gain-loss types, carries important implications for interpreting the body of
experimental evidence on expectations-based reference dependence. We show that even
with over 1000 collective observations, we are dramatically underpowered to identify the
theoretical average treatment effect in our experiment. The theoretical average treatment
effect is around 5.9 percentage points, a value that we calculate would require a sample
size of around 2250 observations to estimate with 80 percent power. Prior experimental
tests focused on average treatment effects are likely to be similarly underpowered.
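The order of magnitude of this sample-size requirement can be checked with a textbook power calculation for a difference in two proportions. The sketch below is not the calculation used in the paper, whose method is not described in this section. It assumes a two-sided test at the 5 percent level, equal-sized arms, the 38 percent Condition B exchange rate reported above as the baseline, and the 5.9 percentage point theoretical effect; the function name `n_per_arm` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_base, effect, alpha=0.05, power=0.80):
    """Textbook per-arm sample size for a two-sided test of two proportions."""
    p_treat = p_base + effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    variance = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Assumed inputs: 38 percent exchange in Condition B, 5.9-point theoretical effect.
n_arm = n_per_arm(p_base=0.38, effect=0.059)
total_n = 2 * n_arm
print(n_arm, total_n)
```

Under these assumptions the total lands in the low two thousands, the same range as the figure quoted in the text; a different baseline rate or test would shift it somewhat.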
Our results add two key points to the discussion on the source of reference points and the
nature of reference-dependent preferences. First, given inconsistent findings across prior
studies, our null aggregate effect in a sample of over 1000 subjects, and our theoretical
development, heterogeneity in gain-loss attitudes appears to be an issue of first order
importance precluding strong inferences from prior work. Mixed evidence on the KR model
is likely not driven by a failure of the expectations-based formulation of reference points,
but rather by a failure of the second component of the joint hypothesis inherent to this
prior work: that gain-loss attitudes are universal. We show, in a simple and reproducible
way, that the predictions of the expectations-based KR model are reliably recovered once
heterogeneity in gain-loss attitudes is accounted for.
⁶ Such lack of predictive power is intuitive given that subjects are endowed with different objects in
Stage 2 than in Stage 1 and there is random assignment to conditions.
⁷ We are indebted to an anonymous referee for suggesting the current path of structural analysis. This
analysis differs from our prior presentation of the results, as we have adopted a mixed logit methodology
for identifying gain-loss attitudes rather than our prior standard logit methods.
Second, we add a key observation on the heterogeneity of gain-loss attitudes to a grow-
ing literature on the topic. Chapman et al. (2018) indicate eight prior studies with a
documented distribution of gain-loss attitudes, only one of which is measured outside of
lottery choice: a prior version of this paper. Ours are the first findings to document the
distribution of gain-loss attitudes in exchange settings and predictive validity of resulting
individual measures. In our exchange setting, we document an average attitude of loss
aversion, but a sizable proportion of the distribution, 38 percent, exhibits gain lovingness.
This proportion of gain-loving subjects somewhat exceeds estimates from risk experiments
in the lab, but falls below the field estimates of Chapman et al. (2018). Future work
providing further documentation and evaluation of the heterogeneity in gain-loss attitudes
across domains is equipped with an initial observation from exchange behavior.
The paper proceeds as follows. In Section 2, we set the theoretical background and derive
behavioral predictions. Sections 3 and 4 present the experimental design and results,
respectively. Section 5 concludes.
2 Theoretical Considerations and Design Guidance
We examine the predictions of the KR model in simple exchange settings with two objects,
recognizing heterogeneity of gain-loss attitudes. The theoretical development hews closely
to our experimental design, providing motivation for our analysis. We contrast two con-
ditions: a standard exchange paradigm, termed Condition B, where subjects are endowed
with an object and decide whether to exchange or not; and a probabilistic forced exchange
paradigm, termed Condition F, identical to Condition B except that, with probability 0.5 and
regardless of choice, exchange will be forced. We show that loss-averse subjects should grow
more willing to exchange in Condition F relative to Condition B. In contrast, gain-loving
subjects should grow less willing to exchange in Condition F relative to Condition B.
There is a central intuition for the heterogeneous response to probabilistic forced exchange.
When attempting not to exchange their endowment in Condition F, an individual faces
the potential of having this object taken from them and exchanged anyway. A loss-averse
individual disproportionately dislikes the sensation of potential loss and so may choose to
exchange to avoid the possible loss. In contrast, a gain-loving individual disproportionately
likes the sensation of potential gain and so may choose not to exchange to maintain the
possible gain.
Consider a two-dimensional utility function over two objects of interest, object $X$ and
object $Y$. Let $\mathbf{c} = (m_X, m_Y)$ and $\mathbf{r} = (r_X, r_Y)$ represent vectors of
intrinsic utility and reference utility, respectively. The KR model specifies a utility function
with two components, intrinsic utility, $m(\mathbf{c}) \equiv m_X + m_Y$, and gain-loss utility,
$n(\mathbf{c}|\mathbf{r}) \equiv n_X(m_X|r_X) + n_Y(m_Y|r_Y) \equiv \mu(m_X - r_X) + \mu(m_Y - r_Y)$,
with separability across consumption dimensions. Let $m_X \in \{0, X\}$ and $m_Y \in \{0, Y\}$
stand for both the outcome and the corresponding intrinsic utility of owning zero or one unit
of object X, and zero or one unit of object Y, respectively. Overall utility is described by
\[
\begin{aligned}
u(\mathbf{c}|\mathbf{r}) = u(m_X, m_Y | r_X, r_Y) &= m_X + n_X(m_X|r_X) + m_Y + n_Y(m_Y|r_Y) \\
&= m_X + \mu(m_X - r_X) + m_Y + \mu(m_Y - r_Y),
\end{aligned}
\]
where
\[
\mu(z) =
\begin{cases}
\eta z & \text{if } z \geq 0 \\
\eta \lambda z & \text{if } z < 0.
\end{cases}
\]
In this piece-wise linear gain-loss function, the parameter $\eta$ captures the magnitude of
changes relative to the reference point, and $\lambda$ captures gain-loss attitudes. If $\lambda > 1$, the
individual is loss-averse, experiencing losses more than commensurately-sized gains. If
$\lambda < 1$, the individual is gain-loving, experiencing gains more than commensurately-sized
losses.
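As a concrete illustration of the functional form just defined, the following minimal sketch evaluates overall utility for an agent who expected to keep object X. The numerical values ($X = Y = 1$, $\eta = 1$, and the two values of $\lambda$) are illustrative assumptions rather than estimates from the paper, and the Python names `mu` and `u` simply mirror the symbols above.

```python
def mu(z, eta=1.0, lam=2.0):
    """Piecewise-linear gain-loss function: eta*z for gains, eta*lam*z for losses."""
    return eta * z if z >= 0 else eta * lam * z


def u(c, r, eta=1.0, lam=2.0):
    """Utility of consumption c = (m_X, m_Y) relative to reference r = (r_X, r_Y)."""
    return sum(m + mu(m - ref, eta, lam) for m, ref in zip(c, r))


X, Y = 1.0, 1.0  # illustrative intrinsic utilities

# Expected to keep X and does keep X: no gain-loss term, utility equals X.
keep_as_expected = u((X, 0.0), (X, 0.0))

# Expected to keep X but ends up with Y: a gain in Y and a loss in X.
swap_loss_averse = u((0.0, Y), (X, 0.0), lam=2.0)  # Y + eta*Y - eta*lam*X
swap_gain_loving = u((0.0, Y), (X, 0.0), lam=0.5)

print(keep_as_expected, swap_loss_averse, swap_gain_loving)
```

With equal intrinsic values, the surprise swap is worth less than keeping X for the loss-averse parameterization and more for the gain-loving one, which is the asymmetry the later comparative statics build on.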
2.1 Determination of the Reference Point in Exchange Behavior
In the KR model, unless exogenously determined, the vector $\mathbf{r}$ is established as part of a
consistent forward-looking plan for behavior. The KR model posits a reference-dependent
expected utility function $U(F|G)$, taking as input a distribution $F$ over consumption
outcomes, $\mathbf{c}$, which are valued relative to a distribution $G$ of reference points, $\mathbf{r}$. That is,
\[
U(F|G) = \int \int u(\mathbf{c}|\mathbf{r}) \, dF(\mathbf{c}) \, dG(\mathbf{r}).
\]
A Personal Equilibrium is a situation where, given that the decision-maker expects as
a referent some distribution $F$, they indeed prefer $F$ as a consumption distribution over
all alternative consumption distributions, $F'$. Ex-ante optimal behavior has to accord
with expectations of that behavior. Formally, given a choice set, $\mathcal{D}$, of lotteries, $F$, over
consumption outcomes $\mathbf{c} = (m_X, m_Y)$, KR’s Personal Equilibrium states the following:
Personal Equilibrium (PE): A choice $F \in \mathcal{D}$ is a personal equilibrium if
\[
U(F|F) \geq U(F'|F) \quad \forall F' \in \mathcal{D}.
\]
Regardless of endowment, if object X is to be chosen in a PE, then $\mathbf{r} = (X, 0)$, and if object
Y is to be chosen in a PE, then $\mathbf{r} = (0, Y)$.
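For finite lotteries the double integral reduces to a probability-weighted double sum, which makes the PE condition easy to check directly. The sketch below is a minimal illustration restricted to the two pure choices; the representation of a lottery as a list of (probability, outcome) pairs, the helper name `personal_equilibria`, and all parameter values are assumptions made for the example.

```python
def mu(z, eta, lam):
    """Piecewise-linear gain-loss function."""
    return eta * z if z >= 0 else eta * lam * z


def u(c, r, eta, lam):
    """Utility of outcome c relative to reference outcome r."""
    return sum(m + mu(m - ref, eta, lam) for m, ref in zip(c, r))


def U(F, G, eta=1.0, lam=2.0):
    """Discrete analogue of the double integral; F, G are lists of (prob, outcome)."""
    return sum(p * q * u(c, r, eta, lam) for p, c in F for q, r in G)


def personal_equilibria(D, eta=1.0, lam=2.0):
    """Names of choices F in D with U(F|F) >= U(F'|F) for every F' in D."""
    return [name for name, F in D.items()
            if all(U(F, F, eta, lam) >= U(Fp, F, eta, lam) for Fp in D.values())]


# Equal intrinsic values and a loss-averse agent (illustrative numbers).
D_equal = {"keep X": [(1.0, (1.0, 0.0))], "take Y": [(1.0, (0.0, 1.0))]}
pe_loss_averse = personal_equilibria(D_equal, lam=2.0)

# X valued slightly above Y and no gain-loss asymmetry (lam = 1).
D_tilted = {"keep X": [(1.0, (1.2, 0.0))], "take Y": [(1.0, (0.0, 1.0))]}
pe_neutral = personal_equilibria(D_tilted, lam=1.0)

print(pe_loss_averse, pe_neutral)
```

In the first case both pure choices satisfy the PE condition, a small instance of multiple personal equilibria; in the second, only the intrinsically preferred object survives.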
Given the potential for the multiplicity of PE selections, the KR model is constructed
with a notion of equilibrium refinement, Preferred Personal Equilibrium (PPE), and an
alternate non-PE criterion, Choice Acclimating Personal Equilibrium (CPE). In both of
these constructs, ex-ante utility is used as a basis for selection and, hence, for making more
narrow predictions. For ease of explication, we focus our analysis on the CPE criterion.
In Appendix B we provide theoretical analyses under the PE and PPE approaches. Importantly,
all three formulations share common comparative statics, and therefore make
qualitatively similar predictions, for our KR test.
Given a choice set, $\mathcal{D}$, of lotteries, $F$, over consumption outcomes $\mathbf{c} = (m_X, m_Y)$,
Choice-Acclimating Personal Equilibrium states the following:
Choice Acclimating Personal Equilibrium (CPE): A choice $F \in \mathcal{D}$ is a choice-acclimating
personal equilibrium if
\[
U(F|F) \geq U(F'|F') \quad \forall F' \in \mathcal{D}.
\]
Under CPE, an individual selects between options like $[\mathbf{c}, \mathbf{r}] = [(X, 0), (X, 0)]$ and
$[\mathbf{c}, \mathbf{r}] = [(0, Y), (0, Y)]$.⁸
2.1.1 Manipulating r: Probabilistic Forced Exchange
The CPE concept noted above requires consistency between the distributions of $\mathbf{c}$ and
$\mathbf{r}$. We consider a baseline simple exchange condition, Condition B, for an individual
endowed with object $X$. We focus on the choice set consisting of pure strategy choices
$\mathcal{D} = \{(X, 0), (0, Y)\}$, with the first element reflecting choosing not to exchange and the
second choosing to exchange.⁹ In this setting, there are two potential CPE selections,
$[\mathbf{c}, \mathbf{r}] = [(X, 0), (X, 0)]$ and $[\mathbf{c}, \mathbf{r}] = [(0, Y), (0, Y)]$. The individual can support
not exchanging, $[\mathbf{c}, \mathbf{r}] = [(X, 0), (X, 0)]$, in a CPE if
\[
U(X, 0 | X, 0) \geq U(0, Y | 0, Y),
\]
which, under our functional form assumptions, becomes
\[
X \geq Y. \tag{1}
\]
⁸ Note that a selection need not be PE in order to be CPE. The alternate concept, PPE, requires $F$
and $F'$ to be PE, rather than simply elements of $\mathcal{D}$.
⁹ In Appendix B, we conduct the analysis with $\mathcal{D}$ including all mixtures of exchanging and not
exchanging and reach quite similar results.
Figure 1, Panel A graphs the Condition B CPE cutoff, $X_B = Y$, the smallest value of $X$
at which the individual can support not exchanging, which is constant for all values of the
gain-loss parameter, $\lambda$.
The value $X_B = Y$ implies that choice in Condition B is governed only by intrinsic utility.
This represents the inability of KR-CPE to rationalize the standard endowment effect. This
prediction is not shared by the PE formulation, wherein the value of gain-loss attitudes
tunes the set of permissible PE choices and can lead to an endowment effect (see Appendix
B). Nonetheless, the critical comparative static shared by both formulations is delivered
by comparing exchange behavior in this baseline Condition B with probabilistic forced
exchange.
[Figure 1. Panel A: Theoretical CPE Thresholds. Horizontal axis: gain-loss parameter, $\lambda$;
vertical axis: $X$; curves $X_B$ and $X_F$; regions labeled Gain Loving and Loss Averse, and
"CPE No Exchange in B & F", "CPE No Exchange in B, CPE Exchange in F", "CPE Exchange in B & F",
"CPE Exchange in B, CPE No Exchange in F". Panel B: Predicted Treatment Effects. Horizontal
axis: gain-loss parameter, $\lambda$; vertical axis: treatment effect; series CPE-Logistic and
CPE-Normal.]
Figure 1: Gain-Loss Attitudes, Theoretical Thresholds, and Treatment Effects
Notes: Panel A: CPE cutoff values for agent endowed with $X$, $Y = 1$ and $\eta = 1$. For $X \geq X_B = Y$,
individuals can support not exchanging as a CPE in a baseline exchange environment (Condition B).
For $X \geq X_F = \frac{1 + 0.5\eta(\lambda - 1)}{1 + 0.5\eta(1 - \lambda)} Y$, individuals can support not
exchanging as a CPE in a forced exchange environment (Condition F). Panel B: Simulated treatment
effects for the probability of exchange plotted by $\lambda$ with $Y/X = 1$, $\eta = 1$ under logistic or
normal probability distributions.
Now, consider an environment of probabilistic forced exchange, Condition F. With probability
0.5, the agent, assumed endowed with $X$, will be forced to exchange $X$ for $Y$ regardless
of their choice. If the individual wishes to retain their object, they are subject to a
stochastic reference point, as with probability 0.5 their object will be exchanged. Now, the
potential selections for someone endowed with $X$ are $\mathcal{D} = \{0.5(X, 0) + 0.5(0, Y), (0, Y)\}$,
with the first element reflecting attempting not to exchange and the second reflecting
exchange, as before. They can support attempting not to exchange as a CPE if
\[
U(0.5(X, 0) + 0.5(0, Y) \,|\, 0.5(X, 0) + 0.5(0, Y)) \geq U(0, Y | 0, Y),
\]
which, under our functional form assumptions, becomes
\[
\begin{aligned}
0.5X + 0.5Y + 0.25\eta(1 - \lambda)(X + Y) &\geq Y \\
X &\geq \frac{1 + 0.5\eta(\lambda - 1)}{1 + 0.5\eta(1 - \lambda)} Y.
\end{aligned}
\]
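The left-hand side of the first inequality can be recovered by enumerating the four equally likely pairings of realized outcome and reference outcome under $F = 0.5(X, 0) + 0.5(0, Y)$. The expansion below is a worked step that uses only the functional forms defined in this section; it is not an additional result.

```latex
\begin{aligned}
U(F|F) &= \tfrac{1}{4}\,u(X,0\,|\,X,0) + \tfrac{1}{4}\,u(X,0\,|\,0,Y)
        + \tfrac{1}{4}\,u(0,Y\,|\,X,0) + \tfrac{1}{4}\,u(0,Y\,|\,0,Y) \\
       &= \tfrac{1}{4}\,X + \tfrac{1}{4}\,\bigl(X + \eta X - \eta\lambda Y\bigr)
        + \tfrac{1}{4}\,\bigl(Y + \eta Y - \eta\lambda X\bigr) + \tfrac{1}{4}\,Y \\
       &= 0.5X + 0.5Y + 0.25\,\eta(1-\lambda)(X+Y).
\end{aligned}
```

Setting $U(F|F) \geq Y$ and collecting terms gives $X\,[1 + 0.5\eta(1-\lambda)] \geq Y\,[1 + 0.5\eta(\lambda-1)]$. Dividing through yields the stated threshold provided $1 + 0.5\eta(1-\lambda) > 0$, that is, $\lambda < 3$ when $\eta = 1$.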
The manipulation of probabilistic forced exchange changes the CPE threshold from $X_B = Y$
to $X_F = \frac{1 + 0.5\eta(\lambda - 1)}{1 + 0.5\eta(1 - \lambda)} Y$. Figure 1, Panel A illustrates the
changing CPE cutoff values associated with not exchanging. In Condition F, the individual can
support attempting to retain $X$ in CPE on the basis of both intrinsic utility and gain-loss attitudes.
The gain-loss parameter, $\lambda$, tunes precisely how behavior should change between Conditions
B and F. Figure 1, Panel A is partitioned into four regions. Two critical regions of changing
CPE choice are identified. For $X > Y$ and $\lambda > 1$ (with $X$ below the Condition F cutoff
$X_F$), it is CPE to not exchange in Condition B and CPE to exchange in Condition F. This
region has been the basis of prior experimental tests under the assumption of universal loss
aversion as such individuals become more willing to exchange when probabilistically forced.
Ignored to date is the region where $X < Y$ and $\lambda < 1$ (with $X$ above $X_F$). In this region,
it is CPE to exchange in Condition B and CPE to not exchange in Condition F. In contrast to
the loss-averse prediction, such gain-loving individuals become less willing to exchange when
probabilistically forced.
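The two switching regions can be reproduced from the closed-form cutoffs. The sketch below is a minimal illustration under the pure-strategy choice set used in this section; the function names and the numerical values of $X$, $Y$, and $\lambda$ are assumptions chosen for the example, and the Condition F formula is only applied where its denominator is positive.

```python
def x_threshold_b(Y):
    """Condition B cutoff: not exchanging is CPE when X >= Y."""
    return Y


def x_threshold_f(Y, lam, eta=1.0):
    """Condition F cutoff; valid while 1 + 0.5*eta*(1 - lam) > 0."""
    return (1 + 0.5 * eta * (lam - 1)) / (1 + 0.5 * eta * (1 - lam)) * Y


def cpe_choices(X, Y, lam, eta=1.0):
    """CPE selection in Conditions B and F for an agent endowed with X."""
    keep_b = X >= x_threshold_b(Y)
    keep_f = X >= x_threshold_f(Y, lam, eta)
    return ("no exchange" if keep_b else "exchange",
            "no exchange" if keep_f else "exchange")


# Loss-averse agent who mildly prefers the endowment (illustrative values).
loss_averse = cpe_choices(X=1.2, Y=1.0, lam=2.0)

# Gain-loving agent who mildly prefers the alternative (illustrative values).
gain_loving = cpe_choices(X=0.8, Y=1.0, lam=0.5)

print(loss_averse, gain_loving)
```

The first agent moves from not exchanging to exchanging when exchange may be forced, and the second moves in the opposite direction, matching the two regions described above.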
2.2 Heterogeneity in Gain-Loss Attitudes and Theoretical Treatment Effects
Manipulating probabilistic forced-exchange carries clear value for testing the KR model.
The altered thresholds for supporting exchange in CPE illustrated in Figure 1, Panel A
form the basis of experimental tests. Importantly, to date, these tests have been conducted
under the standard assumption of loss aversion, $\lambda > 1$. Figure 1, Panel A makes clear that
these thresholds are differentially altered for different values of $\lambda$. In this subsection, we
evaluate theoretical treatment effects when $\lambda > 1$ or $\lambda < 1$. We demonstrate different
signed treatment effects around the value of $\lambda = 1$, and the corresponding challenge of
aggregating treatment effects.
In order to make predictions on the behavioral response to probabilistic forced exchange,
we map from the CPE thresholds to the probability of making a specific selection. We
simulate behavior assuming that $X$ and $Y$ have equal intrinsic utility, $Y/X = 1$, $\eta = 1$,
and the CPE utilities are followed probabilistically subject to a specific logit choice model.¹⁰
That is, an individual chooses to exchange in Condition B with probability
\[
\begin{aligned}
\text{Prob}(\text{Exchange})_B &= \text{Prob}(Y > X) = \text{Prob}(Y/X - 1 > 0) \\
&= \text{logistic}(0) = 0.5.
\end{aligned}
\]
Similarly, the individual chooses to exchange in Condition F with probability
\[
\begin{aligned}
\text{Prob}(\text{Exchange})_F &= \text{Prob}(Y > 0.5X + 0.5Y + 0.25\eta(1 - \lambda)(X + Y)) \\
&= \text{Prob}(0.5(Y/X - 1) + 0.25\eta(\lambda - 1)(1 + Y/X) > 0) \\
&= \text{logistic}(0.5(\lambda - 1)).
\end{aligned}
\]
And the treatment effect is simulated as
\[
\begin{aligned}
TE &= \text{Prob}(\text{Exchange})_F - \text{Prob}(\text{Exchange})_B \\
&= \text{logistic}(0.5(\lambda - 1)) - \text{logistic}(0).
\end{aligned}
\]
¹⁰ In our actual empirical results, we estimate the value of $Y/X$ rather than fix it by assumption. We
maintain $\eta = 1$ throughout.
Figure 1, Panel B graphs this treatment effect against the value of $\lambda$ under the assumptions
$\eta = 1$ and $Y/X = 1$.¹¹ The theoretical simulated treatment effect is negative for $\lambda < 1$,
positive for $\lambda > 1$, and is generally concave in $\lambda$. Figure 1, Panel B also provides a
theoretical benchmark under a normal probability distribution rather than the logistic,
highlighting the robustness of the non-linearity prediction.
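The simulated treatment effect is simple enough to tabulate directly. The sketch below evaluates the expression above on a small grid of $\lambda$ values under both a logistic and a standard normal distribution function. The grid, the function names, and the use of a unit-scale normal are assumptions made for illustration; the scale used for the CPE-Normal series in Figure 1 is not stated in this section.

```python
from math import exp
from statistics import NormalDist


def logistic(v):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + exp(-v))


def treatment_effect(lam, eta=1.0, y_over_x=1.0, cdf=logistic):
    """Prob(Exchange) in Condition F minus Prob(Exchange) in Condition B."""
    index_b = y_over_x - 1
    index_f = 0.5 * (y_over_x - 1) + 0.25 * eta * (lam - 1) * (1 + y_over_x)
    return cdf(index_f) - cdf(index_b)


normal_cdf = NormalDist().cdf  # unit-scale normal, an illustrative choice

lams = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
te_logistic = [treatment_effect(lam) for lam in lams]
te_normal = [treatment_effect(lam, cdf=normal_cdf) for lam in lams]

for lam, tl, tn in zip(lams, te_logistic, te_normal):
    print(f"lambda={lam:.1f}  logistic TE={tl:+.3f}  normal TE={tn:+.3f}")
```

Both series are negative below $\lambda = 1$, zero at $\lambda = 1$, and positive above it, which is the sign pattern described in the text.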
The apparent concavity of simulated treatment effects in $\lambda$ implies a substantial challenge in
the aggregation of treatment effects. Not only do treatment effects change sign at $\lambda = 1$, but
gain-loving subjects can have an outsized impact on identified average effects. Loss aversion
on average therefore does not guarantee positive aggregate treatment effects in exchange
experiments testing the KR model. Heterogeneity is a confound of first-order importance
plaguing prior experiments in this vein. Any test of KR must account for heterogeneity in
gain-loss attitudes to credibly test the underlying expectations-based reference-dependent
mechanism. Motivated by this point, our study combines the experimental manipulation
of probabilistic forced exchange with a prior measurement of gain-loss attitudes.
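The aggregation problem can be made concrete with a two-type population. The sketch below compares the average of individual treatment effects with the treatment effect evaluated at the average $\lambda$, using the logistic formula from this subsection with $\eta = 1$ and $Y/X = 1$. The 38 percent gain-loving share echoes the estimate reported in the introduction, but the two $\lambda$ values are purely hypothetical and are not the paper's estimated distribution.

```python
from math import exp


def logistic(v):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + exp(-v))


def treatment_effect(lam):
    """Simulated effect with eta = 1 and Y/X = 1: logistic(0.5*(lam - 1)) - 0.5."""
    return logistic(0.5 * (lam - 1.0)) - 0.5


# Hypothetical two-type population: (population share, lambda).
types = [(0.38, 0.2), (0.62, 3.0)]

mean_lam = sum(share * lam for share, lam in types)
average_te = sum(share * treatment_effect(lam) for share, lam in types)
te_of_average = treatment_effect(mean_lam)

print(f"mean lambda          = {mean_lam:.3f}")
print(f"average of TEs       = {average_te:.4f}")
print(f"TE at average lambda = {te_of_average:.4f}")
```

In this example the population is loss-averse on average, yet the average treatment effect falls short of the effect implied by the average preference because the gain-loving type contributes a negative effect. Other hypothetical distributions would give different gaps.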
3 Experimental Design and Procedures
Our design comprises two stages. In Stage 1, a taxonomy of gain-loss types is created.
In Stage 2, subjects are assigned to either a standard exchange study or one with proba-
bilistic forced exchange. Stage 1 measures of gain-loss attitudes can then be connected to
Stage 2 behavior. Figure 2 illustrates the experimental order of events.
11
Appendix B provides the same analysis under PE and PPE.
[Figure 2 diagram. Stage 1: random endowment (Object 1 or Object 2); elicitation of gain-loss attitudes; randomized exchange (p = 0.5). Stage 2: random endowment (Object 3 or Object 4); subject's choice of voluntary exchange. Condition Baseline: object exchanged if desired. Condition Forced exchange: object exchanged if the subject chose exchange; object exchanged with probability p = 0.5 if the subject chose no exchange.]
Figure 2: Timeline of Laboratory Experiment
Notes: The figure displays the course of events in both treatment conditions, Condition B(aseline) and Condition F(orced exchange).
3.1 Stage 1: Measuring Gain-Loss Attitudes
Procedures.
The experimenter welcomed the participants in a small presentation room
and informed them that the study would consist of two stages. At each seat was a card
with a number (placed face down). Then, without further explanation, the experimenter
projected on the wall two equally-sized pictures of the respective Stage 1 objects for that
session, along with the description and two short bullet points on the characteristics of
each product. The exact information presented to subjects is reproduced in Appendix E
in German and translated to English.
After allowing sufficient time (three minutes) to study the projected information, the experimenter
asked subjects to turn over the card in front of them and move to the cubicle
with the corresponding number in the adjacent computer laboratory. In their private cubicle,
which was separated and not visible from the outside, subjects would find one of the
two presented objects. Computer instructions then informed the subjects that they possessed
the object in front of them, and that they were free to inspect it more closely.
After three minutes allotted for inspection of the object, we asked subjects three questions.
First, for each object subjects were asked “How much do you like this product?” with
response scales ranging from 0=“Not at all” to 8=“Very much”. Second, for each object
they were asked “How much would you want to have this product?” on the same response
scales. Third, they were asked “If you had to choose one of the objects, which one would
you prefer to keep?”, and were asked to provide a hypothetical choice between the two
objects. These three preference statements are the raw data upon which our structural
estimates of gain-loss attitudes are constructed. Given that subjects are endowed with
one of the objects, our structural estimates of gain-loss attitudes assume this exogenous
endowment is their reference point when providing these preference statements (see section
4.1.1 for further discussion).
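As a minimal sketch of how these three preference statements might be recorded for one subject, consider the following. The class and field names are hypothetical, and the ordinal comparison mirrors only the descriptive summary of ratings used for the Stage 1 histograms in Section 4.1; it is not the structural estimation.

```python
from dataclasses import dataclass


@dataclass
class Stage1Statements:
    """Hypothetical record of one subject's Stage 1 answers (names are illustrative)."""
    liking_endowed: int        # 0 = "Not at all" ... 8 = "Very much"
    liking_alternative: int
    wanting_endowed: int
    wanting_alternative: int
    would_keep_endowed: bool   # hypothetical choice between the two objects


def rating_direction(endowed: int, alternative: int) -> str:
    """Ordinal comparison of two ratings on the 0-8 scale."""
    for rating in (endowed, alternative):
        if not 0 <= rating <= 8:
            raise ValueError("ratings must lie on the 0-8 scale")
    if endowed > alternative:
        return "endowed higher"
    if endowed < alternative:
        return "alternative higher"
    return "equal"


subject = Stage1Statements(6, 4, 5, 5, True)
print(rating_direction(subject.liking_endowed, subject.liking_alternative))
print(rating_direction(subject.wanting_endowed, subject.wanting_alternative))
```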
This paper sidesteps several potential measurement issues within the KR model. First,
our Stage 1 design exogenously endows subjects with objects and elicits preference statements
under this fixed endowment without any discussion of actual exchange between the
objects or future exchange opportunities. In the case of an exogenous endowment that
cannot be expected to change, the KR model coincides exactly with the standard model
of reference-dependent preferences with a fixed reference point. This allows us to elicit
gain-loss attitudes under our exogenous endowment. See Kőszegi and Rabin (2006) for
additional discussion of this point in the particular context of the endowment effect. Had
we conducted an alternate design without such exogenous endowments or with salient
discussion of exchange, the reference point would plausibly not be fixed, challenging our
assumptions for measurement of gain-loss attitudes.
Second, there is a challenge associated with multiple choices and measurement within the
KR model. In principle, all experimental choices should be considered part of a subject’s
strategy, with the suite of experimental choices forming a consistent plan of action (for
related discussion, see Sprenger, 2015). Instead, we measure gain-loss attitudes in Stage
1 with non-choice responses. Measuring gain-loss attitudes with choice would alter the
analysis considerably, as the subject may be assumed to take any Stage 1 decision under
a consistent forecast of their Stage 2 decision, and the choice set would consist of all
potential two-period plans. Measurement of gain-loss parameters using choice responses
would require attention to interactions across stages of the experiment and an accounting of
subsequent behavior in generating a taxonomy of types.
12
We verify that our non-choice
taxonomy has close concurrence with actual choice by drawing links between estimated
gain-loss attitudes in Stage 1 and subsequent choices made in Stage 2 for both conditions.
We further ensure that the observed correlations are not driven by functional form
assumptions by providing a companion reduced-form exercise.
Stage 1 of our experiment also featured one additional element of random variation: an experience
with probabilistic exchange. After subjects provided their preference statements,
the computer instructions announced that the experimenter would randomly draw a number
between 1 and 20 using a rotating lottery drum placed on a table in the middle of the
room. Half of the subjects were informed that the object in front of them would be replaced
by the alternative object if a number between 1 and 10 was drawn. Instructions for the
other half read that this replacement would only take place if a number between 11 and
20 was drawn.
13
The experimenter drew the number in such a way that both the lotto device
containing the 20 balls and the drawn number were visible from every cubicle. Immediately
following the draw, and without further comment, the experimenter replaced objects as
dictated by the drawn number. As noted above, it is critical that the random
replacement procedure was introduced after all preference statements were elicited under the
exogenously endowed object.
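The counterbalanced replacement rule can be sketched as follows. The session size of 16, the object labels, and the rule labels "low" and "high" are assumptions for illustration. The sketch shows that, whatever number is drawn, exactly half of the subjects endowed with each object have it replaced.

```python
from collections import Counter


def is_replaced(rule: str, draw: int) -> bool:
    """Rule 'low': replaced if the draw is 1-10. Rule 'high': replaced if the draw is 11-20."""
    if not 1 <= draw <= 20:
        raise ValueError("draw must be between 1 and 20")
    return draw <= 10 if rule == "low" else draw >= 11


# Hypothetical session of 16 subjects: the two rules are counterbalanced
# within each group of subjects endowed with the same object.
subjects = [
    (obj, rule)
    for obj in ("object_1", "object_2")
    for rule in ("low", "high")
    for _ in range(4)
]


def replaced_by_object(draw: int) -> Counter:
    """Number of replaced subjects per endowed object for a given draw."""
    return Counter(obj for obj, rule in subjects if is_replaced(rule, draw))


for draw in (3, 17):
    print(draw, dict(replaced_by_object(draw)))
```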
This random experience serves two purposes in our design. First, regardless of Stage 2
treatment assignment, individuals will have had some prior experience with probabilistic
exchange (albeit without choice). Second, it removes a potential challenge to our interpretation
associated with complementarities between objects across rounds. If there existed
some un-modeled, unintentional complementarity between the objects endowed in Stage
1 and Stage 2, a subject might state a preference for or against both of their endowed
objects in order to consume both endowed objects or both alternatives together. Random
12
This is not to say that such exercises could not valuably add to future work in this area.
13
This loss condition was counterbalanced within each subsample endowed with the same object, such
that irrespective of the draw, exchange would take place for exactly half of the subjects initially endowed
with either object.
replacement within Stage 1 breaks these potential complementarities as a driver of Stage
2 choice, and we can explore the relationship between Stage 1 experience and Stage 2 behavior.
14
After completing Stage 1, the instructions asked subjects to return to the main
lecture room for Stage 2.
3.2 Stage 2: Probabilistic Forced Exchange, Heterogeneity, and Prior Experience
Procedures.
The basic procedures in Stage 2 were deliberately kept identical to
those in Stage 1. Upon the subjects' return to the lecture room, the experimenter projected
another page onto the wall, this time presenting the objects for Stage 2 of that session.
In the meantime, a second experimenter allocated objects to the cubicles in the computer
laboratory next door in a pre-specified order. Subjects were ushered back to their cubicle
where they found their second endowed object and were allowed sufficient time for inspection.
In Stage 2, subjects were randomized into one of two conditions: a baseline exchange
condition, Condition B, and a treatment condition with probabilistic forced exchange,
Condition F. The randomization was conducted at the session level.
15
Across our two studies,
59 percent (603 of 1024 subjects) were randomly assigned to Condition F sessions.
16
Condition B: Baseline Exchange.
In Condition B, subjects had an opportunity to
voluntarily exchange their endowed object for the alternative. Their decision was final —
14
Immediately before and immediately after the random replacement was conducted, we elicited subjects’
mood using standard psychological scales (Bradley and Lang, 1994). Subjects answered the question
“Please answer the following questions about how you currently feel. Which expressions better apply to
you at the moment?” by positioning a slider on an 11-point response scale. The lower end (0) was labeled
using the words “Unhappy, Angry, Unsatisfied, Sad, Desperate” and the upper end (10) was labeled “Happy,
Thrilled, Satisfied, Content, Hopeful”. The changes in these values were used as an initial validation of
gain-loss types in prior versions of this manuscript. For space considerations we do not conduct this
intermediate analysis here, but the results can be found in
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3170670
and
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3589906.
15
We present our analysis with robust standard errors in the main text and Appendix Tables A4 through
A6 reproduce our results with standard errors clustered at the session level. Statistical significance is
enhanced with clustering, and so we opt to provide the more conservative values in the main text.
16
In our initial study 62 percent (374 of 607) were assigned to Condition F under an assignment probability
of 60 percent, and in our replication study 55 percent (229 of 417) were assigned to Condition F
under an assignment probability of 50 percent.
whatever they chose they would receive. The baseline condition is a standard exchange
setting common to endowment effect experiments.
Condition F: Probabilistic Forced Exchange.
Condition F implemented an exchange
study with probabilistic forced exchange. The instructions specified that regardless of their
choice, exchange would take place with probability 0.5 based on a draw from the lotto drum,
as in Stage 1. This means that for a subject who decided to exchange, the treatment had
no effect. However, for a subject who attempted to keep their object, exchange would be
forced probabilistically with a 50 percent chance.
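The outcome rule in the two conditions can be sketched as below. This reflects one reading of the description above and of Figure 2: a chosen exchange is always carried out, and only a subject who chooses to keep their object in Condition F faces the 50 percent draw. The function name and signature are hypothetical.

```python
def prob_ends_with_alternative(
    condition: str, chose_exchange: bool, p_forced: float = 0.5
) -> float:
    """Probability that a subject leaves Stage 2 with the alternative object."""
    if condition not in ("B", "F"):
        raise ValueError("condition must be 'B' or 'F'")
    if chose_exchange:
        # A chosen exchange is carried out in both conditions.
        return 1.0
    # Keeping is honored in Condition B; in Condition F exchange is forced
    # with probability p_forced.
    return p_forced if condition == "F" else 0.0


for condition in ("B", "F"):
    for chose_exchange in (True, False):
        print(condition, chose_exchange,
              prob_ends_with_alternative(condition, chose_exchange))
```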
Several noted issues with experimental investigations of market exchange motivated our
purposefully simple design (Plott and Zeiler, 2005, 2007). First, subjects make a simple
binary choice, alleviating potential concerns related to the use of ‘multiple price lists’
in exchange experiments. Specifically, we do not need to elicit a willingness to pay or
willingness to accept in monetary terms, but simply ask whether the subject is willing to
trade the endowed object for the alternative. As such, mistaken perceptions of market
power do not play a role, nor do income effects. Second, unlike previous market exchange
experiments, we create a private environment that limits confounds from social interaction.
In particular, subjects make their decisions anonymously in a private cubicle; they find
their endowment placed in front of them when entering the cubicle instead of receiving it
personally through the hands of the experimenter, which has been criticized for triggering
the misperception of the endowment as a gift (see, e.g., Plott and Zeiler, 2005, 2007); and
subjects do not interact with other subjects at any stage during the experiment.
3.3 Sample Details
An initial sample of 607 students and a replication sample of a further 417 students from the
University of Bonn participated in the experiment, which was conducted using the software
z-Tree (Fischbacher, 2007) in June and July 2015 and July 2018 at the BonnEconLab.
17
17
Several minor differences between the original sessions and those in the replication deserve note. We
opted to split the treatment assignments between Condition B and Condition F at 50 percent-50 percent
We conducted 53 sessions with 16 to 20 participants each. Table 1 provides an overview
of the subject pool by treatment conditions.
Table 1: Summary Statistics and Treatment Assignment

                                  Pair 1                 Pair 2
Stage 1                     USB stick   Pen set   Picnic mat   Thermos
A) Initial Endowment           274        264        242         244
   – in % of subject pool     26.76%     25.78%     23.63%      23.83%

                                  Pair 1                 Pair 2
Stage 2                     USB stick   Pen set   Picnic mat   Thermos
B) Initial Endowment           242        244        274         264
   – in % of subject pool     23.63%     23.83%     26.76%      25.78%
C) Condition B                 113        117         97          94
   – in % of B)               46.70%     47.95%     35.40%      35.60%
D) Condition F                 129        127        177         170
   – in % of B)               53.30%     52.05%     64.60%      64.40%

Total number of observations: 1024

Notes: Stage 2 condition (Condition B or Condition F) is randomized within each session.
The use of each pair as the Stage 1 pair was counterbalanced at the session level.
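As a quick arithmetic check on Table 1, the following sketch verifies that the Stage 2 cells add up to the reported totals. The counts are transcribed from the table; the variable names are illustrative.

```python
# Counts transcribed from the Stage 2 rows of Table 1, in the column order
# USB stick, Pen set, Picnic mat, Thermos.
stage2_endowment = [242, 244, 274, 264]
condition_b = [113, 117, 97, 94]
condition_f = [129, 127, 177, 170]

# Each endowment cell should split exactly into Condition B and Condition F.
cells_add_up = all(
    b + f == n for b, f, n in zip(condition_b, condition_f, stage2_endowment)
)

total = sum(stage2_endowment)
share_f = sum(condition_f) / total

print(cells_add_up, total, sum(condition_f), round(100 * share_f, 1))
# prints: True 1024 603 58.9
```

The resulting share of about 59 percent in Condition F matches the figure of 603 of 1024 subjects reported in Section 3.2.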
The objects used for the exchange experiment included a USB stick, a set of three erasable
pens, a picnic mat, and a thermos.
18
We selected these four objects on the basis of a
pre-experimental survey evaluation of 12 candidate objects. We put particular emphasis
on ruling out complementarities between items across rounds. The former two (USB stick
and pens) and the latter two objects (picnic mat and thermos) each constituted a pair.
rather than the original 40 percent-60 percent to maximize power. Since storage technology rapidly
advanced, the 8GB USB stick had to be replaced by a 16GB USB stick, as that was the new minimum. In
addition, we were unable to repurchase the identical pattern for the picnic mat, so we opted for a visually
similar one. Further, while only one experimenter ran the sessions for the original study, a total of 4
experimenters ran sessions during the course of the replication. In the Appendix, we repeat the analysis
with experimenter fixed effects and find quantitatively similar results. Lastly, there was a small error in the
implementation of sessions run by one specific experimenter who reversed the coded, randomly selected,
endowments. Although this has no effect on the experiment, it did require us to recode the endowments
for these sessions. The results excluding this experimenter’s sessions also reproduce the findings here.
18
Pictures and information presented to subjects are reproduced in Appendix E.
Every subject faced exactly one stage with each pair of objects. The use of each pair as
the Stage 1 pair was counterbalanced at the session level, with the respective other pair
used in Stage 2. Within each session, the endowment of one of the two objects within the
pair was counterbalanced in both stages.
19
4 Experimental Results
We present the results in three subsections. First, we examine the Stage 1 preference
statements leading to our taxonomy of gain-loss attitudes. Second, we examine behavior
in Stage 2, linking heterogeneity in gain-loss attitudes to the behavioral response to probabilistic
forced exchange. Third, we provide robustness tests and separate analyses for our
initial and replication studies.
4.1 Stage 1: Identifying Gain-Loss Attitudes
In Stage 1, we collect three critical preference statements for the endowed and alternative
objects. These are used to infer the gain-loss attitude for each individual. Figure 3 provides
histograms for our three preference statements: hypothetical choice, and wanting and liking
ratings for the two objects. We summarize the direction of preference in ratings statements
using the ordinal information of rating the endowed object higher than the alternative,
giving them equal rating, or rating the alternative higher. These values are aggregated
across the four potential endowments of Stage 1. Given random assignment of endowed
objects and the counterbalanced design, the distributions of preference statements should
be identical between endowed and alternative objects. Instead, all three distributions show
a clear preference for the subject’s endowed object relative to the alternative. Fifty-seven
percent of subjects state that they would choose their endowed object, 45 percent provide
19
That is, if for a given session the USB stick and pens pair constituted the Stage 1 pair, the picnic mat
and thermos pair would be the Stage 2 pair. Half of the subjects were initially endowed with the USB
stick in Stage 1. Among this half of the session participants, again half would initially receive the picnic
mat and the other half the thermos at the beginning of Stage 2.