CaltechAUTHORS
  A Caltech Library Service

Distinct prediction errors in mesostriatal circuits of the human brain mediate learning about the values of both states and actions: evidence from high-resolution fMRI

Colas, Jaron T. and Pauli, Wolfgang M. and Larsen, Tobias and Tyszka, J. Michael and O’Doherty, John P. (2017) Distinct prediction errors in mesostriatal circuits of the human brain mediate learning about the values of both states and actions: evidence from high-resolution fMRI. PLOS Computational Biology, 13 (10). Art. No. e1005810. ISSN 1553-7358. PMCID PMC5673235. https://resolver.caltech.edu/CaltechAUTHORS:20171023-100210571

PDF - Published Version. Creative Commons Attribution. 4 MB
Archive (ZIP) (Fig. S1-S6) - Supplemental Material. Creative Commons Attribution. 2470 KB

Abstract

Prediction-error signals consistent with formal models of “reinforcement learning” (RL) have repeatedly been found within dopaminergic nuclei of the midbrain and dopaminoceptive areas of the striatum. However, the precise form of the RL algorithms implemented in the human brain is not yet well determined. Here, we created a novel paradigm optimized to dissociate the subtypes of reward-prediction errors that function as the key computational signatures of two distinct classes of RL models—namely, “actor/critic” models and action-value-learning models (e.g., the Q-learning model). The state-value-prediction error (SVPE), which is independent of actions, is a hallmark of the actor/critic architecture, whereas the action-value-prediction error (AVPE) is the distinguishing feature of action-value-learning algorithms. To test for the presence of these prediction-error signals in the brain, we scanned human participants with a high-resolution functional magnetic-resonance imaging (fMRI) protocol optimized to enable measurement of neural activity in the dopaminergic midbrain as well as the striatal areas to which it projects. In keeping with the actor/critic model, the SVPE signal was detected in the substantia nigra. The SVPE was also clearly present in both the ventral striatum and the dorsal striatum. However, alongside these purely state-value-based computations we also found evidence for AVPE signals throughout the striatum. These high-resolution fMRI findings suggest that model-free aspects of reward learning in humans can be explained algorithmically with RL in terms of an actor/critic mechanism operating in parallel with a system for more direct action-value learning.
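The distinction between the two prediction errors can be sketched in a few lines of Python. This is only an illustrative sketch using the textbook temporal-difference forms of the two errors; the abstract does not give the exact model equations used in the paper, so the function names, the learning rate `alpha`, the discount factor `gamma`, and the single-step "cue" trial below are all assumptions made for the example.

```python
def state_value_prediction_error(reward, v_state, v_next=0.0, gamma=1.0):
    """SVPE (textbook TD error of an actor/critic critic).

    Compares the outcome with the value of the state itself, so it is
    the same whichever action was chosen.
    """
    return reward + gamma * v_next - v_state


def action_value_prediction_error(reward, q_chosen, q_next_best=0.0, gamma=1.0):
    """AVPE (textbook Q-learning error).

    Compares the outcome with the value of the specific action taken.
    """
    return reward + gamma * q_next_best - q_chosen


# Hypothetical single-step trial: one cue, two actions, then the trial ends,
# so the value of the next (terminal) state is taken to be 0.
v = {"cue": 0.5}                                   # state values (critic)
q = {("cue", "left"): 0.2, ("cue", "right"): 0.8}  # action values
alpha = 0.1                                        # assumed learning rate

state, action, reward = "cue", "left", 1.0

svpe = state_value_prediction_error(reward, v[state])
avpe = action_value_prediction_error(reward, q[(state, action)])

# Each error updates only its own value estimate.
v[state] += alpha * svpe
q[(state, action)] += alpha * avpe

print(f"SVPE = {svpe:.2f}, AVPE = {avpe:.2f}")  # SVPE = 0.50, AVPE = 0.80
```

Because the chosen action ("left") was valued below the state as a whole, the same reward yields a larger AVPE than SVPE. Divergence of this kind is what allows the two signals to be told apart in principle, although the actual task design and model fitting are described only in the paper itself.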


Item Type: Article
Related URLs:
URL | URL Type | Description
https://doi.org/10.1371/journal.pcbi.1005810 | DOI | Article
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5673235 | PubMed Central | Article
https://doi.org/10.1371/journal.pcbi.1005810.s001 | DOI | Fig. S1
https://doi.org/10.1371/journal.pcbi.1005810.s002 | DOI | Fig. S2
https://doi.org/10.1371/journal.pcbi.1005810.s003 | DOI | Fig. S3
https://doi.org/10.1371/journal.pcbi.1005810.s004 | DOI | Fig. S4
https://doi.org/10.1371/journal.pcbi.1005810.s005 | DOI | Fig. S5
https://doi.org/10.1371/journal.pcbi.1005810.s006 | DOI | Fig. S6
ORCID:
Author | ORCID
Tyszka, J. Michael | 0000-0001-9342-9014
Additional Information: © 2017 Colas et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Received: June 12, 2017; Accepted: October 9, 2017; Published: October 19, 2017.
Data Availability Statement: Data are available at https://neurovault.org/collections/ETRQWPUH/.
This study was funded by National Institutes of Health (https://www.nih.gov/) grants R01DA033077 (supported by OppNet, NIH's Basic Behavioral and Social Science Opportunity Network) and R01DA040011 to JPOD as well as by the National Science Foundation (https://www.nsf.gov/) Graduate Research Fellowship Program for JTC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors have declared that no competing interests exist.
Author Contributions: Conceptualization: Jaron T. Colas, Tobias Larsen, John P. O'Doherty. Formal analysis: Jaron T. Colas. Funding acquisition: John P. O'Doherty. Investigation: Jaron T. Colas, Tobias Larsen. Methodology: Jaron T. Colas. Resources: Wolfgang M. Pauli, J. Michael Tyszka. Supervision: John P. O'Doherty. Visualization: Jaron T. Colas. Writing – original draft: Jaron T. Colas, John P. O'Doherty. Writing – review & editing: Jaron T. Colas, Wolfgang M. Pauli, J. Michael Tyszka, John P. O'Doherty.
Funders:
Funding Agency | Grant Number
NIH | R01DA033077
NIH | R01DA040011
NSF Graduate Research Fellowship | UNSPECIFIED
Issue or Number: 10
PubMed Central ID: PMC5673235
Record Number: CaltechAUTHORS:20171023-100210571
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20171023-100210571
Official Citation: Colas JT, Pauli WM, Larsen T, Tyszka JM, O'Doherty JP (2017) Distinct prediction errors in mesostriatal circuits of the human brain mediate learning about the values of both states and actions: evidence from high-resolution fMRI. PLoS Comput Biol 13(10): e1005810. https://doi.org/10.1371/journal.pcbi.1005810
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 82572
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 24 Oct 2017 19:48
Last Modified: 03 Oct 2019 18:56
