Neural Correlates of the Divergence of Instrumental Probability Distributions
Flexible action selection requires knowledge about how alternative actions impact the environment: a "cognitive map" of instrumental contingencies. Reinforcement learning theories formalize this map as a set of stochastic relationships between actions and states, such that for any given action considered in a current state, a probability distribution is specified over possible outcome states. Here, we show that activity in the human inferior parietal lobule correlates with the divergence of such outcome distributions (a measure that reflects whether discrimination between alternative actions increases the controllability of the future) and, further, that this effect is dissociable from those of other information-theoretic and motivational variables, such as outcome entropy, action values, and outcome utilities. Our results suggest that, although ultimately combined with reward estimates to generate action values, outcome probability distributions associated with alternative actions may be contrasted independently of valence computations, to narrow the scope of the action selection problem.
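To make the notion of divergence between outcome distributions concrete, the sketch below computes one common divergence measure, the Jensen-Shannon divergence, over the outcome distributions of alternative actions. The abstract does not state which divergence measure was used, so the choice of Jensen-Shannon divergence, the uniform weighting of actions, and the example probabilities are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(distributions, weights=None):
    """Jensen-Shannon divergence (in bits) among outcome distributions.

    Each entry of `distributions` is the outcome distribution of one action.
    Assumption: actions are weighted uniformly unless `weights` is given.
    """
    n = len(distributions)
    if weights is None:
        weights = [1.0 / n] * n
    n_outcomes = len(distributions[0])
    # Mixture distribution: weighted average over actions for each outcome.
    mixture = [
        sum(w * d[i] for w, d in zip(weights, distributions))
        for i in range(n_outcomes)
    ]
    # Entropy of the mixture minus the weighted mean entropy of each action.
    mean_entropy = sum(w * entropy(d) for w, d in zip(weights, distributions))
    return entropy(mixture) - mean_entropy


# Hypothetical example: two actions, three possible outcome states.
# Identical distributions: choosing between the actions changes nothing.
identical = [[0.7, 0.2, 0.1], [0.7, 0.2, 0.1]]
# Non-overlapping distributions: the choice fully determines the outcome.
distinct = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

print(js_divergence(identical))
print(js_divergence(distinct))
```

Under these assumptions, the divergence is zero when the actions share the same outcome distribution and reaches its maximum for two actions (one bit) when their outcomes never overlap. This depends only on outcome probabilities, not on how rewarding any outcome is, which is the sense in which such a measure can be separated from action values and outcome utilities.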
Additional Information: © 2013 the authors. Received March 30, 2013; revised June 14, 2013; accepted June 18, 2013. This work was funded by a National Institutes of Health Grant (DA033077-01) to J.P.O. The authors thank Daniel McNamee for helpful discussions. Author contributions: M.L. and J.P.O. designed research; M.L., S.W., and J.Z. performed research; M.L. analyzed data; M.L. and J.P.O. wrote the paper.