Model based control can give rise to devaluation insensitive choice

Garrett, Neil ORCID: https://orcid.org/0000-0003-1440-472X, Allen, Sean and Daw, Nathaniel D. (2022) Model based control can give rise to devaluation insensitive choice.

Full text not available from this repository.

Abstract

Influential recent work aims to ground psychiatric dysfunction in the brain’s basic computational mechanisms. For instance, compulsive symptoms as in drug abuse have been argued to arise from imbalance between multiple systems for instrumental learning. Computational models suggest that such multiplicity arises because the brain adaptively simplifies laborious “model-based” deliberation by sometimes relying on a cheaper, more habitual “model-free” shortcut. Support for this account comes in part from failures to appropriately change behavior in light of new events. Notably, instrumental responding can, in some circumstances, persist despite reinforcer devaluation, perhaps reflecting control by model-free mechanisms that are driven by past reinforcement rather than knowledge of the (now devalued) outcome. However, another important line of theory – heretofore mostly studied in Pavlovian conditioning – posits a different mechanism that can also modulate behavioral change. It concerns how animals identify different rules or contingencies that may apply in different circumstances, by covertly clustering experiences into distinct groups identified with different “latent causes” or contexts. Such clustering has been used to explain the return of Pavlovian responding following extinction. Here we combine both lines of theory to investigate the consequences of latent cause inference on instrumental sensitivity to reinforcer devaluation. We show that because segregating events into different latent clusters prevents generalization between them, instrumental insensitivity to reinforcer devaluation can arise in this theory even using only model-based planning, and does not require or imply any habitual, model-free component. In simulations, these ersatz habits (like laboratory ones) emerge after overtraining, interact with contextual cues, and show preserved sensitivity to reinforcer devaluation on a separate consumption test, a standard control.
While these results do not rule out a contribution of model-free learning per se, they point to a subtle and important role of state inference in instrumental learning and highlight the need for caution in using reinforcer devaluation procedures to rule in (or out) the contribution of different learning mechanisms. They also offer a new perspective on the neurocomputational substrates of drug abuse and the relevance of laboratory reinforcer devaluation procedures to this phenomenon.
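The core argument, that purely model-based planning can look devaluation-insensitive once experiences are split across latent causes, can be illustrated with a minimal sketch. This is not the authors' simulation: the class `LatentCause`, the function `press_value`, and every number below (outcome probabilities, outcome values, effort cost, and the posterior over causes at test) are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LatentCause:
    """A cluster of experience with its own learned world model (hypothetical)."""
    p_outcome_given_press: float  # learned action-outcome contingency
    outcome_value: float          # learned value of the outcome in this cluster


def press_value(causes, posterior, effort_cost=0.1):
    """Model-based value of pressing, averaged over the posterior on latent causes."""
    return sum(
        p * (c.p_outcome_given_press * c.outcome_value - effort_cost)
        for p, c in zip(posterior, causes)
    )


# Case 1: training and devaluation are assigned to the SAME latent cause,
# so the devalued outcome value overwrites the one used for planning.
shared = [LatentCause(p_outcome_given_press=0.9, outcome_value=-0.5)]
v_shared = press_value(shared, posterior=[1.0])

# Case 2: devaluation is assigned to a NEW latent cause (assumed here to
# happen after overtraining). The training cause keeps the old outcome value,
# and the test situation is attributed mostly to the training cause.
segregated = [
    LatentCause(p_outcome_given_press=0.9, outcome_value=1.0),   # training
    LatentCause(p_outcome_given_press=0.0, outcome_value=-0.5),  # devaluation
]
v_segregated = press_value(segregated, posterior=[0.95, 0.05])

print("shared cause, press value:     ", round(v_shared, 3))
print("segregated causes, press value:", round(v_segregated, 3))
```

In both cases the agent plans from a learned model of action-outcome contingencies and outcome values; no cached, model-free value is involved. The only difference is whether the devaluation experience generalizes to the cluster that governs the test, which is why pressing keeps a positive value in the segregated case.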

Item Type: Article
Uncontrolled Keywords: SDG 3 - Good Health and Well-being
Faculty \ School: Faculty of Social Sciences > School of Psychology
UEA Research Groups: Faculty of Social Sciences > Research Groups > Cognition, Action and Perception
Faculty of Social Sciences > Research Groups > Social Cognition Research Group
Depositing User: LivePure Connector
Date Deposited: 16 Sep 2022 13:37
Last Modified: 19 Sep 2022 05:51
URI: https://ueaeprints.uea.ac.uk/id/eprint/88404
DOI: 10.1101/2022.08.21.504635
