Abstract

Background
Pavlovian responding is a core component of behavior and can be measured via Pavlovian-instrumental transfer (PIT), where Pavlovian responses bias instrumental actions. Standard single-lever PIT paradigms, which assess responding using a single choice option, cannot dissociate the contributions of model-free and model-based reinforcement learning. While indirect evidence suggests a role for model-free responding in single-lever PIT, the contribution of model-based strategies is unclear. It also remains unknown whether internal cognitive states, such as mind wandering, specifically impair model-based but not model-free PIT, as is theoretically expected.

Methods
We developed a novel, trial-by-trial two-stage PIT paradigm designed to computationally dissociate model-free and model-based Pavlovian responding by leveraging probabilistic state transitions and trial-wise outcome predictions. After each two-stage Pavlovian learning trial, participants performed a single-lever PIT trial and a query trial requiring an explicit value judgment. Detailed task instructions were provided to support potential model-based strategies. Computational modeling was used to quantify individual learning strategies. We assessed mind wandering using questionnaires and thought probes.

Results
Analysis of query and PIT trials revealed trial-by-trial updating of outcome expectations based on the probabilistic task structure, consistent with model-based Pavlovian responding. Behavioral responses during PIT were best explained by a model-based reinforcement learning model. In contrast, we found little evidence for model-free Pavlovian responding. Higher levels of mind wandering were associated with reduced model-based control but did not affect model-free indices.

Conclusion
We introduce a novel single-lever PIT paradigm that enables fine-grained dissociation of model-free and model-based Pavlovian response systems.
Our findings provide evidence that single-lever PIT can operate through model-based mechanisms, challenging the assumption that single-lever PIT is predominantly model-free. Our findings also indicate that internal attentional states selectively modulate model-based PIT. Given the involvement of Pavlovian responding in numerous psychiatric conditions, our paradigm offers new avenues for understanding maladaptive behavior.

Author Summary
Our daily actions are often influenced by cues, like the smell of food or the sound of phone notifications, that signal potential rewards or losses. These Pavlovian cues can shape our instrumental behavior even though their outcomes do not depend on what we do, a process known as Pavlovian-instrumental transfer (PIT). Here we study the computational learning mechanisms that underlie such PIT effects. While it is often assumed that Pavlovian responding follows simple, automatic rules without a cognitive model of cue consequences (i.e., model-free), evidence also shows a role for cognitive anticipation in Pavlovian responding (i.e., model-based). In this study, we extend this evidence by showing that PIT responding can be driven by flexible model-based learning. We designed a task to test whether participants use model-free or model-based strategies to guide PIT, and we provided detailed task instructions. Using reinforcement learning models, we found that most participants used model-based learning when forming cue-outcome associations. Importantly, people’s attention mattered: when they were more distracted and their minds wandered, they relied less on model-based strategies. Our findings suggest that Pavlovian learning is complex, flexible, and influenced by internal mental states, opening new avenues for understanding decision-making problems in mental health conditions like addiction.
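
As a rough illustration of the model-free versus model-based distinction described above, the following Python sketch contrasts the two value computations on a single hypothetical two-stage trial. The cue names (A, B, X, Y), the 0.7 transition probability, the learning rate, and all function names are assumptions made for illustration only; they are not taken from the study's task or from the authors' computational models.

```python
# Hypothetical two-stage Pavlovian trial. Every name and number here is an
# illustrative assumption, not a parameter reported by the authors.
COMMON = 0.7  # assumed probability of the common transition
TRANSITIONS = {
    "A": {"X": COMMON, "Y": 1 - COMMON},  # cue A usually leads to state X
    "B": {"X": 1 - COMMON, "Y": COMMON},  # cue B usually leads to state Y
}
ALPHA = 0.5  # assumed learning rate


def delta_update(values, key, outcome, alpha=ALPHA):
    """Return a copy of `values` with `values[key]` moved toward `outcome`."""
    updated = dict(values)
    updated[key] += alpha * (outcome - updated[key])
    return updated


def model_based_values(v_stage2, transitions=TRANSITIONS):
    """First-stage value = transition-weighted sum of second-stage values."""
    return {
        cue: sum(p * v_stage2[state] for state, p in probs.items())
        for cue, probs in transitions.items()
    }


# One trial: cue A is shown, a rare transition leads to Y, and a reward (1.0)
# is delivered.
v1_mf = {"A": 0.0, "B": 0.0}  # model-free first-stage cue values
v2 = {"X": 0.0, "Y": 0.0}     # second-stage state values

# Model-free: credit the cue that was shown, ignoring the transition type.
v1_mf = delta_update(v1_mf, "A", 1.0)

# Model-based: update the second-stage state, then recompute first-stage
# values through the (assumed known) transition structure.
v2 = delta_update(v2, "Y", 1.0)
v1_mb = model_based_values(v2)

print("model-free :", v1_mf)
print("model-based:", v1_mb)
```

In this sketch, a reward following a rare transition raises the model-free value of the presented cue A, whereas the model-based computation assigns the larger value to cue B, because B commonly leads to the rewarded second-stage state. Diverging predictions of this kind are one way a two-stage design with probabilistic transitions can separate the two strategies.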