PROJECT SUMMARY
Animals can exhibit goal-directed behaviors in novel environments, despite limited experience
with them. How does the brain make and use inferences about the underlying statistics and
generative structure of environments to guide behavior? The field of reinforcement learning refers
to this capacity as “model-based” reasoning, meaning that it relies on an internal model of the
structure of the world. Critically, this internal model can be used to flexibly estimate the best
actions by mental simulation or planning, without direct experience. In contrast, in “model-free”
reinforcement learning, an agent chooses the best action based on direct experience, without
explicit knowledge of the underlying sequential transition structure of a task or environment.
Model-based and model-free mechanisms coexist in the brain and are mediated by distinct
circuits, although the neural circuit mechanisms by which the brain arbitrates between these
decision systems remain unknown. Theoretical and behavioral studies suggest that human
brains use the system that yields value estimates with the lowest uncertainty. The lateral
orbitofrontal cortex (lOFC) is a compelling candidate to perform arbitration because while it is
implicated in model-based reasoning, for instance by enabling inferences about hidden task
states, it lies upstream of the dorsal striatum, which is critical for both model-based and model-
free decision making. Intriguingly, we have found that lOFC neurons project exclusively to the
dorsolateral striatum (DLS), a region critical for model-free behavior, and not the dorsomedial
striatum (DMS), which is critical for model-based behavior. We hypothesize that
projection-specific neural circuits in lOFC arbitrate between these systems by suppressing the model-free
system.
I will use state-of-the-art viral, electrophysiological, and computational methods to
determine whether DLS-projecting lOFC neurons mediate uncertainty-based arbitration between
decision-making systems (Aim 1) and characterize the underlying circuit logic that supports
arbitration (Aim 2). By optogenetically tagging DLS-projecting lOFC neurons, I will selectively
characterize and perturb their activity while monitoring the behavioral strategy rats use in a task
with latent structure. To determine how arbitration is instantiated in the dorsal striatum, I will
optogenetically activate OFC→DLS neurons while recording from different genetic cell types in
the striatum, in vivo and in vitro. We predict that OFC→DLS neurons enable model-based
behavior by activating inhibitory interneurons to suppress the DLS and the model-free system.