Project Summary/Abstract
The neurotransmitter dopamine (DA) is thought to play a central role in reward-based learning. The leading
theory posits that DA release acts as a reward prediction error (RPE) which incrementally updates the brain’s
predictions about future rewards. Recently, however, this hypothesis has been challenged by two distinct
alternatives, one related to learning rate and the other to retrospective inference. Because these models make
similar predictions for patterns of DA release in standard classical conditioning tasks, they are difficult to
separate. Additionally, the studies behind them, as well as some supporting RPE, suffer from several caveats: (1)
Rewards generate movements, which confound the interpretation of neural signals related to learning; (2)
Rewards activate many learning systems in parallel, not just the DA system, limiting the ability to attribute
learning to DA itself; (3) DA neurons have diverse functions that depend on their projection target, but prior
studies often mixed these diverse populations when recording or stimulating DA neurons. Thus, the
algorithm(s) by which DA drives reward learning and how this may be implemented in neural circuits remain
unknown. The central idea of this proposal is to use artificial conditioning tasks in which natural rewards have
been replaced with calibrated optical stimulation of dopamine axons (cDAS) in specific striatal subregions in
head-fixed mice. By design, this approach (1) limits movements, (2) isolates the effect of DA release itself, and
(3) targets a projection-specific population of DA neurons, thus limiting caveats that hindered prior studies. Aim
1 uses this approach to identify the algorithm of DA-driven learning within the lateral nucleus accumbens
(lNAc), a site with concentrated signatures of RPE in DA release. Artificial conditioning tasks were designed to
arbitrate between RPE and alternative models. Preliminary data suggest that cDAS in lNAc generates changes
in DA activity that are consistent with RPE but not with the alternatives. In Aim 2, cell type-specific electrophysiological
recording and optical stimulation in lNAc will be used to answer how DA release alters striatal activity to drive
RPE learning. Aim 3 expands these studies to the dorsal striatum (DS), where DA release is thought to shape
and reinforce movements during addiction and other forms of habit formation. Multisite cDAS and projection-
specific optotagging of DA neurons will be combined with cutting-edge video processing techniques to test the
hypothesis that DA-driven learning spreads from lNAc to DS to shape movements (an “actor-critic” model).
Together, these reductionist studies will enable the algorithmic and neural circuit basis of DA-driven learning in
the striatum to be dissected with unprecedented precision. The proposed research will be conducted in the
Uchida Lab at Harvard, an excellent environment with all the necessary resources at hand. The candidate has
assembled an expert advisory committee and has made detailed plans to acquire the additional technical and
professional skills needed to complete the proposed project and to make a successful transition to an
independent research career.