Dissecting the algorithmic and neural circuit basis of dopamine-driven learning in the striatum
Project Summary/Abstract
The neurotransmitter dopamine (DA) is thought to play a central role in reward-based learning. The leading theory posits that DA release acts as a reward prediction error (RPE) which incrementally updates the brain’s predictions about future rewards. Recently, however, this hypothesis has come under attack, with two distinct alternatives proposed, related to learning rate and retrospective inference, respectively. Because these models make similar predictions for patterns of DA release in standard classical conditioning tasks, they are difficult to separate. Additionally, these studies, as well as some supporting RPE, suffer from several caveats: (1) Rewards generate movements, which confound the interpretation of neural signals related to learning; (2) Rewards activate many learning systems in parallel, not just the DA system, limiting the ability to attribute learning to DA itself; (3) DA neurons have diverse functions that depend on their projection target, but prior studies often mixed these diverse populations when recording or stimulating DA neurons. Thus, the algorithm(s) by which DA drives reward learning, and how they may be implemented in neural circuits, remain unknown. The central idea of this proposal is to use artificial conditioning tasks in which natural rewards have been replaced with calibrated optical stimulation of dopamine axons (cDAS) in specific striatal subregions in head-fixed mice. By design, this approach (1) limits movements, (2) isolates the effect of DA release itself, and (3) targets a projection-specific population of DA neurons, thus limiting the caveats that hindered prior studies. Aim 1 uses this approach to identify the algorithm of DA-driven learning within the lateral nucleus accumbens (lNAc), a site with concentrated signatures of RPE in DA release.
Artificial conditioning tasks were designed to arbitrate between RPE and the alternative models. Preliminary data suggest that cDAS in lNAc generates changes in DA activity that are consistent with RPE but not with the alternatives. In Aim 2, cell type-specific electrophysiological recording and optical stimulation in lNAc will be used to determine how DA release alters striatal activity to drive RPE learning. Aim 3 expands these studies to the dorsal striatum (DS), where DA release is thought to shape and reinforce movements during addiction and other forms of habit formation. Multisite cDAS and projection-specific optotagging of DA neurons will be combined with cutting-edge video processing techniques to test the hypothesis that DA-driven learning spreads from lNAc to DS to shape movements (an “actor-critic” model). Together, these reductionist studies will enable the algorithmic and neural circuit basis of DA-driven learning in the striatum to be dissected with unprecedented precision. The proposed research will be conducted in the Uchida Lab at Harvard, an excellent environment with all the necessary resources at hand. The candidate has assembled an expert advisory committee and has made detailed plans to acquire the additional technical and professional skills needed to complete the proposed project and to transition successfully to an independent research career.