Decision-making for goal-directed actions via reinforcement learning (RL) is a fundamental component of
complex behaviors. Central to RL theory is the balance between exploration and exploitation, which enables
agents to interpret the environment using trial and error to learn an optimal strategy for maximizing reward.
Determining the optimal parameters for when to switch between exploration and exploitation states in RL models has
proven difficult and calls for new biological insights. Recent work from our lab implicates norepinephrine (NE)
release from the locus coeruleus (LC-NE) in signaling exploration and exploitation states. In an RL task, LC-NE
neurons exhibit phasic activity when stimulus evidence is uncertain, facilitating task execution/exploration,
and after a surprising reinforcement, facilitating task optimization/exploitation on the next trial. How
these different phasic LC-NE signals are integrated in target regions to modulate different aspects of behavior is
unknown. One possibility is through spatiotemporal integration by astrocytes, which are highly responsive to NE,
are known to be involved in learning and memory, and can modulate neuronal activity on within-trial and
between-trial timescales. Here, we propose that LC-NE release during an RL task causes changes in cortical
network dynamics, facilitated through astrocyte signaling, that enable task execution and optimization. We will
examine the effects of LC-NE and astrocytes on neuronal population dynamics and RL by combining dual
2-photon imaging of astrocytes and neurons in frontal/prefrontal cortex, high-density neural recordings,
optogenetic and chemogenetic manipulation of neurons and astrocytes, and computational analyses of
neuronal population activity and task encoding. We will also develop biologically informed computational
models of astrocyte-neuron interactions during learned behavior. In Aim 1, we will record cortical astrocytes
and neurons in mice performing our RL task. We will use high-density single-unit recordings and population
analyses to determine how population dynamics evolve during
different task epochs. Using this information, we will determine how silencing LC-NE affects astrocyte and neuron
computations and dynamics during RL. In Aim 2, we will use chemogenetic and optogenetic manipulations of
astrocyte calcium to determine how astrocyte dynamics contribute to RL behaviors, and how this activity affects
neuronal population dynamics. In Aim 3, we will test the hypotheses that extending RL algorithms with
NE-astrocyte signals can explain exploration at low stimulus evidence, and that NE-astrocyte interactions across
trials are reflected in policy-gradient learning rules to promote exploitation. Finally, we will determine
whether incorporating NE-astrocyte-neuron interactions into a recurrent neural network model can provide a rich
model for behavior and identify circuit motifs critical to our observed behavioral outcomes. These data will provide
an unprecedented view of the role of NE and astrocytes in a crucial behavioral function, and point to ways by
which their dysfunction can be ameliorated in brain disorders and diseases.
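
As a purely illustrative sketch of the Aim 3 idea, and not the model proposed here, the Python below shows one way an NE-like signal could enter a policy-gradient agent. Within a trial, low stimulus evidence raises the softmax temperature (more exploration); between trials, the magnitude of the reward prediction error scales the REINFORCE step size (faster optimization after surprise). Every function name, parameter value, and functional form, as well as the toy two-choice task, is an assumption made for illustration only.

```python
import math
import random


def softmax(prefs, temperature):
    """Action probabilities from preferences at a given temperature."""
    scaled = [p / temperature for p in prefs]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def exploration_temperature(evidence, base=0.2, gain=2.0):
    """Assumed within-trial NE-like signal: weaker evidence (0..1) gives a
    higher softmax temperature, i.e. more exploratory choices."""
    return base + gain * (1.0 - evidence)


def reinforce_update(prefs, probs, action, rpe, temperature, lr=0.1, surprise_gain=1.0):
    """One REINFORCE step on the action preferences. The step size is scaled
    by |rpe| as an assumed between-trial surprise signal."""
    step = lr * (1.0 + surprise_gain * abs(rpe))
    return [
        # gradient of log softmax w.r.t. preference i is (1[i == action] - p_i) / T
        p + step * rpe * ((1.0 if i == action else 0.0) - probs[i]) / temperature
        for i, p in enumerate(prefs)
    ]


def run_trials(n_trials=500, seed=0):
    """Toy two-choice task (hypothetical): action 0 is always rewarded and
    stimulus evidence is drawn uniformly on each trial."""
    rng = random.Random(seed)
    prefs, baseline = [0.0, 0.0], 0.0
    for _ in range(n_trials):
        evidence = rng.random()
        temperature = exploration_temperature(evidence)
        probs = softmax(prefs, temperature)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if action == 0 else 0.0
        rpe = reward - baseline  # reward prediction error ("surprise")
        prefs = reinforce_update(prefs, probs, action, rpe, temperature)
        baseline += 0.05 * rpe  # running estimate of expected reward
    return prefs
```

Calling `run_trials()` returns the learned action preferences for the toy task. In this sketch the two NE-like effects are deliberately kept separate (temperature within a trial, step size across trials) so that each can be removed or reshaped independently; how astrocyte dynamics would mediate either effect is left unspecified.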