How are humans able to quickly and accurately discern the identities of people, places and things from
just a brief glance at a complex visual scene? Despite decades of behavioral and brain-based research,
we still lack an understanding of the neural computations that enable us to see. But the recent success of
convolutional neural networks (CNNs) in attaining near human-level performance on a variety of visual tasks
has transformed the study of visual recognition in primates. For the first time we have computationally precise
image-computable models of how visual recognition might work in the brain. Further, considerable evidence
indicates notable similarities between CNNs optimized for visual classification performance and primate visual
systems shaped by evolution and learning to accomplish similar tasks. These game-changing advances have
opened up exciting new avenues for answering longstanding questions about the brain basis of vision. The
proposed work harnesses CNNs to tackle classic questions about visual recognition in humans.
Aim I uses CNN-based models of specific cortical regions (such as the fusiform face area or FFA) to
ask what exactly is computed and represented in each voxel and region of the ventral visual pathway. Building
upon our recent work that has developed CNN-based models that predict the response of the FFA and other
regions to novel stimuli with high accuracy, the present work will test these models on a broader range of
stimuli to identify their boundary conditions, develop better models, and test them with new experiments. The
same methods will then be used to discover and build CNN-based encoding models of less understood regions
of the ventral visual pathway. Aim II applies CNN-based modeling methods to test classic theories of visual
recognition for the first time with computational and neural precision. Aim III uses CNNs to test computational
accounts of why the ventral visual pathway is organized the way it is.
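The encoding-model approach in Aim I can be illustrated with a minimal sketch: extract CNN features for each stimulus, fit a linear mapping (here, closed-form ridge regression) from features to each voxel's response, then evaluate predictions on held-out stimuli. Everything below is an assumption for illustration (synthetic data, arbitrary shapes, NumPy only); it is not the proposal's actual analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_feat, n_vox = 200, 50, 100, 10

# Stand-ins for CNN layer activations for each stimulus (rows = stimuli).
X_train = rng.standard_normal((n_train, n_feat))
X_test = rng.standard_normal((n_test, n_feat))

# Simulate voxel responses as a noisy linear readout of the features.
W_true = 0.1 * rng.standard_normal((n_feat, n_vox))
y_train = X_train @ W_true + 0.1 * rng.standard_normal((n_train, n_vox))
y_test = X_test @ W_true + 0.1 * rng.standard_normal((n_test, n_vox))

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

W_hat = fit_ridge(X_train, y_train)
y_pred = X_test @ W_hat

# Per-voxel prediction accuracy on held-out stimuli.
r = [np.corrcoef(y_pred[:, v], y_test[:, v])[0, 1] for v in range(n_vox)]
print(f"mean held-out correlation: {np.mean(r):.2f}")
```

In real analyses the held-out correlation for each voxel would be compared against its noise ceiling (test-retest reliability), which is how model accuracy is typically benchmarked in this literature.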
Accurate computational models of visual processing will be foundational for understanding human
cognitive functions that build on vision (especially reading, numerosity, memory, attention, and decision
making), as well as functions that are variously disrupted in psychiatric, neurodegenerative, and
neurodevelopmental disorders. More directly, many visual recognition tasks are critical to quality of life (e.g.,
face detection, letter discrimination, food discrimination, door detection), and this work will bring us closer
to artificial brain interfaces to restore those functions in patients with vision loss.