Abstract
Visually discriminating and identifying materials (for example, judging whether a cup is made of plastic or glass)
is crucial for everyday tasks such as walking on different surfaces, using tools, and selecting food; yet
material perception remains poorly understood. The main challenge is that a given material can take on an enormous
variety of appearances depending on its 3D shape, the lighting, and the object class, and humans must disentangle
these factors to achieve perceptual constancy. Previous research has revealed useful image cues and found that 3D
geometry interacts with material perception in intricate ways. The discovered image cues, however, do not generalize
across materials and scenes. The proposed work will combine unsupervised generative models with human
psychophysics to identify a representation that can disentangle physical properties and discover diagnostic image
features without labeled image data. Specific Aim 1 is to identify a latent representation that predicts human
material discrimination, using unsupervised deep neural networks trained on computer-rendered images. Specific
Aim 2 is to characterize high-level semantic material perception, including the effects of high-level recognition
and of individual differences on attribute-rating and recognition tasks. To discover a representation of real-world
materials, the PI and the team will train an unsupervised style-based Generative Adversarial Network (StyleGAN)
on real-world photographs. Preliminary results show that StyleGAN can generate realistic and diverse images
of materials. Collectively, these studies will explore how semantic-level material perception relates to the
statistical structure of the natural environment as learned by unsupervised models. The proposed work will
also uncover the task-dependent interplay between high-level vision and mid-level representations, and provide
guidance for seeking neural correlates of material perception. The methods developed in this proposal, such
as discovering perceptual dimensions from limited human-labeled data and characterizing individual variability,
will also have an impact on other research in cognition. The AREA proposal provides a unique multidisciplinary training
opportunity to engage diverse undergraduate students at American University in research on psychophysics,
machine learning, and image processing. The PI and students will also investigate a novel method of recruiting
under-represented human subjects using "peer-recruiting." Finally, the expected findings of this proposal will have
implications for the long-standing debate about the degree to which perceptual representations are predetermined
by evolution or learned via experience.