A major challenge in genetics is to identify genetic variants driving phenotypic variation, and the analysis of
natural variation in human protein sequences is an important avenue to meet the challenge. As weak
noncovalent interactions play a critical role in protein folding, assembly, and recognition, structural analysis of
how noncovalent interactions in human proteins vary from one individual to another will be critical. Our results
suggest that thousands of noncovalent interactions, particularly weak ones (e.g., π-interactions,
anion-quadrupole (AQ) interactions, and hydrogen bonds), are perturbed in human proteins due to natural variation. Like
other π-interactions, AQ plays an important role in macromolecular structure. However, the strength of weak
interactions (e.g., AQ) remains poorly understood, and therefore the consequences of
natural variation in human protein sequences remain difficult to interpret. The absence of knowledge of
weak-interaction energetics and of a comprehensive map of all weak interactions altered in human proteins will
continue to significantly contribute to the lack of understanding of the origin of phenotypic variation. Continued
existence of this knowledge gap represents an important problem because, until it is filled, how genetic
variants drive phenotypic variation will remain too poorly understood to guide beneficial genetic interventions. Our long-term
goal is to better understand the role of weak noncovalent interactions in regulating protein function. The
objective for this particular R15 application is to comprehensively measure the strength of a weak interaction
(e.g., AQ) and to create a comprehensive catalogue of all noncovalent interactions in human protein structures
that are altered due to natural human sequence variation. Our rationale is that (a) determination of the energy
of a weak interaction (e.g., AQ) is likely to provide new insights by enabling subsequent studies on protein
function through manipulation of AQ; and (b) the availability of a complete catalogue of fine structural details of natural
missense variants of all human proteins will facilitate probing the molecular mechanisms by which genetic variants drive
phenotypic variation. The two specific aims are: 1) Determine the strength of AQ experimentally; 2) Create a
database of 3D structural maps of natural missense variants of human proteins. For Aim-1, using 18 carefully
chosen protein-peptide interfaces, we will experimentally measure the strength of AQ interactions that occur in various
structural contexts to obtain a comprehensive estimate. For Aim-2, using the Exome Aggregation Consortium
(ExAC) database, we will provide a comprehensive, fine structural map of all noncovalent interactions in human
protein structures that are perturbed due to natural variation. Using molecular dynamics simulations, we will also
probe the consequences of ExAC mutations in two functionally important human proteins. The approach is
innovative because it captures a link between genetic and phenotypic variation at atomic resolution. The research is
significant, because it is expected to vertically expand the understanding of how genetic variations contribute to
phenotypic variation. This understanding will enable preventative and therapeutic manipulation of the human proteome.