High-resolution characterization of human leukocyte antigen genes in diverse populations to study the genetics of food allergy - Food allergies are common, costly, and potentially life-threatening. Epidemiologic studies suggest prevalence is growing, particularly among minority populations. Twin studies estimate that food allergies are highly heritable (>80%), underscoring the importance of genetic causes. To date, there have been only a few genome-wide association studies of food allergy and none with African Americans or Latinos in the discovery set. Moreover, the phenotypes used in these studies have been inconsistent. Despite these issues, there have been some consistent findings, such as repeated associations with human leukocyte antigen (HLA) genes. This is not altogether surprising given the role that HLA proteins play in presenting antigens to effector cells, resulting in either tolerance or sensitization. However, our ability to identify causal variants in HLA genes is hampered by the structural complexity of the major histocompatibility complex (MHC) region (i.e., an exceptionally high degree of polymorphism, numerous pseudogenes, and long-range linkage disequilibrium). In this application, we take a multifaceted approach to better understanding food sensitization, food allergy disparities, and the underlying risk factors. We have assembled three large, diverse study cohorts (combined n=12,882): the Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE); the Study of African Americans, Asthma, Genes & Environment (SAGE); and the Genes-Environments & Admixture in Latino Americans (GALA II) study. These multi-ethnic cohorts have a wealth of existing socio-environmental exposure data and high-depth short-read whole-genome sequencing (WGS) data. In Aim 1, we will use IgE arrays to broadly characterize allergic sensitization to 94 different foods and 77 food allergen components.
These data will allow us to identify differences in food sensitization between population groups (i.e., African Americans, Latinos [Mexicans and Puerto Ricans], and European Americans) and to assess the relationship between socio-environmental exposures (e.g., air pollution, tobacco smoke, neighborhood characteristics, and perinatal events) and food allergen sensitization. In Aim 2, we will leverage the large and diverse study population, existing WGS data, and IgE sensitization data from Aim 1 to search for genetic variants associated with any food sensitization and with sensitization to specific common food allergens (e.g., peanut, seafood, tree nut, dairy, and egg). Associations will be replicated in a large pediatric cohort from the Children’s Hospital of Philadelphia. To overcome the aforementioned challenges of the MHC region, in Aim 3 we propose a novel approach that combines the high fidelity of short-read DNA sequencing with recent advances in ultra-long-read DNA sequencing. The resulting de novo, high-resolution assemblies of the MHC region will be used to look for HLA variants associated with food sensitization in a large sample of African American participants from SAPPHIRE (n=1,860) and SAGE (n=849). These data will provide an unprecedented look at the relationship between HLA variation and both seafood and peanut sensitization.