Abstract
Orofacial clefts (OFCs) of the lip and/or palate are prevalent congenital malformations with a complex genetic
etiology driven by both common and rare genetic variants. OFCs comprise three major subtypes: cleft
lip alone (CL), cleft lip with cleft palate (CLP), and cleft palate alone (CP), with genetic studies indicating both
shared and unique factors contributing to each subtype. There has been remarkable success in discovering
genetic loci associated with OFCs using genome-wide association studies (GWAS); however, the relatively weak
contribution of each individual locus toward overall disease liability has limited efforts to quantify an individual’s
genetic risk for OFC. Over the past decade, novel methods have been developed to provide better measures of
genetic liability for complex disorders by aggregating many subtle common genetic effects into a single
polygenic risk score (PRS). Application of a PRS to OFC cases would greatly aid in defining the heritable basis
of many more cases, but two fundamental challenges have limited its current use: 1) the majority of OFC data
has come from diverse populations, which confounds traditional PRS approaches; 2) assessments of PRS are
typically performed on case/control study designs and are not optimized for the familial data found in most OFC
studies. In this study we will apply innovative statistical techniques to overcome these previous limitations in
PRS generation and explore OFC genetic susceptibility in a large OFC cohort (n = 24,195; 7,896 cases)
comprising 5 distinct ethnic groups (African, Admixed American, European, East Asian, Central/South Asian)
(Aim 1). Moreover, to provide an even more robust measure of genetic liability for OFCs, we will examine the
influence of 59 OFC-related 3D facial features in our OFC cohort with the goal of understanding how these traits
may interact to increase OFC risk. Each of these analyses will consider OFCs both as a single group and
as individual subtypes examined independently. In Aim 2, we will apply sophisticated variant detection
techniques to explore the contribution of rare structural and short variants to OFCs. This will allow us to leverage
our large, aggregated OFC dataset to perform novel gene discovery by integrating rare and common genetic
signals. Finally, we will stratify the PRSs generated in Aim 1 against the rare mutations discovered in Aim 2 to
better understand how they may interact to confer OFC risk. This analysis will be further expanded by the
development of an OFC composite genetic risk score, created by integrating the OFC PRSs directly with a rare
variation risk score, to provide a more comprehensive measure of OFC genetic liability. Taken together, these
aims are poised to greatly expand our understanding of the genetic risk factors for OFC across diverse
populations and discover new genes associated with OFCs. Overall, this study will have a transformative impact
on the OFC research community with potential applications in prenatal screening, genetic counseling, and
treatments for the disorder.
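
As a minimal illustration of the PRS concept referred to above, and not the method proposed in this study, a basic PRS can be computed as a weighted sum of an individual's risk-allele dosages, with per-variant weights taken from GWAS effect-size estimates. The function name, weights, and genotypes below are hypothetical and serve only to show the arithmetic.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Return the weighted sum of risk-allele dosages for one individual.

    dosages: allele counts per variant (0, 1, or 2, or imputed values in [0, 2]).
    effect_sizes: per-allele weights, e.g. GWAS log odds ratios (hypothetical here).
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical toy data: three individuals genotyped at four variants.
toy_weights = [0.10, -0.05, 0.20, 0.15]
toy_genotypes = {
    "individual_1": [0, 1, 2, 0],
    "individual_2": [1, 1, 0, 2],
    "individual_3": [2, 0, 1, 1],
}

toy_scores = {
    name: polygenic_risk_score(dosages, toy_weights)
    for name, dosages in toy_genotypes.items()
}
```

This sketch deliberately omits the adjustments that the aims address, such as accounting for ancestry differences across populations and for relatedness in family-based data.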