Osteoarthritis (OA) is highly prevalent, contributes to substantial morbidity in the population, and lacks effective
interventions to prevent onset and progression. Importantly, and like many other chronic conditions, OA is not
a single disease but rather a heterogeneous condition consisting of multiple subgroups, or phenotypes, with
differing underlying pathophysiological mechanisms. It is becoming increasingly clear that consideration of
specific OA phenotypes in clinical studies and trials is critically needed to move the field forward. The overall
goal of this line of work is to identify and understand potential phenotypes of knee osteoarthritis (KOA)
to better inform future research efforts and treatments; this exploratory R21 project using OA Initiative
(OAI) data will investigate novel methodology to support phenotyping in KOA. Successful treatments for
OA will need to be targeted to, and tested in, specifically chosen OA phenotypes. Our hypothesis is that an
understanding of KOA phenotypes, a key step toward Precision Medicine in OA, will lead to more
successful clinical studies in the long term. To address this important clinical problem, we propose a
project in which we will apply innovative machine learning methods and validation strategies to data from the
large, publicly available OAI cohort. We will leverage this large dataset, along with local expertise in statistics,
biostatistics and machine learning methodology, to tackle the problem of phenotyping this heterogeneous
disease. In Aim 1, we will use a data-driven, unsupervised learning approach to cluster features that best
define and discriminate among phenotypes of KOA in the OAI dataset, using biclustering and a novel
significance test (SigClust) developed by co-I Marron. In Aim 2, we will test specific hypotheses of relevance
to OA outcomes, such as differences between those with and without OA, or those who do or do not develop
new or worsening disease, using another set of machine learning methods (Direction-projection-permutation
[DiProPerm] hypothesis testing, and Distance-Weighted Discrimination [DWD]), also developed by co-I Marron,
in the full cohort and in any clusters identified in Aim 1. To address these aims, this proposal
involves interdisciplinary collaborations among experts in statistics, biostatistics, computer science,
rheumatology, and epidemiology. This work will significantly impact the field by fulfilling a critical need to
accurately define OA phenotypes, discover the key features associated with these phenotypes, link phenotype
subgroups to underlying mechanisms, and use this information to inform and focus future clinical studies. In the
long term, we expect that this strategy will lead to more personalized and successful management of the
millions of people affected by OA.