ABSTRACT
Clinical trials are often conducted under idealized and rigorously controlled conditions to ensure internal validity
(maximizing potential treatment efficacy) while balancing patient safety (e.g., serious adverse events [SAEs]).
Paradoxically, these conditions, often reflected in trials’ eligibility criteria, limit (1) the ability to identify the
“right” study populations for the trials, and (2) the trials’ generalizability to the target population in real-world
settings. Low generalizability has long been a concern, including for Alzheimer's disease (AD) trials. AD trial
participants are systematically younger than AD patients in the general population, and eligibility criteria design
issues are arguably the biggest, yet modifiable, barriers to generalizability. The FDA has launched numerous initiatives to improve
trial design and enrollment practices, such as using enrichment strategies (e.g., “use patient characteristic to
select a study population in which detection of a drug effect [or safety event] is more likely than it would be in an
unselected population”), so that the trial participants can better reflect the real-world target population and the
trials are more likely to succeed. However, there are significant gaps between the need to improve AD trial
eligibility criteria design and the ways available to fulfill that need in practice. Meanwhile, the rapid adoption of
electronic health record (EHR) systems has made large collections of real-world data (RWD), which reflect the
characteristics and outcomes of patients treated in real-world settings, available for research. The
increasing availability of RWD, combined with advancements in artificial intelligence (AI), especially
machine learning (ML), offers opportunities to generate real-world evidence (RWE) to support
eligibility criteria design for AD trials, but these opportunities remain untapped due to a number of key
methodological gaps: (1) the lack of validated
computable phenotyping (CP) and natural language processing (NLP) algorithms and tools that can
accurately define the populations (e.g., AD patients) of interest and extract key relevant patient characteristics
and outcomes of interest (e.g., trial endpoints such as MoCA and safety profiles such as SAEs) from RWD, (2)
the lack of ways to identify the desired study populations (and corresponding eligibility criteria), considering the
impact of criteria on potential treatment effectiveness, patient safety, and study generalizability, and (3) the lack
of an easy-to-use toolbox to support trialists’ eligibility criteria design process. We propose (1) to develop novel
causal-principled, explainable AI (XAI) approaches to generate RWE to facilitate AD trial eligibility criteria design, and
(2) to create the web-based ALZHEIMER'S DISEASE ELIGIBILITY EXPLAINER (ADEP) tool. We will leverage two
large RWD resources, the OneFlorida+ (~19 million patients from Florida, Georgia, and Alabama) and INSIGHT
(~12 million New Yorkers) clinical research networks (CRNs) contributing to the national Patient-Centered
Clinical Research Network (PCORnet). If successful, this project will establish (1) a novel, generalizable, and
XAI-based trial enrichment framework built on large collections of distributed RWD, and (2) a prototype toolbox that
can provide RWE to inform eligibility criteria design, balancing effectiveness and patient safety in the target population.