Project Summary/Abstract
The multitude of candidate cancer biomarkers being discovered across various laboratories holds great
potential to enhance the practice of precision medicine. However, it is a long and challenging process –
often culminating in failure – to rigorously develop and validate these biomarkers before they can be used
in clinical practice. In particular, phase III, IV, and V biomarker validation studies are expensive and time-
consuming to conduct; it is essential to carefully design and analyze these studies and to make the most
efficient use of the specimens collected. Motivated by our collaborative work on biomarker development
for cancer early detection, this proposal seeks to develop cutting-edge statistical tools for analyzing phase
III and IV biomarker studies in order to accelerate the biomarker development process. The methods
proposed in Aim 1 target the selection of primary endpoints and inference procedures to accommodate
potential overdiagnosis when assessing screening efficacy in phase IV trials. The methods proposed in Aim
2 enable the combination of phase IV samples with phase III samples in phase III biomarker development.
The methods proposed in Aim 3 integrate information from heterogeneous study cohorts (which differ in
screening modalities and eligibility criteria) when estimating design parameters for biomarker clinical utility
trials.
Our statistical methods will have immediate application to the analysis of data from two cancer settings:
i) the New Onset Diabetes (NOD) Cohort study and the Early Detection Initiative (EDI) study for
pancreatic cancer early detection, and ii) five low-dose CT (LDCT) screening cohorts and the Prostate,
Lung, Colorectal, and Ovarian Cancer Screening (PLCO) trial for lung cancer screening. Moreover, the
developed methodology will have broader application in other phase III and IV cancer biomarker studies
and will be valuable for advancing the current priority of the NCI Early Detection Research Network (EDRN) in
designing biomarker clinical utility trials. All statistical programs and algorithms developed in this proposal
will be made freely available to the public.