Many current recommendations for dietary intake (DI) and physical activity (PA) to maintain optimal health and
minimize risks for chronic health conditions, such as obesity and type 2 diabetes (T2D), are based on statistical
analyses of data prone to measurement error, including those collected from self-reported questionnaires and
wearable devices. Self-reported measures based on food frequency questionnaires are often used in DI
assessments; however, they are prone to recall bias. Wearable devices enable the continuous monitoring of
PA but generate complex functional data with poorly characterized systematic errors. Although our work and that
of others have established that failure to account for measurement errors associated with scalar-valued covariates
can lead to severely biased estimates, the impacts of function-valued covariates prone to complex heteroscedastic
errors, or of mixtures of error-prone functional and scalar covariates, are not well understood. Most work on
functional data views the data as smooth, latent curves observed at discrete time points with some random
noise that is often regarded as a random process with mean zero and constant variance. Treating this noise
as homoscedastic and independent ignores potential serial correlations, yet our preliminary studies
indicate that failure to account for these serial correlations in error-prone function-valued covariates can severely
bias estimates. Additionally, while methods for classifying PA patterns using device-based PA data have
been proposed, limited work has addressed correcting for heteroscedastic measurement errors when classifying
error-prone function-valued covariates, such as device-based PA data. With the increased availability of complex,
massive, high-dimensional function- and scalar-valued biomedical data, it is critical to correct for measurement
error biases within these datasets so that they can be evaluated accurately in various regression settings.
This project will address these limitations by developing novel statistical methods that correct for
the complex mixtures of measurement errors associated with device-based PA and self-reported measures of
DI, with application to obesity and T2D research. Our primary objective is to investigate complex relationships
between covariates and health outcomes in various U.S. subpopulations by designing and applying statistical models
that correct for biases in error-prone DI and PA data. Aim 1: Identify latent groups of PA patterns based on device-based
functional curves prone to heteroscedastic measurement errors and determine the association between
identified PA patterns and T2D status, adjusting for PA biomarkers, age, sex, and race. Aim 2: Assess the impacts
of measurement error in DI and PA data on the quantile functions of FMI and BMI, adjusting for T2D status, age,
race, and sex. Aim 3: Construct generalized functional linear regression models with error-prone function- and
scalar-valued covariates to evaluate the influence of PA and DI on T2D status. This project will overcome current
analytic barriers to accurately evaluating the effects of DI and PA on obesity-related health outcomes.