Software for Geostatistical Smoothing and Joinpoint Regression Modeling of Time Series of Compositional Variables in Epidemiology - 7. Project Summary/Abstract
Joinpoint regression, developed by the NCI Surveillance Research Program, is increasingly used to identify the
timing and extent of changes in time series of health outcomes and to project future disease burden. Many
analyses of population data (e.g., cancer stage at diagnosis, causes of death, and patterns of health behavior),
including those often used in joinpoint regression, are based on percentages or proportions, as the focus is
typically on relative, not absolute frequencies. Such time series of compositional variables need to be modeled
simultaneously to guarantee the coherence of the individual temporal trends; that is, the predicted percentages
sum to 100% at each time step. Analyzing temporal trends outside a spatial framework is also unsatisfactory
because substantial geographic variation, even within a single state, is not accounted for. Modeling multivariate time series
in a spatial framework is a significant theoretical, methodological and computational challenge that is not
tackled by the NCI software. This research will address this need using spatial compositional data analysis
(CoDa), whereby geostatistical noise filtering and time trend modeling are conducted on log transforms of ratios.
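
As a rough illustration of why modeling log-ratios keeps the trends coherent, the sketch below applies the additive log-ratio (alr) transform, fits a trend to each log-ratio, and back-transforms the predictions. It is only a sketch under stated assumptions: the yearly shares are made-up numbers, the function names are hypothetical, and a plain straight-line fit stands in for joinpoint regression. It is not the method implemented in the NCI or BioMedware software.

```python
import numpy as np

def alr(p):
    """Additive log-ratio transform: log of each part over the last part."""
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])

def alr_inverse(z):
    """Map log-ratios back to proportions that sum to 1."""
    z = np.asarray(z, dtype=float)
    # Append a zero log-ratio for the reference (last) part.
    full = np.concatenate([z, np.zeros(z.shape[:-1] + (1,))], axis=-1)
    expz = np.exp(full)
    return expz / expz.sum(axis=-1, keepdims=True)

# Made-up yearly shares of three categories (e.g., stage at diagnosis).
shares = np.array([
    [0.50, 0.30, 0.20],
    [0.55, 0.28, 0.17],
    [0.60, 0.25, 0.15],
    [0.62, 0.24, 0.14],
])
years = np.arange(shares.shape[0])

# Fit one straight-line trend per log-ratio (a simplification; a joinpoint
# model would allow the slope to change at estimated time points).
z = alr(shares)
coef = np.polyfit(years, z, deg=1)   # row 0: slopes, row 1: intercepts

# Predict two future time steps and back-transform to proportions.
future = np.array([4, 5])
z_pred = np.outer(future, coef[0]) + coef[1]
p_pred = alr_inverse(z_pred)

print(p_pred)
print(p_pred.sum(axis=1))  # each row sums to 1 by construction
```

Because the back-transform normalizes by construction, the predicted shares are positive and sum to 100% at every time step, which separate regressions on each raw percentage would not guarantee.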
This SBIR project is developing the first commercial software to offer tools for spatial geostatistical noise-
filtering and joinpoint regression analysis of time series of compositional variables in epidemiology. The
research product will be a stand-alone desktop space-time (ST) analysis and visualization tool, building on the
legacy core software developed by BioMedware. These tools will be suited for the analysis of data outside
health sciences, such as in geochemistry, economics, or soil science, significantly broadening the commercial
market for the end product. This project will accomplish three aims:
Aim 1. Develop a simulation-based methodology to propagate the uncertainty caused by the small number
problem through the computation of the main types of log-ratio transforms available in the CoDa literature
and compare the robustness of subsequent analyses with respect to this noise. The modeling of temporal
trends by joinpoint regression will be adapted to the compositional nature of the data. These are all novel
approaches that are currently not available in the statistical literature.
Aim 2. Develop and test a prototype module that will implement the novel methods (propagation of uncertainty,
modeling of multivariate temporal trends) developed under Aim 1 in BioMedware’s space-time
visualization and analysis technology (Vesta software).
Aim 3. Conduct a usability and user experience study and identify additional methods and tools for Phase II work,
including the first CoDa advisor to guide the user through the selection and interpretation of appropriate
data representations based on the type of data (i.e., continuous vs. count data, rounded vs. true zeros).
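
The idea of propagating small-number uncertainty through a log-ratio transform can be sketched with a simple simulation. The example below is only a minimal sketch under stated assumptions: counts are resampled from a multinomial distribution (a parametric bootstrap chosen here for illustration), a pseudo-count of 0.5 is added to avoid taking the log of zero, and the centered log-ratio (clr) is used as one example transform. The function names and the counts are hypothetical and are not taken from the project.

```python
import numpy as np

def clr(p):
    """Centered log-ratio transform: log parts minus their mean log."""
    logp = np.log(p)
    return logp - logp.mean(axis=-1, keepdims=True)

def simulate_clr(counts, n_sim=2000, pseudo=0.5, seed=0):
    """Simulate clr values under multinomial sampling noise.

    Assumptions (illustrative only): the observed proportions are treated
    as the true ones, and `pseudo` is added to every simulated count so
    that zero counts do not produce log(0).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n = int(counts.sum())
    p_hat = counts / n
    sims = rng.multinomial(n, p_hat, size=n_sim) + pseudo
    props = sims / sims.sum(axis=1, keepdims=True)
    return clr(props)

# Same observed proportions (60/30/10%), very different population sizes.
small = simulate_clr([6, 3, 1])
large = simulate_clr([600, 300, 100])

print("clr std, n=10:  ", small.std(axis=0))
print("clr std, n=1000:", large.std(axis=0))
```

The spread of the simulated clr values is much wider for the small population, which is the kind of noise a subsequent trend analysis would need to account for.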
These technological, scientific, and commercial innovations will enhance our ability to incorporate compositional
data into any epidemiological problem with an underlying spatial or temporal reference.