Project Summary/Abstract
Randomized controlled trials (RCTs) are a cornerstone of evidence-based medicine and are placed high in the
“evidence pyramid”. When rigorously designed, conducted, and reported, they provide the most robust evidence
on the effectiveness of therapeutic interventions. However, they commonly suffer from various types of bias (e.g.,
selection bias, attrition bias) in study design and execution. In reporting, key methodological characteristics such
as randomization and blinding are often omitted, making it difficult to assess the validity and applicability of trial
findings. Adherence to reporting guidelines can improve transparency and completeness of reporting for
biomedical studies. The SPIRIT and CONSORT guidelines help authors report RCT protocols and results,
respectively. Although these guidelines are endorsed by many high-impact medical journals, adherence to them
remains suboptimal, possibly because journals lack methods for enforcement and verification, both of which
require substantial journal staff or editorial time. Furthermore, transparent reporting does not
guarantee methodological rigor. We hypothesize that natural language processing (NLP) methods underpinned
by SPIRIT/CONSORT guidelines as well as terminological and ontological resources for clinical research can
(a) improve compliance by locating key study characteristics in RCT reports and flagging their absence, and (b)
support automated rigor assessment and large-scale methodological research by extracting granular machine-
readable methodological information from RCT reports. To achieve these goals, we specifically aim to:
Aim 1. Create text classification models for assessing transparency and completeness of RCT reports consistent
with SPIRIT and CONSORT guidelines.
Aim 2. Develop information extraction methods to identify methodological characteristics in RCT reports.
Aim 3. Build a web-based compliance tool that generates reports on transparency and guideline adherence of
RCT reports.
Aim 4. Generate structured transparency reports from published RCT literature for analysis of methodology and
reporting quality.
The proposed research will develop a set of models, resources, and tools that will assist stakeholders of clinical
research in maintaining high reporting standards, synthesizing evidence, and promoting open science practices.
They will contribute to improvements throughout the scientific ecosystem, leading to better clinical care and
health policy.