Saturday, September 13, 2025

THS: Using Twitter and Big Data Analytics to Track and Predict Health Conditions

Award Number: R15LM012275
ORGANIZATION: NATIONAL LIBRARY OF MEDICINE
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: SCIENTIFIC/HEALTH RESEARCH (INCLUDES SURVEYS)
PERIOD OF PERFORMANCE START DATE: 09/17/2015
PERIOD OF PERFORMANCE END DATE: 08/31/2025

Project Summary/Abstract

U.S. health officials are struggling to keep up with information and misinformation related to health conditions, natural disasters, and disease outbreaks affecting communities nationwide. Early warnings about such events can be found in public postings made by citizens on social networks like Twitter. However, the sheer volume of messages posted each day and the real possibility of false content make it very difficult to rely on these data for guidance, education, and decision making. Deep learning and big data solutions could tackle this problem by providing the means to collect, classify, and validate these messages, separating actionable data from noise. But deep learning models are hard to train and tune, and they require data sets with thousands of examples.

Our long-term goal is to understand how to build, deploy, and maintain an integrated and scalable platform to search social media posts and analyze their contents in search of clues about health conditions. In this project, our overall objective is to develop the technology needed to integrate search queries and deep learning models that run against social media data in order to detect conversations that offer clues about emerging topics, determine the intent of messages (e.g., opinion, advice), and find and group together individual messages that are similar in content. Our central hypothesis is that we can reduce query time for message search, increase classifier accuracy and precision for health topic detection, and simplify model training and deployment through the use of transfer learning, Generative Adversarial Networks (GANs), and similarity-search models based on neural networks.

In Specific Aim 1, we will develop supervised methods to support accurate message similarity search. We will use Siamese neural networks to compute a similarity score between tweets and rank them according to this score. In Specific Aim 2, we will implement data augmentation via GANs to improve model training time and accuracy. Our GANs will generate synthetic tweets that are realistic enough to let users produce good training data with less manual effort and still obtain well-trained models. As a proof of concept, we will harden our existing open-source THS system by adding these capabilities.

Our project is novel because THS is the first system of its kind, providing a “social data warehouse” to collect, store, integrate, index, and analyze Twitter data in an open-source platform. Its significance stems from its ability to serve as a tool that helps health officials analyze tweets, visualize data along disease and spatio-temporal attributes, and perform predictive analytics, all under one roof. This could have a significant impact on public health disease tracking and response.

UPRM is a Hispanic-serving institution, with the second-largest Hispanic-serving engineering school in the U.S. and 35% female enrollment. This project provides a unique opportunity to train students in social media analysis, big data systems, and machine learning. The success of this project could open new opportunities for UPRM researchers to participate in collaborative NIH proposals with other institutions.
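
The similarity search described in Specific Aim 1 can be pictured with a small sketch. It is not the THS implementation and not a trained Siamese network. It only shows the structural idea: one shared encoder is applied to both messages, a similarity score is computed from the two encodings, and candidates are ranked by that score. The encoder here is a fixed hashed bag-of-words stand-in for a learned neural encoder, and the names `encode`, `similarity`, `rank`, and `DIM` are hypothetical choices for illustration.

```python
import hashlib
import math
import re

DIM = 64  # embedding size; an arbitrary assumption for this sketch


def encode(text: str) -> list[float]:
    """Shared encoder: map a message to an L2-normalised vector.

    Stand-in for a trained network: each token is hashed into one of
    DIM buckets and counted. A real Siamese model would learn this mapping.
    """
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9#@']+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def similarity(a: str, b: str) -> float:
    """Cosine similarity of the two encodings; both sides use the same encoder."""
    return sum(x * y for x, y in zip(encode(a), encode(b)))


def rank(query: str, messages: list[str]) -> list[tuple[float, str]]:
    """Score every message against the query and sort by descending score."""
    scored = [(similarity(query, m), m) for m in messages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Made-up sample messages, used only to exercise the functions.
    sample = [
        "great game last night",
        "I have a fever and a bad cough",
        "fever and cough all week, staying home",
    ]
    for score, message in rank("anyone else with fever and cough?", sample):
        print(f"{score:.3f}  {message}")
```

Because the encoder is shared, the score is symmetric and a message compared with itself scores 1.0. Swapping `encode` for a learned model would leave `similarity` and `rank` unchanged, which is the property that makes this arrangement convenient for search.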


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity COUNTY	Legal Entity COUNTRY	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2024 ( Subtotal = $0 )
2024	2021	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	000	2	9/23/2024	COMPETING CONTINUATION	$0
														Subtotal = $0

Issue Date FY: 2022 ( Subtotal = -$1,924 )
2022	2015	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	000	1	10/1/2021	NEW	-$1,924
														Subtotal = -$1,924

Issue Date FY: 2021 ( Subtotal = $308,431 )
2021	2021	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	001	2	9/10/2021	COMPETING CONTINUATION	$308,431
2021	2015	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	000	1	11/24/2020	NEW	$0
														Subtotal = $308,431

Issue Date FY: 2020 ( Subtotal = $0 )
2020	2015	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	000	1	5/28/2020	NEW	$0
														Subtotal = $0

Issue Date FY: 2019 ( Subtotal = $0 )
2019	2015	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	000	1	10/24/2018	NEW	$0
2019	2015	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	001	1	8/1/2019	NEW	$0
														Subtotal = $0

Issue Date FY: 2018 ( Subtotal = $0 )
2018	2015	UNIVERSITY OF PUERTO RICO	259 BLVD ALFONSO VALDES	MAYAGUEZ	PR	00680	MAYAGUEZ	USA	Medical Library Assistance	000	1	9/1/2018	NEW	$0
														Subtotal = $0

Issue Date FY: 2016 ( Subtotal = $0 )
2016	2015	UNIVERSITY OF PR-MAYAGUEZ CAMPUS	PO BOX 9001	MAYAGUEZ	PR	006819001	MAYAGUEZ	USA	Medical Library Assistance	000	1	8/25/2016	NEW	$0
														Subtotal = $0

Issue Date FY: 2015 ( Subtotal = $312,651 )
2015	2015	UNIVERSITY OF PR-MAYAGUEZ CAMPUS	PO BOX 9001	MAYAGUEZ	PR			USA	Medical Library Assistance	000	1	9/17/2015	NEW	$312,651
														Subtotal = $312,651

Grand Total All Awards = $619,158
