Addressing Factual Inaccuracy and Unfaithful Reasoning of Large Language Models in Biomedicine and Healthcare

Large Language Models (LLMs) represent the latest advancement in Natural Language Processing (NLP) and Artificial Intelligence (AI), and they hold tremendous potential to revolutionize biomedical and healthcare applications. Extensive research has demonstrated the effectiveness of LLMs across a range of biomedical and health applications, from medical question answering to summarizing systematic reviews and AI-assisted disease diagnosis. However, the major barriers to applying LLMs in biomedical and health applications are factual incorrectness, where LLM-generated responses are inaccurate or incomplete, and incorrect reasoning, where LLM-generated responses lack supporting evidence, contradict existing evidence, or even rely on hallucinated evidence. Such issues pose the further risk of propagating errors, potentially leading to incorrect diagnoses or treatment recommendations.

Addressing these issues has been challenging, primarily due to three fundamental obstacles: (1) from the data perspective, LLMs may capture errors from lower-quality or unauthorized sources in general-domain pretraining data, lack access to accurate and up-to-date biomedical knowledge, and consequently generate inaccurate or outdated results; (2) from the methods perspective, there is a lack of mechanisms for fact-checking and evidence attribution throughout the lifecycle of LLMs applied to biomedical and health studies, spanning training/fine-tuning, inference, and post hoc analysis; and (3) from the accountability perspective, few studies have evaluated the effectiveness of existing approaches in downstream biomedical and health applications.

Our overall objective in this proposal is to systematically address the factuality and reasoning issues of LLMs in biomedicine and healthcare. The specific aims are: (1) from the data perspective, to establish a self-augmentation framework that teaches LLMs to automatically select and use relevant biomedical digital resources to augment their responses; (2) from the methods perspective, to develop an LLM curator by simulating the fact-checking and evidence attribution performed in biocuration via a multi-stage, multi-task instruction tuning pipeline; (3) from the methods perspective, to introduce a step-level, automated feedback-guided paradigm in which LLMs reflect on and improve their intermediate responses via fact-checking and evidence attribution; and (4) from the accountability perspective, to evaluate the proposed methods in downstream use cases.

The proposed work is expected to address the factuality and reasoning issues of LLMs, the key barrier to their use in biomedical and health domains, and to enable LLMs to generate accurate responses that advance biomedical discovery and healthcare. It is also expected to refine the current development and evaluation pipelines of LLMs in biomedical and health domains by making fact-checking and evidence attribution essential components and by providing related benchmarks, methods, and tools to facilitate implementation.
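
To make Aim 1's self-augmentation idea concrete, the following minimal Python sketch shows one possible flow: the model first selects a relevant biomedical resource, the retrieved evidence is appended to the prompt, and the final answer is generated from the augmented context. The resource registry and the call_llm and search_resource helpers are hypothetical placeholders for illustration, not the proposal's implementation.

    RESOURCES = {
        "pubmed": "literature search over biomedical abstracts",
        "clinicaltrials": "registry of interventional and observational studies",
        "gene": "gene-centric reference records",
    }

    def call_llm(prompt: str) -> str:
        """Placeholder for an LLM completion call (hosted or local model)."""
        raise NotImplementedError("plug in an actual model client here")

    def search_resource(name: str, query: str) -> list:
        """Placeholder for querying the chosen resource; returns evidence snippets."""
        raise NotImplementedError("plug in an actual retrieval client here")

    def self_augmented_answer(question: str) -> str:
        # Step 1: ask the model which resource is most relevant to the question.
        menu = "\n".join(f"- {name}: {desc}" for name, desc in RESOURCES.items())
        choice = call_llm(
            f"Question: {question}\nAvailable resources:\n{menu}\n"
            "Reply with the single most relevant resource name."
        ).strip().lower()
        # Step 2: retrieve evidence snippets from the selected resource.
        evidence = search_resource(choice, question)
        # Step 3: answer from the retrieved evidence as grounding context,
        # asking the model to cite the snippets it relied on.
        context = "\n".join(evidence)
        return call_llm(
            "Using only the evidence below, answer the question and cite the "
            f"evidence you used.\nEvidence:\n{context}\nQuestion: {question}"
        )

The key design point this sketch tries to convey is that resource selection is itself delegated to the model rather than hard-coded, which is what "teaching LLMs to select and use" resources implies.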
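
The multi-stage, multi-task instruction tuning in Aim 2 can likewise be pictured through the shape of its training records. The two records below are purely illustrative: one frames a fact-checking task and the other an evidence-attribution task; the field names, labels, and schema are assumptions, not the proposal's actual data format.

    INSTRUCTION_TUNING_EXAMPLES = [
        {
            "task": "fact_checking",
            "instruction": "Decide whether the claim is supported, refuted, or "
                           "unverifiable given the passage.",
            "input": {"claim": "...", "passage": "..."},
            "output": "supported",
        },
        {
            "task": "evidence_attribution",
            "instruction": "List the sentence IDs in the passage that support "
                           "each statement in the answer.",
            "input": {
                "answer": "...",
                "passage_sentences": {"S1": "...", "S2": "..."},
            },
            "output": {"statement_1": ["S2"]},
        },
    ]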
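
Finally, a minimal sketch of the step-level, feedback-guided paradigm in Aim 3: each intermediate reasoning step is fact-checked, and a failing step is revised using the checker's feedback before the next step is generated. The generate_step and fact_check helpers are hypothetical placeholders for a model call and an evidence-attribution checker.

    def generate_step(question, accepted_steps, feedback=None):
        """Placeholder: produce the next reasoning step, optionally revising it with feedback."""
        raise NotImplementedError("plug in an actual model call here")

    def fact_check(step):
        """Placeholder: return (is_supported, feedback) from an evidence-checking module."""
        raise NotImplementedError("plug in an actual fact-checker here")

    def answer_with_step_level_feedback(question, max_steps=5, max_retries=2):
        """Build a chain of reasoning steps, revising any step that fails fact-checking."""
        steps = []
        for _ in range(max_steps):
            step, feedback = None, None
            for _ in range(max_retries + 1):
                step = generate_step(question, steps, feedback)
                supported, feedback = fact_check(step)
                if supported:
                    break
            # Keep the best-attempted step even if it never passed, and move on.
            steps.append(step)
            if step.strip().lower().startswith("final answer"):
                break
        return steps

The intent of operating at the step level, as described in the aim, is that the checker's feedback lets the model repair a specific intermediate response rather than regenerating the whole answer after a post hoc check.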