Friday, August 15, 2025

Extending the utility and performance of variant effect predictors with protein language models

Award Number: R01HG013524
ORGANIZATION: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: SCIENTIFIC/HEALTH RESEARCH (INCLUDES SURVEYS)
PERIOD OF PERFORMANCE START DATE: 01/06/2025
PERIOD OF PERFORMANCE END DATE: 12/31/2029


Project Summary

Variant effect prediction (VEP), the process of determining the impact of amino acid alterations in a protein sequence, remains a fundamental challenge across both clinical and research domains. Despite the extensive application of existing VEP methods, their overall impact is limited, and most variants are still labeled “variants of unknown significance”. This research project aims to overcome these limitations by harnessing protein language models (PLMs), which have already shown widespread success in other fields, and by integrating the complementary sources of information employed by current methodologies, in order to improve the understanding and prediction of the functional impact of genetic variants on proteins and complex traits. The specific aims are: 1) Enhancing the core functionality of VEP models by providing robust estimates of score uncertainty and experimentally validating whole-haplotype effect scores, including predictions of epistatic interactions. 2) Improving VEP model performance by integrating PLMs with external information such as 3D structural and homology data, and by fine-tuning them on functional assays and clinical databases. 3) Improving the discovery and clinical interpretation of functional protein-altering variants by optimally utilizing computational annotations and analyzing whole-haplotype data in the context of gene-trait associations and clinical settings. This research project builds on our strong preliminary data on PLM-based variant effect prediction, which has demonstrated best-in-class performance by multiple metrics. By leveraging PLMs and a variety of external data, this project aspires to advance the field of variant effect prediction, enabling a deeper understanding of genetic alterations and improving diagnostic and prognostic medical exome sequencing.


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity COUNTY	Legal Entity COUNTRY	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2025 ( Subtotal = $687,063 )
2025	2025	REGENTS OF THE UNIVERSITY OF CALIFORNIA, SAN FRANCISCO, THE	1855 FOLSOM ST STE 425	SAN FRANCISCO	CA	94103	SAN FRANCISCO	USA	Human Genome Research	000	1	1/6/2025	NEW	$618,356
2025	2025	REGENTS OF THE UNIVERSITY OF CALIFORNIA, SAN FRANCISCO, THE	1855 FOLSOM ST STE 425	SAN FRANCISCO	CA	94103	SAN FRANCISCO	USA	Human Genome Research	001	1	4/25/2025	NEW	$68,707
														Subtotal = $687,063

Grand Total All Awards = $687,063

