Wednesday, October 22, 2025

Statistical Approaches to Unlock Protein Function from Deep Mutational Scans

Award Number: R35GM160065
ORGANIZATION: NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: SCIENTIFIC/HEALTH RESEARCH (INCLUDES SURVEYS)
PERIOD OF PERFORMANCE START DATE: 08/01/2025
PERIOD OF PERFORMANCE END DATE: 07/31/2030

Project Summary/Abstract

Understanding how genetic variants impact protein function is essential for unraveling the mechanisms underlying both basic biology and disease, particularly for rare genetic variants. Of the 4.6 million missense variants found in large population studies, only about 2% have clinical interpretations. Because they are rare, these variants are exceptionally challenging to study through observational methods. Deep Mutational Scanning (DMS), however, offers a high-throughput method for testing thousands of protein variants by generating a mutant library and obtaining a phenotypic readout for each mutation in one sequencing assay. Initially focused on fitness-based readouts, DMS has expanded to include fluorescence-based methods for protein profiling, binding assays, and more. It has been crucial for studying SARS-CoV-2 proteins, BRCA1, and drug-metabolism transporters such as OCT1. Over 1,000 protein datasets are publicly available, and a recent study highlights technical advances by independently assaying over 500 additional proteins at once.

Unfortunately, the development of statistical methods to analyze and interpret data from these technologies has not kept pace. For example, DMS with fluorescence-activated cell sorting (DMS-FACS), which has been used for nearly a decade to measure protein abundance and other functional phenotypes, still lacks dedicated analysis methods. As a result, analyses are often ad hoc, and small sample sizes (typically three replicates) make standard statistical methods unsuitable. Our recent work demonstrates that naive approaches miss many real effects and lead to many false discoveries.

We propose work in three statistical areas to improve DMS analysis and interpretation: accurate sample comparisons, epistasis analysis, and causal inference. First, we will develop methods to analyze DMS-FACS data, assessing how genetic variants affect the molecular phenotype targeted by FACS and enabling precise comparisons between experimental conditions. Second, we will develop methods to improve the analysis and interpretation of genetic interactions (epistasis) within proteins, and thus ask which protein regions act in concert. Third, we will open a new area of research for DMS, aiming to identify the causal impact of variants through measured pathways, including complex traits.

In summary, we will close the analysis gap for DMS-FACS and epistasis DMS, and causally link DMS data through structural causal models. Leveraging our expertise in DMS data and small-sample statistics, we will create reliable, robust tools for common workflows while also enabling new types of analyses that improve the interpretation of DMS, epistasis, and phenotypic relationships. With strong collaborations with assay developers and DMS experts, along with a proven track record in developing tools for high-throughput sequencing in small-sample contexts, we are well positioned to lead this effort.


Issue Date FY: 2025
Funding FY: 2025
Legal Entity Name: UNIVERSITY OF CALIFORNIA, LOS ANGELES
Legal Entity Address: 10889 WILSHIRE BLVD STE 700
Legal Entity City: LOS ANGELES
Legal Entity State: CA
Legal Entity Zip Code: 90024
Legal Entity County: LOS ANGELES
Legal Entity Country: USA
Assistance Listing: Biomedical Research and Research Training
Award Code: 000
Budget Year: 1
Action Date: 7/25/2025
Action Type: NEW
Action Amount: $420,984

Subtotal for Issue Date FY 2025: $420,984
Grand Total All Awards: $420,984
