Saturday, November 15, 2025

Learn, transfer, generate: Developing novel deep learning models for enhancing robustness and accuracy of small-scale single-cell RNA sequencing studies

Award Number: F31HG012718
ORGANIZATION: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: FELLOWSHIP/SCHOLARSHIP/STUDENT LOANS
PERIOD OF PERFORMANCE START DATE: 01/01/2023
PERIOD OF PERFORMANCE END DATE: 12/31/2023


Project Summary. Single-cell RNA-sequencing (scRNAseq) technologies measure transcriptome-wide gene expression at the single-cell level. In contrast to bulk RNA-sequencing, scRNAseq can elucidate dynamic expression patterns between different cellular populations. A key problem in scRNAseq studies is the inability to transfer knowledge directly between independent sequencing studies. As a result, researchers have had to spend significant time and resources generating massive datasets to enable meaningful analyses, a process that is costly and often not reproducible. Another transformative technology is spatial transcriptomics (ST), which provides genetic profiles of cells while retaining the positional information of each sequenced cell. ST has the potential to expand our understanding of cellular heterogeneity, interactions, and pathology; however, ST is still an emerging technology and is not widely available for many studies. This proposal will fulfill the unmet need for scalable algorithms that transfer knowledge from existing datasets to new studies, leveraging learned representations to construct the sequenced tissue's spatial information. I propose to achieve these goals through the following aims: (1) transfer knowledge from existing public single-cell data to new experimental data using a deep neural-attention network, and (2) develop the first spatially-informed model for generating realistic scRNAseq data. In Aim 1, I will use attention mechanisms (which have revolutionized many fields in computer science) to learn complex gene dependencies and important biological features (e.g., marker genes) in a fully self-supervised manner, providing biological interpretability that is desperately needed.
Such a model can be used in many tasks and for datasets with relatively few samples. The learned knowledge obtained from Aim 1 will be used directly in Aim 2. In Aim 2, I will build upon our state-of-the-art generative model to generate synthetic data that contains spatial information (coordinates) of sequenced cells, even when no atlas is available. This model will allow researchers to produce synthetic data with spatial information and augment sparse and noisy datasets for more robust and accurate analyses, all possible without the need for additional costly experiments. This proposal will support my dissertation research, which will be the foundational body of work for my career as a researcher in computational genomics. During the tenure of this award, I will receive specialized training in the underlying mathematics and biology needed for developing frameworks for scRNAseq analysis. I will contribute to the existing literature by developing novel methodology and creating open-source software, making our tools and models easily accessible to the broader scientific community. Achieving the proposed aims will significantly enhance scRNAseq pipelines and analysis, making them more robust and accurate. This will additionally facilitate the study of smaller datasets, potentially reducing the number of patients and animals necessary in initial studies.


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity COUNTY	Legal Entity COUNTRY	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2025 ( Subtotal = $0 )
2025	2022	UNIVERSITY OF CALIFORNIA, MERCED	5200 N LAKE RD	MERCED	CA	95343	MERCED	USA	Human Genome Research	000	1	10/23/2024	NEW	$0
														Subtotal = $0

Issue Date FY: 2024 ( Subtotal = $0 )
2024	2022	UNIVERSITY OF CALIFORNIA, MERCED	5200 N LAKE RD	MERCED	CA	95343	MERCED	USA	Human Genome Research	000	1	12/27/2023	NEW	$0
														Subtotal = $0

Issue Date FY: 2022 ( Subtotal = $37,617 )
2022	2022	UNIVERSITY OF CALIFORNIA MERCED	5200 N LAKE RD	MERCED	CA	95343	MERCED	USA	Human Genome Research	001	1	7/26/2022	NEW	$0
2022	2022	UNIVERSITY OF CALIFORNIA MERCED	5200 N LAKE RD	MERCED	CA	95343	MERCED	USA	Human Genome Research	000	1	7/26/2022	NEW	$37,617
														Subtotal = $37,617

Grand Total All Awards = $37,617


