Monday, February 16, 2026

Learning gene regulatory networks under latent confounding and data dependence

Award Number: R01GM163245
ORGANIZATION: NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: SCIENTIFIC/HEALTH RESEARCH (INCLUDES SURVEYS)
PERIOD OF PERFORMANCE START DATE: 08/01/2025
PERIOD OF PERFORMANCE END DATE: 04/30/2029

Gene regulatory networks (GRNs) encode the complex regulatory relations in transcription and splicing of genes. Learning GRNs from data is a problem of fundamental importance in computational biology. In this project, we formulate GRN inference as a causal discovery problem through graphical modeling, which is an active research area in statistics and data science in its own right. Leveraging large-scale RNA-seq data generated and accumulated in the literature, we will develop statistical methods to infer the structure of GRNs and identify direct causes of gene expression and alternative splicing.

The proposed methodology is motivated by two notorious difficulties in learning GRNs, namely the existence of latent confounders and potential dependence in data. We will develop a coordinated local network learning algorithm, which is robust against latent confounding and computationally efficient. By identifying the parent set of a target gene such as a transcription factor (TF), this method facilitates the identification of the regulatory effect of the TF on any other gene, without the need to learn a full network. Because of latent confounders, we propose to model a GRN by an acyclic directed mixed graph (ADMG) having both directed and bidirected edges. A bidirected edge implies that the two nodes (genes) share a common latent cause or confounder. We will develop a novel method to learn the structure of ADMGs via a hybrid approach. A large amount of single-cell RNA-seq data is generated from cells with potential dependence due to temporal or spatial association. We will develop a de-correlation approach to remove cell dependence in such single-cell data so that existing GRN learning algorithms may be applied to the de-correlated data with improved accuracy.

The proposed research will advance statistical methods for GRN inference under complex and realistic settings and will also make substantial contributions to the general methodology for structure learning of graphical models. We further propose a novel idea to model feedback loops in a GRN by a chain graph with latent variables based on the causal interpretation of its undirected edges. Standard causal discovery methods assume independent data. De-correlation of dependent data is an innovative idea and has the potential to substantially improve the performance of many existing methods. This approach holds great promise for fitting graphical models on RNA-seq data from dependent cell populations.

RELEVANCE: Understanding the underlying causality for gene regulation is an important problem in medical science and public health. The high-level goal of this project is to develop novel mathematical models and statistical methods to construct gene regulatory networks from biomedical data, motivated by several practical difficulties that available approaches face. The new methods will facilitate causal discovery in many biomedical problems, such as identification of potential causes of diseases.
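The award description does not specify how the de-correlation of dependent cells would be carried out. As a rough illustration of the general idea only, the sketch below whitens the rows (cells) of an expression matrix using a Cholesky factor of an assumed cell-by-cell covariance. The function names, the AR(1) covariance, and the premise that the covariance is known are all assumptions made for this example; in practice the covariance would have to be estimated, and the project's actual method may differ.

```python
import numpy as np

def ar1_covariance(n_cells, rho):
    """Hypothetical cell-by-cell covariance: AR(1) dependence along an ordering
    of the cells (for example, pseudotime). Entry (i, j) equals rho ** |i - j|."""
    idx = np.arange(n_cells)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def decorrelate_rows(X, sigma):
    """Whiten the rows of X (cells x genes) given the row covariance sigma.

    With sigma = L @ L.T (Cholesky), the transformed data L^{-1} @ X has
    identity row covariance, so its rows are uncorrelated.
    """
    L = np.linalg.cholesky(sigma)
    return np.linalg.solve(L, X)

# Illustrative use on simulated data (not real RNA-seq measurements).
rng = np.random.default_rng(0)
n_cells, n_genes = 50, 4
sigma = ar1_covariance(n_cells, rho=0.6)
# Simulate dependent cells: rows of X are correlated according to sigma.
X = np.linalg.cholesky(sigma) @ rng.standard_normal((n_cells, n_genes))
X_decorrelated = decorrelate_rows(X, sigma)
```

A structure learning algorithm that assumes independent samples could then be run on `X_decorrelated` instead of `X`, under the assumption that the supplied covariance describes the dependence among cells adequately.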


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity COUNTY	Legal Entity COUNTRY	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2025 ( Subtotal = $267,098 )
2025	2025	UNIVERSITY OF CALIFORNIA, LOS ANGELES	10889 WILSHIRE BLVD STE 700	LOS ANGELES	CA	90024	LOS ANGELES	USA	Biomedical Research and Research Training	000	1	8/6/2025	NEW	$267,098
														Subtotal = $267,098

Grand Total All Awards = $267,098
