Thursday, March 19, 2026

Robust and cost-effective computational methods for haplotype-resolved genome assemblies

Award Number: K99HG012798
ORGANIZATION: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: TRAINING/TRAINEESHIP
PERIOD OF PERFORMANCE START DATE: 02/13/2023
PERIOD OF PERFORMANCE END DATE: 07/01/2024

Abstract

Background: De novo haplotype-resolved genome assembly not only plays a critical role in the study of novel species, but is also the most comprehensive way to discover structural variants and understand repeat-rich regions of the human genome. Moreover, haplotype-resolved assemblies are the fundamental infrastructure for various pangenome references. Recent advances in accurate long-read sequencing technologies make it possible to faithfully build high-quality haplotype-resolved assemblies, but most assembly algorithms cannot take full advantage of the emerging accurate long-read data. To this end, I have developed a graph-based haplotype-resolved genome assembly algorithm, called hifiasm, which combines accurate long reads with additional data that provide long-range phasing information. Hifiasm has been widely used by multiple large-scale sequencing projects, such as the Human Pangenome Reference Consortium (HPRC), the Genome in a Bottle (GIAB), the Vertebrate Genomes Project (VGP), and the Darwin Tree of Life project. Building on hifiasm, this proposal focuses on developing a set of new haplotype-resolved assembly algorithms to further improve assembly quality for complex regions and genomes, and to substantially reduce assembly cost.

Research: My first aim is to develop a hybrid algorithm to produce high-quality haplotype-resolved assemblies for diploid genomes, focusing especially on resolving highly repetitive regions such as centromeres. The proposed algorithm will combine the advantages of length and accuracy from different types of long-read data to automatically reconstruct the last unexplored repeat-rich regions of the genome. In the second aim, I will develop a haplotype-aware scaffolding algorithm to achieve chromosome-level haplotype-resolved assemblies for diploid genomes. In the third aim, I will propose different strategies to reduce the sequencing cost and the computational cost of haplotype-resolved assembly, making it feasible for population-scale studies. I will also develop assembly algorithms to resolve complex genomes that contain more than two haplotypes. Upon completion, the proposed studies will offer efficient assembly tools for large-scale sequencing projects and will pave the way to personal genome assembly for genomic research and clinical applications.

Career development and training: My long-term career goal is to lead an independent research group focusing on developing novel computational methods for haplotype-resolved assemblies and their applications. In addition to further enhancing my training in computational method development with my mentor, Dr. Heng Li, I will obtain systematic training in biomedical research from the advisory committee (Dr. Erich D. Jarvis and Dr. Scott V. Edwards for human and non-human genomes, Dr. Evan E. Eichler and Dr. Karen H. Miga for repeats and structural variations, and Dr. Matthew Meyerson for complex genomes that contain more than two haplotypes). Training in career development, including laboratory management, grant writing, and leadership, will also be carried out during the K99 phase. My experience in computational method development, especially in genome assembly, together with the rigorous support of my mentoring and advisory team, puts me in a unique position to establish an independent lab studying haplotype-resolved genome assembly algorithms.


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity COUNTY	Legal Entity COUNTRY	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2025 ( Subtotal = $0 )
2025	2024	DANA-FARBER CANCER INSTITUTE, INC.	450 BROOKLINE AVE	BOSTON	MA	02215	SUFFOLK	USA	Human Genome Research	000	2	11/26/2024	NON-COMPETING CONTINUATION	$0
														Subtotal = $0

Issue Date FY: 2024 ( Subtotal = $42,888 )
2024	2024	DANA-FARBER CANCER INSTITUTE, INC.	450 BROOKLINE AVE	BOSTON	MA	02215	SUFFOLK	USA	Human Genome Research	000	2	1/17/2024	NON-COMPETING CONTINUATION	$122,262
2024	2024	DANA-FARBER CANCER INSTITUTE, INC.	450 BROOKLINE AVE	BOSTON	MA	02215	SUFFOLK	USA	Human Genome Research	001	2	9/23/2024	NON-COMPETING CONTINUATION	-$79,374
														Subtotal = $42,888

Issue Date FY: 2023 ( Subtotal = $122,262 )
2023	2023	DANA-FARBER CANCER INSTITUTE, INC.	450 BROOKLINE AVE	BOSTON	MA	02115	SUFFOLK	USA	Human Genome Research	000	1	2/13/2023	NEW	$122,262
														Subtotal = $122,262

Grand Total All Awards = $165,150
