Project Summary
B-progenitor acute lymphoblastic leukemia (B-ALL) remains a leading cause of childhood cancer death. With
advances in RNA sequencing (RNA-seq) technology, many recurrent chimeric genes have been identified, which
has led to refined classification of B-ALL and tailored therapies. Still, around 10-30% of B-ALL cases cannot be
classified into the established subtypes and are termed “B-other”. These patients receive general chemotherapy,
and the outcome for many is poor. This study will apply integrative genomic data analysis to identify
novel B-ALL subtypes, with a focus on B-other cases. Building on experience and skills from prior work, I will analyze
RNA-seq data from over 2,000 childhood and adult ALL cases and define novel subtypes based on distinct gene
expression profiles and shared genetic alterations. Cases lacking driver lesions detectable by RNA-seq will undergo
whole-genome sequencing (WGS) to identify other genetic alterations. Remaining unclassified cases
with genetic alterations in non-coding regions will be studied using functional genomic data (ChIP-seq and
ATAC-seq) to provide mechanistic annotation. Furthermore, functional experiments will be performed to explore the
role of the newly identified subtype-defining genetic alterations. In a pilot study, I analyzed 1,988 RNA-seq
samples and defined 23 distinct B-ALL subtypes, 8 of which are novel. In addition to subtypes defined by
gene rearrangements, I observed that point mutations in key transcription factors can play a potent role in
defining novel subtypes, including PAX5 P80R (n=44) and IKZF1 N159Y (n=8). In this proposal, I will expand
the sample size and interrogate the remaining B-other cases with WGS to define the residual novel subtypes. Through
this study, I will provide a definitive set of B-ALL subtypes and maximize the potential for defining new ones among
B-other cases. As an exemplar of a subtype defined by a single point mutation, PAX5 P80R will be thoroughly studied in this
proposal. Specifically, I will perform ChIP-seq for PAX5 and key activating and repressive chromatin marks to
study PAX5 P80R-specific binding sites, coupled with chromatin accessibility information from ATAC-seq.
Using a CRISPR/Cas9 knock-in Pax5 P80R mouse model, I will apply single-cell sequencing to preleukemic and
leukemic B cells to elucidate the correlation between genetic alterations and deregulated genes at the cellular level.
Moreover, MEGF10 (Multiple Epidermal Growth Factor-Like Domains Protein 10), which is markedly overexpressed
in the PAX5 P80R group, will be studied in in vitro and ex vivo models to test its role in cellular localization
and leukemogenesis. Knockdown or knockout of MEGF10 by RNAi or CRISPR will be applied in human P80R
xenografts to test whether MEGF10 is a potential target for tailored therapy. The mentored phase of this proposal
will take place at St. Jude Children’s Research Hospital under the mentorship of Dr. Charles Mullighan and will
complete the aim of characterizing novel B-ALL subtypes. The independent phase will focus on functional studies of
PAX5 P80R or other under-studied subtypes. The institutional resources, academic environment, and planned courses
outlined in my proposal will support my successful transition to independence.