PROJECT SUMMARY/ABSTRACT
Enhancers are short non-coding DNA elements which regulate the expression of genes. They are less
conserved than promoters and genes, and their emergence is an important driver of the evolution of new
phenotypic forms and functions. Mutations in non-coding regulatory sequences cause many human diseases,
including inherited Mendelian diseases and cancers. Enhancers often cluster together to confer additive
regulatory specificity to a distal gene; however, it is not known if enhancers communicate and cooperate with
other enhancers over genomic distance. The defining difference between enhancers and promoters is that one
of the two divergent transcription start sites of promoters produces a messenger RNA molecule, while
enhancers only produce non-coding transcripts which are often short and unstable. Precisely defining the
grammar of enhancer activity and cooperativity will facilitate understanding of the basic biology and disease
pathology of humans. Transposable elements are mobile DNA elements which make up ~50% of the human
genome, represent a potent source of inter- and intra-species genetic diversity, and can act as enhancers by
regulating adjacent host genes despite their mutagenic potential. By characterizing a cohort of newly evolved,
TE-derived enhancers with high sequence similarity, we will observe a broad continuum of enhancer activity
dictated by the genomic context and biochemical properties of each enhancer. LTR5HS is a subclass of the
most recently endogenized retroviral element in the human genome, HERV-K, and contains elements which
are polymorphic among humans. LTR5HS elements are transcriptionally active in the human embryo and act
as enhancers controlling the expression of 275 genes in a human embryonic carcinoma-derived cell line. We
will perturb these elements, both simultaneously and individually, and precisely define the impact of these
perturbations on nascent transcription, 3D genome architecture, and phenotype. First, we will use the CARGO
system to introduce tens of gRNAs into cells, which, in combination with the sequence similarity of LTR5HS
elements enables targeting of dCas9 fused to activating or repressing domains to most of the 697 LTR5HS
elements in the human genome. Over a time course, we will monitor nascent transcription and long-range
contacts of the elements post-perturbation, and infer the mode of activity and cooperation of subnetworks
based on the timing and magnitude of these measurements. Second, we will create a pooled deletion
library of all LTR5HS elements and measure the impact of each deletion on nascent transcription, using
single-cell technology to capture the gRNA sequence along with nascent RNA from each cell, and we will
select against gRNAs deleterious for growth, differentiation, and pluripotency phenotypes. This highly granular approach will provide
a framework for relating the biochemical properties and genomic context of enhancers to transcriptomic and
phenotypic properties.