ABSTRACT/PROJECT SUMMARY
The non-coding (nc) transcriptome remains an under-explored landscape for functional genomics. Recently,
~2,000 long non-coding (lnc)RNAs were identified by the Steitz lab upon exposure of human cells to stress, such
as heat, high salt and oxidative stress, while others have confirmed their induction in viral infection, cancer and
aging. Called DoGs for “Downstream of Gene” transcripts, these lncRNAs result when nascent RNA 3' ends are
not cleaved at the annotated site during RNA polymerase II transcription of a subset of protein-coding genes that we term “parent
genes”. Instead, transcription continues from 5 to 45 kbp further downstream, and DoGs are retained in the
nucleus. DoG RNAs are expressed within minutes of stress, suggesting that they are among the “first
responders” that help cells survive. In total, DoGs account for 15%–30% of all intergenic transcripts, yet they are not
even annotated in the human genome. Taken together, these features define an urgent need to determine the
sequence and function of DoG RNAs, which are central goals of this proposal. In Aim 1, we propose to sequence
individual DoGs from their 5' to 3' ends, using emerging long-read sequencing methodology established for
polyA+ and polyA- RNA in the Neugebauer lab. We will exploit physiological stresses to induce DoGs by orders
of magnitude and optimize library preparation on several platforms to achieve the appropriate sequencing
length and depth for all of the parameters we aim to quantify. The data will reveal the actual lengths, 5' and 3'
ends and the extent to which DoG RNAs are spliced, modified and polyadenylated. Importantly, we will test
our working hypotheses based on preliminary results that splicing and histone post-translational modifications
play mechanistic roles in DoG biogenesis. These findings will give us the first concrete clues regarding the
cellular machineries impinged upon by stress pathways. In Aim 2, we propose concurrent functional analyses of
DoGs that exploit our recent preliminary finding that DoG production by the mouse interferon-β gene enhances
subsequent expression of interferon-β upon exposure to poly(I:C) (a mimic of viral infection). Therefore, we will
ask whether other DoGs likewise prime expression of their parent genes upon exposure to a second stress. We
will pursue other preliminary results suggesting that DoG parent genes are associated with transcriptional
repression and that DoG production has the potential to up- or down-regulate the parent gene. We will probe
the mechanism of action of DoGs through analyses of transcription elongation and the chromatin landscape in
DoG gene regions with new and published ChIP data. Finally, determination of DoG half-lives before, during
and after stress will allow us to explore the conceptually novel possibility that DoGs are repositories for
unprocessed pre-mRNAs that are later matured to become active mRNAs during recovery from stress. The
achievement of these aims will illuminate the sequences and function(s) of an entirely new class of ncRNA, as
well as the gene regions and chromatin environments where transcriptional activity is regulated by cellular
stresses. Moreover, entirely novel lncRNA-mediated pathways of gene regulation are likely to be identified.