Project Summary/Abstract
Drugs are approved by regulators because they are relatively safe and effective. However, there are
typically unanswered basic science questions about the detailed mechanisms of action, the impacts of
genetic and epigenetic variation, and the full range of phenotypic responses. This missing knowledge often
leads to reduced efficacy and increased adverse events. My overall scientific goal is to generate a more
complete understanding of the mechanisms of drug response and their sources of variation, in order to
enable more precise drug therapies.
The last two decades have seen an explosion of data relevant to drug response. We have abundant
data about human genetic variation and gene expression (and other omic) profiles that illuminate key cellular
pathways in disease. Advances in protein 3D structure prediction provide useful models for most proteins,
which enable proteome-wide screening for off-target drug interactions. Biobanks, electronic medical records
(EMRs), FDA adverse event databases, and medical claims data capture clinical information and environmental
exposures. These data have biases and blind spots. My lab has a track record of creating methods for analysis
of all these critical data types. We focus on computational/statistical approaches that integrate data at all
scales, thus reducing the biases within individual scales.
In 2000, we created the Pharmacogenetics Knowledgebase (PharmGKB), which curates information
about how human genetic variation influences variation in drug response. PharmGKB has high quality
information for hundreds of drugs and genes, but pharmacogenetics typically explains far less than 50% of
variation in drug response. I hypothesize that a large fraction of the remaining variation can be explained by
unknown off-targets, undiscovered pathways of drug response, genetic and epigenetic differences in
expression, and differences in environment and disease physiology. Thus, my proposed work focuses on
computational methods that use publicly available data to answer five driving questions: (1) What is the full
set of clinical responses to drugs, alone and in combination? (2) What are the molecular targets (particularly
off-targets) that are modulated by a drug? (3) What are the pathways that modulate drug response? (4) How
does genetic variation in targets/pathways lead to variation in drug response? (5) How do epigenetic differences create
variability in drug response? We will evaluate our methods with independent, held-out gold-standard data
sets (to establish quantitative statistical performance), and collaborate with experimental colleagues to
validate key novel hypotheses. We will focus on genes and pathways that are critical in drug response for
under-studied diseases and in under-studied populations.