Inference of RNA Polymerase II Transcription Dynamics from Chromatin Immunoprecipitation Time Course Data
PLoS Comput Biol, 10(5), 2014.
Abstract
Gene transcription mediated by RNA polymerase II (pol-II) is a key step in gene expression.
The dynamics of pol-II moving along the transcribed region influence the rate and timing of gene
expression. In this work, we present a probabilistic model of transcription dynamics which is fitted
to pol-II occupancy time course data measured using ChIP-seq. The model can be used to estimate
transcription speed and to infer the temporal pol-II activity profile at the gene promoter. Model
parameters are estimated either by maximum likelihood estimation or by Bayesian inference
using Markov chain Monte Carlo sampling. The Bayesian approach provides confidence intervals for
parameter estimates and allows the use of priors that capture domain knowledge, e.g. the expected
range of transcription speeds, based on previous experiments. The model describes the movement of
pol-II down the gene body and can be used to identify the time of induction for transcriptionally
engaged genes. By clustering the inferred promoter activity time profiles, we are able to determine
which genes respond quickly to stimuli and group genes that share activity profiles and may therefore
be co-regulated. We apply our methodology to biological data obtained using ChIP-seq to measure
pol-II occupancy genome-wide when MCF-7 human breast cancer cells are treated with estradiol (E2).
The transcription speeds we obtain agree with those obtained previously for smaller numbers of genes,
with the advantage that our approach can be applied genome-wide. We validate the biological
significance of the pol-II promoter activity clusters by investigating cluster-specific transcription
factor binding patterns and determining canonical pathway enrichment. We find that rapidly induced
genes are enriched for both estrogen receptor alpha (ER) and FOXA1 binding in their proximal promoter
regions.
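
The idea that transcription speed can be read off from how a pol-II wave moves down the gene body can be illustrated with a small sketch. This is not the probabilistic model of the paper: it is a crude delay-based stand-in that assumes each gene-body segment has a single "arrival time" (the time at which occupancy first reaches half its peak) and fits a straight line of segment position against arrival time. The function names, the half-maximum rule, and the synthetic data (a wave moving at 2000 bp/min) are all assumptions made for illustration.

```python
def arrival_time(times, occupancy):
    """Time at which occupancy first reaches half its peak (linear interpolation)."""
    half = max(occupancy) / 2.0
    for i, y in enumerate(occupancy):
        if y >= half:
            if i == 0:
                return times[0]
            t0, t1 = times[i - 1], times[i]
            y0 = occupancy[i - 1]
            # y0 < half <= y here, so the denominator is positive
            return t0 + (half - y0) * (t1 - t0) / (y - y0)
    raise ValueError("occupancy never reaches half of its maximum")


def estimate_speed(positions, arrivals):
    """Least-squares slope of position (bp) against arrival time (min)."""
    n = len(positions)
    mean_t = sum(arrivals) / n
    mean_p = sum(positions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(arrivals, positions))
    var = sum((t - mean_t) ** 2 for t in arrivals)
    return cov / var


# Synthetic (hypothetical) data: a pol-II wave moving at 2000 bp/min.
true_speed = 2000.0                               # bp per minute, assumed
times = list(range(0, 85, 5))                     # sampling times in minutes
positions = [10000, 30000, 50000, 70000, 90000]   # segment midpoints, bp from TSS


def ramp(t, delay, rise=10.0):
    """Occupancy rises linearly from 0 to 1 over `rise` minutes after `delay`."""
    return min(1.0, max(0.0, (t - delay) / rise))


arrivals = []
for pos in positions:
    delay = pos / true_speed
    profile = [ramp(t, delay) for t in times]
    arrivals.append(arrival_time(times, profile))

speed_bp_per_min = estimate_speed(positions, arrivals)
print(f"estimated speed: {speed_bp_per_min:.0f} bp/min")
```

On noisy real data a half-maximum rule like this would be fragile, and it yields no uncertainty estimate, which is part of the motivation for fitting a full probabilistic model with maximum likelihood or MCMC as described in the abstract.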