Genome-wide modeling of transcription kinetics reveals patterns of RNA production delays

Antti Honkela, University of Helsinki
Jaakko Peltonen, Aalto University
Hande Topa, Aalto University
Iryna Charapitsa, Institute for Molecular Biology, Mainz
Filomena Matarese, Radboud University
Korbinian Grote, Genomatix Software
Hendrik G. Stunnenberg, Radboud University
George Reid, Institute for Molecular Biology, Mainz
Neil D. Lawrence, University of Sheffield
Magnus Rattray, University of Manchester

Proc. Natl. Acad. Sci. USA 112(42), pp 13115-13120, 2015

Abstract

Genes with similar transcriptional activation kinetics can display very different temporal mRNA profiles because of differences in transcription time, degradation rate, and RNA-processing kinetics. Recent studies have shown that a splicing-associated RNA production delay can be significant. To investigate this issue more generally, it is useful to develop methods applicable to genome-wide datasets. We introduce a joint model of transcriptional activation and mRNA accumulation that can be used for inference of transcription rate, RNA production delay, and degradation rate given data from high-throughput sequencing time course experiments. We combine a mechanistic differential equation model with a nonparametric statistical modeling approach allowing us to capture a broad range of activation kinetics, and we use Bayesian parameter estimation to quantify the uncertainty in estimates of the kinetic parameters. We apply the model to data from estrogen receptor α activation in the MCF-7 breast cancer cell line. We use RNA polymerase II ChIP-Seq time course data to characterize transcriptional activation and mRNA-Seq time course data to quantify mature transcripts. We find that 11% of genes with a good signal in the data display a delay of more than 20 min between completing transcription and mature mRNA production. The genes displaying these long delays are significantly more likely to be short. We also find a statistical association between high delay and late intron retention in pre-mRNA data, indicating significant splicing-associated production delays in many genes.
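The abstract describes the mechanistic part of the model only in words. As a rough illustration, the sketch below assumes a delayed linear differential equation of the form dm/dt = β·p(t − Δ) − α·m(t), where p is the transcriptional activity (for example a Pol-II signal), β a transcription rate, Δ the RNA production delay, and α the degradation rate. This equation, the function name `simulate_mrna`, the sigmoid activation profile, and all parameter values are assumptions chosen for illustration; the sketch only simulates forward in time and does not reproduce the nonparametric or Bayesian inference described above.

```python
import numpy as np


def simulate_mrna(pol2, t, beta, alpha, delay, m0=0.0):
    """Forward-Euler integration of dm/dt = beta * p(t - delay) - alpha * m(t).

    Hypothetical helper: `pol2` holds the activity p sampled at times `t`.
    For times before t[0], p is held at its first value (np.interp clamps).
    """
    m = np.empty(len(t), dtype=float)
    m[0] = m0
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        # Activity that drives mature mRNA production now was measured `delay` earlier.
        p_delayed = np.interp(t[i - 1] - delay, t, pol2)
        m[i] = m[i - 1] + dt * (beta * p_delayed - alpha * m[i - 1])
    return m


# Illustrative (made-up) inputs: time in minutes, sigmoid activation around 60 min.
t = np.linspace(0.0, 600.0, 6001)
pol2 = 1.0 / (1.0 + np.exp(-(t - 60.0) / 10.0))

# Same activation, transcription rate, and degradation rate; only the delay differs.
m_fast = simulate_mrna(pol2, t, beta=0.05, alpha=0.02, delay=0.0)
m_slow = simulate_mrna(pol2, t, beta=0.05, alpha=0.02, delay=30.0)

# Time at which each profile first reaches half of the steady state beta / alpha.
half = 0.5 * 0.05 / 0.02
t_half_fast = t[np.argmax(m_fast >= half)]
t_half_slow = t[np.argmax(m_slow >= half)]
print(f"half-maximum reached at {t_half_fast:.1f} min vs {t_half_slow:.1f} min")
```

Under these assumptions the two genes share identical activation kinetics, yet the delayed one reaches the same mRNA level later, which is the kind of difference in temporal profiles the abstract attributes to production delays.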


@Article{honkela-genome15,
  title = 	 {Genome-wide modeling of transcription kinetics reveals patterns of {RNA} production delays},
  journal =  	 {Proc. Natl. Acad. Sci. USA},
  author = 	 {Antti Honkela and Jaakko Peltonen and Hande Topa and Iryna Charapitsa and Filomena Matarese and Korbinian Grote and Hendrik G. Stunnenberg and George Reid and Neil D. Lawrence and Magnus Rattray},
  pages = 	 {13115--13120},
  year = 	 {2015},
  volume = 	 {112},
  number =       {42},
  month = 	 {10},
  url =  	 {http://inverseprobability.com/publications/honkela-genome15.html},
  pdf = 	 {http://www.pnas.org/content/112/42/13115.full.pdf},
  key = 	 {Honkela-genome15},
  doi = 	 {10.1073/pnas.1420404112},
}
%T Genome-wide modeling of transcription kinetics reveals patterns of RNA production delays
%A Antti Honkela and Jaakko Peltonen and Hande Topa and Iryna Charapitsa and Filomena Matarese and Korbinian Grote and Hendrik G. Stunnenberg and George Reid and Neil D. Lawrence and Magnus Rattray
%D 2015
%F honkela-genome15
%J Proc. Natl. Acad. Sci. USA
%P 13115--13120
%R 10.1073/pnas.1420404112
%U http://inverseprobability.com/publications/honkela-genome15.html
%V 112
%N 42
TY  - JOUR
TI  - Genome-wide modeling of transcription kinetics reveals patterns of RNA production delays
AU  - Antti Honkela
AU  - Jaakko Peltonen
AU  - Hande Topa
AU  - Iryna Charapitsa
AU  - Filomena Matarese
AU  - Korbinian Grote
AU  - Hendrik G. Stunnenberg
AU  - George Reid
AU  - Neil D. Lawrence
AU  - Magnus Rattray
PY  - 2015/10/20
DA  - 2015/10/20
ID  - honkela-genome15
JO  - Proc. Natl. Acad. Sci. USA
VL  - 112
IS  - 42
SP  - 13115
EP  - 13120
DO  - 10.1073/pnas.1420404112
L1  - http://www.pnas.org/content/112/42/13115.full.pdf
UR  - http://inverseprobability.com/publications/honkela-genome15.html
ER  -

Honkela, A., Peltonen, J., Topa, H., Charapitsa, I., Matarese, F., Grote, K., Stunnenberg, H.G., Reid, G., Lawrence, N.D. & Rattray, M. (2015). Genome-wide modeling of transcription kinetics reveals patterns of RNA production delays. Proc. Natl. Acad. Sci. USA 112(42):13115-13120.