Model-based Method for Transcription Factor Target Identification with Limited Data

Antti Honkela; Charles Girardot; E. Hilary Gustafson; Ya-Hsin Liu; Eileen E. M. Furlong; Neil D. Lawrence; Magnus Rattray

doi:10.1073/pnas.0914285107

Proc. Natl. Acad. Sci. USA, 107(17):7793-7798, 2010.

Abstract

We present a computational method for identifying potential targets of a transcription factor (TF) using wild-type gene expression time series data. For each putative target gene we fit a simple differential equation model of transcriptional regulation, and the model likelihood serves as a score to rank targets. The expression profile of the TF is modeled as a sample from a Gaussian process prior distribution that is integrated out using a nonparametric Bayesian procedure. This results in a parsimonious model with relatively few parameters that can be applied to short time series datasets without noticeable overfitting. We assess our method using genome-wide chromatin immunoprecipitation (ChIP-chip) and loss-of-function mutant expression data for two TFs, Twist and Mef2, controlling mesoderm development in Drosophila. Lists of top-ranked genes identified by our method are significantly enriched for genes close to bound regions identified in the ChIP-chip data and for genes that are differentially expressed in loss-of-function mutants. Targets of Twist display diverse expression profiles, and in this case a model-based approach performs significantly better than scoring based on correlation with TF expression. Our approach is found to be comparable or superior to ranking based on mutant differential expression scores. Also, we show how integrating complementary wild-type spatial expression data can further improve target ranking performance.
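
The ranking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a simplified stand-in under stated assumptions. It assumes a linear ODE of the form dx/dt = S f(t) - D x(t) with x(0) = 0 (no basal rate), a squared-exponential Gaussian process prior on the TF profile f, a time-discretized ODE solution, a fixed noise variance, and a coarse grid search over S and D in place of proper parameter optimization. All function names, parameter values, and the synthetic profiles are hypothetical.

```python
import numpy as np

def rbf_kernel(t, variance=1.0, lengthscale=1.0):
    """Squared-exponential covariance of the latent TF profile f on grid t."""
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def ode_operator(t, sensitivity, decay):
    """Matrix L such that x = L @ f approximates the ODE solution
    x(t) = S * int_0^t exp(-D (t - u)) f(u) du on a uniform grid."""
    dt = t[1] - t[0]
    lag = t[:, None] - t[None, :]
    # Only past values of f (lag >= 0) influence x at a given time.
    return sensitivity * dt * np.where(lag >= 0, np.exp(-decay * np.abs(lag)), 0.0)

def gaussian_log_density(y, cov):
    """Log density of a zero-mean Gaussian N(0, cov) evaluated at y."""
    chol = np.linalg.cholesky(cov)
    alpha = np.linalg.solve(chol, y)
    return (-0.5 * alpha @ alpha
            - np.log(np.diag(chol)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))

def score_gene(t, obs_idx, tf_obs, gene_obs, noise_var=0.01,
               sensitivities=(0.5, 1.0, 2.0), decays=(0.25, 0.5, 1.0)):
    """Best joint log marginal likelihood of (TF, gene) data over a small
    grid of (S, D). Because x is linear in f, the latent f is integrated
    out analytically: (f, x) is jointly Gaussian."""
    k_ff = rbf_kernel(t)
    idx = np.concatenate([obs_idx, obs_idx + len(t)])
    y = np.concatenate([tf_obs, gene_obs])
    best = -np.inf
    for s in sensitivities:
        for d in decays:
            op = ode_operator(t, s, d)
            # Joint covariance of (f, x) on the full grid.
            joint = np.block([[k_ff, k_ff @ op.T],
                              [op @ k_ff, op @ k_ff @ op.T]])
            # Keep only observed time points and add observation noise.
            cov = joint[np.ix_(idx, idx)] + noise_var * np.eye(len(idx))
            best = max(best, gaussian_log_density(y, cov))
    return best

# Synthetic demonstration (made-up profiles, not data from the paper).
t = np.linspace(0.0, 8.0, 33)          # fine grid used to solve the ODE
obs_idx = np.arange(0, 33, 4)          # 9 observations at t = 0, 1, ..., 8
tf_profile = np.exp(-0.5 * (t - 3.0) ** 2)
target = ode_operator(t, 1.0, 0.5) @ tf_profile   # driven by the TF
non_target = 0.5 * np.cos(1.5 * t)                # unrelated to the TF

scores = {
    "target": score_gene(t, obs_idx, tf_profile[obs_idx], target[obs_idx]),
    "non_target": score_gene(t, obs_idx, tf_profile[obs_idx], non_target[obs_idx]),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this sketch each candidate gene is scored independently against the same TF profile, so the ranking is simply a sort by score. A fuller treatment would also fit the kernel hyperparameters, the noise variance, and a basal transcription rate per gene.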

Cite this Paper

BibTeX


@Article{Honkela-modelbased10,
  title    = {Model-based Method for Transcription Factor Target Identification with Limited Data},
  author   = {Honkela, Antti and Girardot, Charles and Gustafson, E. Hilary and Liu, Ya-Hsin and Furlong, Eileen E. M. and Lawrence, Neil D. and Rattray, Magnus},
  journal  = {Proc. Natl. Acad. Sci. USA},
  pages    = {7793--7798},
  year     = {2010},
  volume   = {107},
  number   = {17},
  doi      = {10.1073/pnas.0914285107},
  pdf      = {https://www.pnas.org/content/pnas/107/17/7793.full.pdf},
  url      = {/publications/honkela-modelbased10.html},
  abstract = {We present a computational method for identifying potential targets of a transcription factor (TF) using wild-type gene expression time series data. For each putative target gene we fit a simple differential equation model of transcriptional regulation, and the model likelihood serves as a score to rank targets. The expression profile of the TF is modeled as a sample from a Gaussian process prior distribution that is integrated out using a nonparametric Bayesian procedure. This results in a parsimonious model with relatively few parameters that can be applied to short time series datasets without noticeable overfitting. We assess our method using genome-wide chromatin immunoprecipitation (ChIP-chip) and loss-of-function mutant expression data for two TFs, Twist and Mef2, controlling mesoderm development in Drosophila. Lists of top-ranked genes identified by our method are significantly enriched for genes close to bound regions identified in the ChIP-chip data and for genes that are differentially expressed in loss-of-function mutants. Targets of Twist display diverse expression profiles, and in this case a model-based approach performs significantly better than scoring based on correlation with TF expression. Our approach is found to be comparable or superior to ranking based on mutant differential expression scores. Also, we show how integrating complementary wild-type spatial expression data can further improve target ranking performance.}
}

Endnote

%0 Journal Article
%T Model-based Method for Transcription Factor Target Identification with Limited Data
%A Antti Honkela
%A Charles Girardot
%A E. Hilary Gustafson
%A Ya-Hsin Liu
%A Eileen E. M. Furlong
%A Neil D. Lawrence
%A Magnus Rattray
%J Proc. Natl. Acad. Sci. USA
%D 2010
%F Honkela-modelbased10
%P 7793-7798
%R 10.1073/pnas.0914285107
%U /publications/honkela-modelbased10.html
%V 107
%N 17
%X We present a computational method for identifying potential targets of a transcription factor (TF) using wild-type gene expression time series data. For each putative target gene we fit a simple differential equation model of transcriptional regulation, and the model likelihood serves as a score to rank targets. The expression profile of the TF is modeled as a sample from a Gaussian process prior distribution that is integrated out using a nonparametric Bayesian procedure. This results in a parsimonious model with relatively few parameters that can be applied to short time series datasets without noticeable overfitting. We assess our method using genome-wide chromatin immunoprecipitation (ChIP-chip) and loss-of-function mutant expression data for two TFs, Twist and Mef2, controlling mesoderm development in Drosophila. Lists of top-ranked genes identified by our method are significantly enriched for genes close to bound regions identified in the ChIP-chip data and for genes that are differentially expressed in loss-of-function mutants. Targets of Twist display diverse expression profiles, and in this case a model-based approach performs significantly better than scoring based on correlation with TF expression. Our approach is found to be comparable or superior to ranking based on mutant differential expression scores. Also, we show how integrating complementary wild-type spatial expression data can further improve target ranking performance.

RIS


TY  - JOUR
TI  - Model-based Method for Transcription Factor Target Identification with Limited Data
AU  - Antti Honkela
AU  - Charles Girardot
AU  - E. Hilary Gustafson
AU  - Ya-Hsin Liu
AU  - Eileen E. M. Furlong
AU  - Neil D. Lawrence
AU  - Magnus Rattray
DA  - 2010/04/27
JO  - Proc. Natl. Acad. Sci. USA
ID  - Honkela-modelbased10
VL  - 107
IS  - 17
SP  - 7793
EP  - 7798
DO  - 10.1073/pnas.0914285107
L1  - https://www.pnas.org/content/pnas/107/17/7793.full.pdf
UR  - /publications/honkela-modelbased10.html
AB  - We present a computational method for identifying potential targets of a transcription factor (TF) using wild-type gene expression time series data. For each putative target gene we fit a simple differential equation model of transcriptional regulation, and the model likelihood serves as a score to rank targets. The expression profile of the TF is modeled as a sample from a Gaussian process prior distribution that is integrated out using a nonparametric Bayesian procedure. This results in a parsimonious model with relatively few parameters that can be applied to short time series datasets without noticeable overfitting. We assess our method using genome-wide chromatin immunoprecipitation (ChIP-chip) and loss-of-function mutant expression data for two TFs, Twist and Mef2, controlling mesoderm development in Drosophila. Lists of top-ranked genes identified by our method are significantly enriched for genes close to bound regions identified in the ChIP-chip data and for genes that are differentially expressed in loss-of-function mutants. Targets of Twist display diverse expression profiles, and in this case a model-based approach performs significantly better than scoring based on correlation with TF expression. Our approach is found to be comparable or superior to ranking based on mutant differential expression scores. Also, we show how integrating complementary wild-type spatial expression data can further improve target ranking performance.
ER  -

APA


Honkela, A., Girardot, C., Gustafson, E. H., Liu, Y.-H., Furlong, E. E. M., Lawrence, N. D., & Rattray, M. (2010). Model-based Method for Transcription Factor Target Identification with Limited Data. Proc. Natl. Acad. Sci. USA, 107(17), 7793-7798. doi:10.1073/pnas.0914285107. Available from /publications/honkela-modelbased10.html.