Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects

Nicoló Fusi, Microsoft Research, New England
Oliver Stegle, European Bioinformatics Institute
Neil D. Lawrence, University of Sheffield

Abstract

Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies is hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals.

Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation.

We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods and, in particular, finds substantially more trans regulators. Importantly, PANAMA not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.


@TechReport{fusi-accurate11,
  title = 	 {Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects},
  author = 	 {Nicoló Fusi and Oliver Stegle and Neil D. Lawrence},
  year = 	 {2011},
  institution = 	 {Nature Precedings},
  url =  	 {http://inverseprobability.com/publications/fusi-accurate11.html},
  abstract = 	 {Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies is hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals.

Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation.

We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods and, in particular, finds substantially more trans regulators. Importantly, PANAMA not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.},
  key = 	 {Fusi:accurate11},
  doi = 	 {10101/npre.2011.5995.1},
}
%T Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects
%A Nicoló Fusi and Oliver Stegle and Neil D. Lawrence
%D 2011
%F fusi-accurate11
%R 10101/npre.2011.5995.1
%U http://inverseprobability.com/publications/fusi-accurate11.html
%X Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies is hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation. We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods and, in particular, finds substantially more trans regulators. Importantly, PANAMA not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.
TY  - RPRT
TI  - Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects
AU  - Nicoló Fusi
AU  - Oliver Stegle
AU  - Neil D. Lawrence
PY  - 2011/01/01
DA  - 2011/01/01
ID  - fusi-accurate11
DO  - 10101/npre.2011.5995.1
UR  - http://inverseprobability.com/publications/fusi-accurate11.html
AB  - Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies is hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation. We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods and, in particular, finds substantially more trans regulators. Importantly, PANAMA not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.
ER  -

Fusi, N., Stegle, O. & Lawrence, N.D. (2011). Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects. Nature Precedings.