Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects

Nicoló Fusi, Oliver Stegle, Neil D. Lawrence, 2011.

Abstract

Expression quantitative trait loci (eQTL) studies are an integral tool for investigating the genetic component of gene expression variation. A major challenge in the analysis of such studies is the presence of hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation. We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods and, in particular, finds substantially more trans regulators. Importantly, PANAMA not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.

Cite this Paper


BibTeX
@Misc{Fusi-accurate11,
  title = {Accurate modeling of confounding variation in {eQTL} studies leads to a great increase in power to detect trans-regulatory effects},
  author = {Fusi, Nicoló and Stegle, Oliver and Lawrence, Neil D.},
  year = {2011},
  doi = {10.1038/npre.2011.5995.1},
  pdf = {https://www.nature.com/articles/npre.2011.5995.1.pdf},
  url = {http://inverseprobability.com/publications/fusi-accurate11.html},
  abstract = {Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies are hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious false associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation. We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods, and finds in particular substantially more trans regulators. Importantly, PANAMA not only identified a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.}
}
Endnote
%0 Generic
%T Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects
%A Nicoló Fusi
%A Oliver Stegle
%A Neil D. Lawrence
%D 2011
%F Fusi-accurate11
%R 10.1038/npre.2011.5995.1
%U http://inverseprobability.com/publications/fusi-accurate11.html
%X Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies are hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious false associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation. We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods, and finds in particular substantially more trans regulators. Importantly, PANAMA not only identified a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.
RIS
TY  - GEN
TI  - Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects
AU  - Nicoló Fusi
AU  - Oliver Stegle
AU  - Neil D. Lawrence
DA  - 2011/06/02
ID  - Fusi-accurate11
DO  - 10.1038/npre.2011.5995.1
L1  - https://www.nature.com/articles/npre.2011.5995.1.pdf
UR  - http://inverseprobability.com/publications/fusi-accurate11.html
AB  - Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies are hidden confounding factors, such as unobserved covariates or unknown environmental influences. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious false associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, PANAMA can more accurately distinguish between true genetic association signals and confounding variation. We applied our model and compared it to existing methods on a variety of datasets and biological systems. PANAMA consistently performs better than alternative methods, and finds in particular substantially more trans regulators. Importantly, PANAMA not only identified a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies.
ER  -
APA
Fusi, N., Stegle, O., & Lawrence, N. D. (2011). Accurate modeling of confounding variation in eQTL studies leads to a great increase in power to detect trans-regulatory effects. doi:10.1038/npre.2011.5995.1. Available from http://inverseprobability.com/publications/fusi-accurate11.html.

Related Material