# Probe-level Measurement Error Improves Accuracy in Detecting Differential Gene Expression

Xuejun Liu, Nanjing University of Aeronautics and Astronautics
Marta Milo, University of Sheffield
Neil D. Lawrence, University of Sheffield
Magnus Rattray, University of Manchester

Bioinformatics 22, pp 2107-2113

### Errata

• The error bars in Figure 2 were incorrectly calculated: the posterior variance was used instead of the posterior standard deviation to compute the credibility intervals. The ranking was correct in both plots. The revised figure is available at <http://www.bioinf.manchester.ac.uk/resources/puma/intervals.pdf>. There are now 7 false positives in the top 50 genes ranked by differential expression in the left-hand plot. None of the main quantitative results in the paper (i.e. PPLR values, ROC plots, AUC scores) are affected. The main point is that ranking by differential expression alone leads to many false positives, while using the PPLR criterion greatly reduces the number of false positives. This conclusion remains valid.
Thanks to: Richard Pearson
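
To illustrate the kind of error the erratum describes, here is a minimal sketch that assumes a Gaussian posterior for a single expression level. The function names and numbers are hypothetical and are not taken from the paper or its software; the only point is that the half-width of a credibility interval scales with the posterior standard deviation, not the posterior variance.

```python
from statistics import NormalDist


def credible_interval(mean, variance, level=0.95):
    """Central credibility interval for an assumed Gaussian posterior N(mean, variance)."""
    sd = variance ** 0.5  # the half-width scales with the standard deviation
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * sd, mean + z * sd


def mistaken_interval(mean, variance, level=0.95):
    """The slip described in the erratum: the variance is used where the sd belongs."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * variance, mean + z * variance


# Hypothetical posterior for one gene: mean 5.0, variance 0.04 (sd 0.2).
correct = credible_interval(5.0, 0.04)
wrong = mistaken_interval(5.0, 0.04)
print("correct interval:", correct)
print("mistaken interval:", wrong)
```

Whenever the posterior variance is below 1, the mistaken interval is narrower than the correct one, and it is wider when the variance exceeds 1. The posterior means are unchanged, which is consistent with the erratum's note that the ranking was unaffected.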

#### Abstract

**Motivation:** Finding differentially expressed genes is a fundamental objective of a microarray experiment. Numerous methods have been proposed to perform this task. Existing methods are based on point estimates of gene expression level obtained from each microarray experiment. This approach discards potentially useful information about measurement error that can be obtained from an appropriate probe-level analysis. Probabilistic probe-level models can be used to measure gene expression and also provide a level of uncertainty in this measurement. This probe-level variance provides useful information which can help in the identification of differentially expressed genes.

**Results:** We propose a Bayesian method to include probe-level variances in the detection of differentially expressed genes from replicated experiments. A variational approximation is used for efficient parameter estimation. We compare this approximation with MAP and MCMC parameter estimation in terms of computational efficiency and accuracy. The method is used to calculate the probability of positive log-ratio (PPLR) of expression levels between conditions. Using the measurements from a recently developed Affymetrix probe-level model, multi-mgMOS, we test PPLR on a spike-in data set and a mouse time-course data set. Results show that the inclusion of probe-level measurement error improves accuracy in detecting differential gene expression.

**Availability:** The methods described in this paper have been implemented in an R package *pplr* that is currently available from http://umber.sbs.man.ac.uk/resources/puma.

**Contact:** Magnus Rattray
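
As a rough illustration of the PPLR idea, the sketch below assumes that the posterior over each condition's mean log expression level is an independent Gaussian, so the probability that the log-ratio is positive has a closed form. This is a simplification for illustration only: it is not the variational procedure from the paper and not the API of the *pplr* R package, and the gene names and numbers are invented.

```python
from statistics import NormalDist


def pplr(mean_a, sd_a, mean_b, sd_b):
    """P(mu_a - mu_b > 0), assuming independent Gaussian posteriors for mu_a and mu_b."""
    diff_sd = (sd_a ** 2 + sd_b ** 2) ** 0.5  # sd of the difference of the two means
    return NormalDist().cdf((mean_a - mean_b) / diff_sd)


# Hypothetical genes: (mean_a, sd_a, mean_b, sd_b) on the log scale.
genes = {
    "gene_precise": (8.0, 0.1, 7.5, 0.1),  # small log-ratio, measured precisely
    "gene_noisy": (4.0, 1.5, 2.0, 1.5),    # large log-ratio, measured noisily
}

# Ranking by the point estimate of the log-ratio ignores measurement error.
by_log_ratio = sorted(genes, key=lambda g: genes[g][0] - genes[g][2], reverse=True)

# Ranking by how far PPLR is from 0.5 accounts for the uncertainty.
by_pplr = sorted(genes, key=lambda g: abs(pplr(*genes[g]) - 0.5), reverse=True)

print("by log-ratio:", by_log_ratio)
print("by PPLR:", by_pplr)
```

In this toy setting the noisy gene ranks first by log-ratio alone, while the precisely measured gene ranks first by PPLR. Values of PPLR near 1 suggest up-regulation in the first condition, values near 0 suggest down-regulation, and values near 0.5 indicate no clear evidence either way.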

    @Article{liu-variances06,
      title = {Probe-level Measurement Error Improves Accuracy in Detecting Differential Gene Expression},
      author = {Xuejun Liu and Marta Milo and Neil D. Lawrence and Magnus Rattray},
      journal = {Bioinformatics},
      year = {2006},
      volume = {22},
      number = {17},
      pages = {2107--2113},
      doi = {10.1093/bioinformatics/btl361},
      url = {http://inverseprobability.com/publications/liu-variances06.html},
      linkpdf = {http://bioinformatics.oxfordjournals.org/cgi/reprint/btl361v1.pdf},
      group = {shefml,puma,gene networks}
    }

Liu, X., Milo, M., Lawrence, N.D. & Rattray, M. (2006). Probe-level Measurement Error Improves Accuracy in Detecting Differential Gene Expression. Bioinformatics 22(17):2107-2113.