How is the mean-variance plot calculated?
1
0
Entering edit mode
neverstop • 0
@neverstop-7227
Last seen 5.4 years ago

I'm trying to understand how the mean-variance plot is calculated with the meanVarPlot() of EDASeq package. I tried to reproduce it manually with the code below but the result is very different.

Given the expression matrix (row=genes, columns=samples), I used the logarithm to transform the counts before computing mean and variance. Since there are zero counts, I considered the  maximum between 0 and the log of the counts. After that I calculated the row means and the row variances (the mean and variance of each gene across all samples).

Here's the code I used:

rm(list=ls())
library(yeastRNASeq)
library(EDASeq)
data(geneLevelData)
Data=newSeqExpressionSet(as.matrix(geneLevelData))
meanVarPlot(Data,log=T)
geneLevelDataM=as.matrix(geneLevelData)

means=apply(pmax(log(geneLevelDataM),0),1,mean)
variances=apply(pmax(log(geneLevelDataM),0),1,var)
plot(means,variances)

means=apply(pmax(log2(geneLevelDataM),0),1,mean)
variances=apply(pmax(log2(geneLevelDataM),0),1,var)
plot(means,variances)

edaseq • 1.8k views
2
Entering edit mode
@wolfgang-huber-3550
Last seen 11 weeks ago
EMBL European Molecular Biology Laborat…

Have a look at the source, e.g. https://github.com/Bioconductor-mirror/EDASeq/blob/master/R/methods-SeqExpressionSet.R (search for "meanVarPlot").

Other ways for accessing the source of a Bioconductor package are: download the .tar.gz from the package homepage and unpack; via the subversion server; via the R command line, you can list each function's body (it's a bit wordy for S4 methods).

It looks like the function does something else than what its manual page says, regarding the log argument.

0
Entering edit mode

The function does indeed do something else than what the manual page says, i.e., it takes the log after computing mean and variance.

Thank you both for spotting this incongruence! I will fix it in devel.

Just to add to Wolfgang's answer re: source code, another way, within R, is to type:

getMethod("meanVarPlot", "SeqExpressionSet")