Hi!
This is a question regarding the limma/voom workflow for analyzing RNA-seq datasets.
According to the workflow described in "RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR", MDS plotting is performed on the DGEList object after it has been passed through the filterByExpr and calcNormFactors functions. It is performed before executing the voom function.
If I understood correctly, voom is removing heteroscedasticity from the count data.
In the DESeq2 vignette ("RNA-seq workflow: gene-level exploratory analysis and differential expression"), on the other hand, MDS plotting is performed on the vst/rlog transformed data, and both transformations remove heteroscedasticity too. Here it is mentioned that MDS plotting requires removing heteroscedasticity.
So for me, right now, it looks like the limma/voom workflow is in contrast to this statement, since voom (which removes heteroscedasticity) is performed after the MDS plotting.
I hope someone can explain why the two workflows seem to differ here.
Thanks much!
Thanks for your response!
I do understand that MDS plots should be based on "close-to-raw" datasets, meaning before any model fitting is performed. This does not seem to be in contrast to performing MDS plots on either a vst or a voom transformed dataset, since both of these functions are applied prior to the model fitting step.
Regarding voom and the fact that this function is not removing heteroscedasticity: I was assuming that voom is removing heteroscedasticity since the title of the paragraph where voom is applied says exactly this ("Removing heteroscedascity from count data", from: 1-2-3 limma-voom workflow). But this clarification is helpful! Thanks!
So in the end, and please correct me if I am wrong, voom and vst are not removing but minimizing heteroscedasticity, and the difference between the DESeq2 and limma/voom workflows regarding the execution of vst/voom prior to MDS/PCA plotting is based on the fact that variance stabilization is done within the cpm function.
Please let me know if my "take-home-message" is correct.
Thanks!
Yes, the cpm and the vst functions are both transforming the counts to a log scale suitable for a PCA or MDS plot. They have the same aim.
In both cases, these transformations are independent of the DE analysis. The output of cpm/vst is not used for the DE analysis, nor is the output of the DE functions like voom used for the MDS plot. So the MDS plot will be the same whether it is done before or after the DE analysis.
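To make the independence concrete, here is a minimal sketch of a log-CPM transformation written in plain Python rather than R. It is not edgeR's code: it follows the formula that edgeR documents for cpm with log=TRUE as I understand it (a prior count scaled to each library size, default 2), and it leaves out the TMM normalization factors for brevity. The function name log_cpm and the toy counts are made up for illustration. The point is that only the counts and the library sizes go in; no design matrix or fitted model is involved.

```python
import math


def log_cpm(counts, prior_count=2.0):
    """Log2 counts-per-million for a genes x samples table of raw counts.

    Hypothetical helper for illustration; normalization factors are omitted.
    """
    n_samples = len(counts[0])
    # Library size = total count per sample (column sums).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_lib = sum(lib_sizes) / n_samples
    # The prior count is scaled in proportion to each library size,
    # so that zero counts map to a finite log value.
    priors = [prior_count * lib / mean_lib for lib in lib_sizes]
    adj_libs = [lib + 2.0 * p for lib, p in zip(lib_sizes, priors)]
    return [
        [math.log2((row[j] + priors[j]) / adj_libs[j] * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]


# Toy data: sample 3 is sample 2 sequenced ten times deeper.
counts = [
    [0, 10, 100],
    [50, 40, 400],
    [950, 950, 9500],
]
logged = log_cpm(counts)
for row in logged:
    print([round(v, 3) for v in row])
```

In this toy table the second and third samples get identical log-CPM values even though their raw counts differ tenfold, which is the property an MDS plot needs from the transformation.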
I take your point about the title of Section 6.2 of the 1-2-3 workflow. I guess what was meant is that voom removes the mean-variance trend from the SA plot in Figure 4. Also the voom precision weights allow limma to assume that the unknown variances are equal across samples for each gene. It does not mean though that voom produces a new version of the data with all the signal preserved and all heteroscedasticity removed.
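As a rough illustration of what "precision weights" means, here is a sketch in plain Python. It is a simplification under stated assumptions, not voom itself: voom estimates the mean-variance trend empirically from the residual standard deviations with a lowess fit, whereas this sketch swaps in the textbook approximation var(log count) ≈ 1/mu + phi for a negative binomial count with mean mu and dispersion phi. The dispersion value 0.05 and the function names are made up for illustration.

```python
def trend_variance(mean_count, dispersion=0.05):
    """Assumed variance of a log-count (natural-log units) at a given mean.

    Stand-in for voom's lowess trend: delta-method approximation 1/mu + phi.
    """
    return 1.0 / mean_count + dispersion


def precision_weights(fitted_counts, dispersion=0.05):
    """Inverse-variance weights, one per observation."""
    return [1.0 / trend_variance(mu, dispersion) for mu in fitted_counts]


def weighted_mean(values, weights):
    """Weighted average: low-count (noisy) observations contribute less."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Log-expression values for one gene and the fitted counts behind them.
log_values = [3.1, 3.4, 3.3]
fitted_counts = [5.0, 50.0, 500.0]

weights = precision_weights(fitted_counts)
estimate = weighted_mean(log_values, weights)
print(weights)
print(estimate)
```

The log values themselves are left untouched. The heteroscedasticity is handled by the weights attached to them, which is why there is no "voom-transformed" data set with the variance trend removed.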
Ok. Great. Thanks for your final comment. I think that clarifies the differences between the limma/voom and DESeq2 workflows regarding the MDS plotting.
Thanks much!