Question: DESeq2 - PCAplot differs between rlog and vsd transformation
0
2.2 years ago by
Jane Merlevede90 wrote:

Dear all,

I am using DESeq2 on 63 RNA-seq samples.

The design used to model the counts is based on the mutational status of 2 genes: ~ GroupGene1 + GroupGene2

I performed 3 PCAs on this dataset:

1) On all genes with rowSums(counts(dds)) > 10, which in my case is 36276 genes.

The PCAs are similar across the rlog, VST (vsd) and log2 transformations of the data.

2 & 3) On the 2000 and 500 most variable genes, selected with the ntop parameter of plotPCA().

Here the PCAs are very different between rlog and vsd. With rlog, 2 samples are very strong outliers, so the other 61 samples look very similar to each other. With vsd, these 2 samples are not outliers.

Are there cases where rlog does not work well?

Has anyone else already encountered this problem?
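
For context, a minimal sketch of how an ntop selection can differ between two transformed matrices. It assumes that the most variable genes are ranked by per-gene variance of the transformed values, which is my understanding of what plotPCA() does, so rlog and VST may each pick a different gene set. The helper top_var_pca() and the toy matrices are hypothetical and use base R only; with real data one would pass assay(rld) and assay(vsd) instead.

```r
# Hypothetical helper (not part of DESeq2): PCA on the ntop rows with the
# highest variance of an already-transformed matrix (genes x samples).
top_var_pca <- function(mat, ntop = 500) {
  rv <- apply(mat, 1, var)                      # per-gene variance across samples
  keep <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
  pca <- prcomp(t(mat[keep, , drop = FALSE]))   # samples as rows; centered, not scaled
  list(genes   = rownames(mat)[keep],
       pct_var = pca$sdev^2 / sum(pca$sdev^2),
       scores  = pca$x)
}

# Simulated toy data standing in for two transformed matrices.
set.seed(1)
toy_a <- matrix(rnorm(200 * 6), nrow = 200,
                dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
toy_b <- toy_a
toy_b[1:20, 1:2] <- toy_b[1:20, 1:2] + 10   # two samples extreme in 20 genes

sel_a <- top_var_pca(toy_a, ntop = 20)$genes
sel_b <- top_var_pca(toy_b, ntop = 20)$genes
length(intersect(sel_a, sel_b))   # overlap between the two selected gene sets
```

If the overlap between the two selections is small, the two PCA plots are effectively summarizing different genes, which is one possible reason for two samples standing out under one transformation only.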

Answer: DESeq2 - PCAplot differs between rlog and vsd transformation
1
2.2 years ago by
Michael Love25k wrote:
I'd go with VST when there are many samples. Is there anything special with these 2 samples? Size factor very low?
Answer: DESeq2 - PCAplot differs between rlog and vsd transformation
0
2.2 years ago by
Jane Merlevede90 wrote:

Thank you for your response, Michael.

These 2 samples do not have the lowest size factors, and there are no outliers among the size factors:

> summary(sizeFactors(dds))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.5952  0.8673  1.0220  1.0290  1.1430  1.5610

I'd go with VST because it is faster, but I don't know the rationale for choosing it over rlog.

So far, I have noticed nothing special about these samples (no bad RIN, no specific tumor cases, ...).

1
I prefer VST when there are many samples. In the simulations we performed for the DESeq2 paper, the rlog seemed to outperform VST when there were very large differences in size factors (e.g. spanning an order of magnitude from low to high sequencing depth).
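
As a rough check of that criterion against the size factors reported in this thread, here is a small sketch in base R. The helper size_factor_fold_range() is hypothetical, and the two values are the Min. and Max. copied from the summary(sizeFactors(dds)) output above; the 10-fold threshold is only a literal reading of "an order of magnitude", not a rule from DESeq2.

```r
# Hypothetical helper: ratio of the largest to the smallest size factor.
size_factor_fold_range <- function(sf) max(sf) / min(sf)

# Min. and Max. taken from the summary(sizeFactors(dds)) output in this thread.
sf_range <- c(min = 0.5952, max = 1.5610)

fold <- size_factor_fold_range(sf_range)
round(fold, 2)   # fold difference between deepest and shallowest sample
fold >= 10       # TRUE would indicate an order-of-magnitude spread
```

Here the spread is roughly 2.6-fold, well short of an order of magnitude. On a real object the same helper could be called as size_factor_fold_range(sizeFactors(dds)).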