Question: DESeq2 - PCAplot differs between rlog and vsd transformation
2.2 years ago, Jane Merlevede wrote:

Dear all,

I am using DESeq2 on 63 RNA-seq samples.

The design used to explain the counts is based on the mutational status of 2 genes: ~ GroupGene1 + GroupGene2

I performed 3 PCAs on this dataset:

1) On all genes with rowSums(counts(dds)) > 10, which in my case is 36276 genes.

The PCA plots are similar across the rlog, VST and log2 transformations of the data.

2 & 3) On the 2000 and 500 most variable genes, selected with the ntop parameter of plotPCA().

Here the PCA plots are very different between rlog and VST. With rlog, 2 samples are very large outliers, so the 61 other samples look very similar to each other. With VST, these 2 samples are not outliers.

Are there cases where rlog does not work well?

Has anyone else encountered this problem?

Thank you in advance.

Answer: DESeq2 - PCAplot differs between rlog and vsd transformation
2.2 years ago, Michael Love (United States) wrote:
I'd go with VST when there are many samples. Is there anything special about these 2 samples? Are their size factors very low?
Answer: DESeq2 - PCAplot differs between rlog and vsd transformation
2.2 years ago, Jane Merlevede wrote:

Thank you for your response, Michael.

These 2 samples do not have the lowest size factors, and there are no outliers among the size factors:

 summary(sizeFactors(dds))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.5952  0.8673  1.0220  1.0290  1.1430  1.5610

I'd go with VST because it is faster, but I don't know the rationale for choosing it over rlog.

So far, I have noticed nothing special about these samples (no bad RIN, no specific tumor cases, ...).

I prefer VST when there are many samples. In the simulations we performed for the DESeq2 paper, the rlog seemed to outperform the VST when there were very large differences in size factor (e.g. spanning an order of magnitude from low to high sequencing depth).
Reply written 2.2 years ago by Michael Love

Ok, thank you

Reply written 2.2 years ago by Jane Merlevede
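
For readers who want to try the comparison discussed in this thread, here is a minimal R sketch. It is not the original poster's code: it runs on simulated counts from makeExampleDESeqDataSet() because the 63-sample dataset is not available, it uses the simulated `condition` column in place of GroupGene1 and GroupGene2, and top_genes() is a hypothetical helper written here only to mimic how ntop picks the most variable genes. It checks the spread of the size factors (the criterion mentioned above for when rlog might help), applies both transformations, and counts how many of the top 500 genes the two transformations share.

```r
library(DESeq2)

# Simulated stand-in data (assumption: the thread's real dataset is not available)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)
dds <- dds[rowSums(counts(dds)) > 10, ]   # same filter as in the question
dds <- estimateSizeFactors(dds)

# How widely do the size factors vary? A ratio near 10 or more would match
# the "order of magnitude" situation described in the reply above.
sf <- sizeFactors(dds)
summary(sf)
sf_ratio <- max(sf) / min(sf)

# Both transformations, blind to the design
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
rld <- rlog(dds, blind = TRUE)

# Hypothetical helper: indices of the ntop genes with the highest variance,
# which is how plotPCA() selects genes before running the PCA
top_genes <- function(x, ntop) {
  rv <- apply(assay(x), 1, var)
  head(order(rv, decreasing = TRUE), ntop)
}
top_vsd <- top_genes(vsd, 500)
top_rld <- top_genes(rld, 500)
n_shared <- length(intersect(top_vsd, top_rld))

# PCA on the 500 most variable genes under each transformation
plotPCA(vsd, intgroup = "condition", ntop = 500)
plotPCA(rld, intgroup = "condition", ntop = 500)
```

If n_shared is well below 500, the two PCA plots are being computed on different gene sets, which may be one reason they disagree when ntop is small but agree when all genes are used. On larger datasets, vst() is a faster wrapper around varianceStabilizingTransformation().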