Which rlog should I use for DESeq2 analysis?
@rafaelsolersanblas-22935

I have a question regarding the rlog normalization.

I have many samples to compare, with only one factor: A vs Treat, B vs Treat, C vs Treat, and so on. Should I put everything in the same DESeqDataSet object, even though the variability between groups is very large (I ran the differential expression analysis with the comparisons separated), and then calculate the rlog of all samples? Or should I put the different comparisons in separate DESeqDataSet objects, extract the rlog for each comparison, and later join the rlogs by Ensembl ID?

Thank you!

@mikelove

I’d prefer VST. Sometimes the rlog can overshrink differences between groups. You can use the vst() function.
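For readers who want to try this, here is a minimal sketch of calling vst(). The simulated dataset from makeExampleDESeqDataSet() and the object names (dds, vsd, vst_mat) are illustrative assumptions, not anything from the question; substitute your own DESeqDataSet.

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for real data (assumption: 5000 genes x 12 samples)
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)

# Variance stabilizing transformation; returns a DESeqTransform object
vsd <- vst(dds)

# Matrix of transformed values: genes in rows, samples in columns
vst_mat <- assay(vsd)

# PCA of the samples, coloured by the design variable
plotPCA(vsd, intgroup = "condition")
```

The matrix from assay(vsd) is what you would pass to heatmap or clustering functions.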


Yes, I read that in the vignette, thank you Professor Love!

However, I am still unsure whether to group everything in the same DESeqDataSet or to split it into separate ones. I suppose that for data visualization it is better to put everything together, and for differential analysis to run each comparison separately, right?

Also, I am considering transforming them with blind = FALSE, since I expect great genetic variability between groups (but not within the groups themselves). However, if I want to run an unsupervised hierarchical clustering on z-scores of very different tumor samples to cluster them transcriptomically, would you apply blind = TRUE?

Thank you so much!


The question about "whether to group everything in the same DESeqDataSet or to separate it into different ones" is a FAQ in the vignette.

I generally recommend blind=FALSE. The design is not used to perform the transformation itself, which is the same for all samples. It is only used to estimate the global amount of within-group variability. The analysis will still be unsupervised with blind=FALSE.
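As a rough illustration of the clustering workflow asked about above, the sketch below applies vst() with blind = FALSE and then clusters samples on row z-scores. The simulated data, the four-group layout, and the choice of the 500 most variable genes are all assumptions made for the example.

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for real data (assumption: 5000 genes x 12 samples)
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)
# Hypothetical grouping: four groups of three samples each
dds$condition <- factor(rep(c("Treat", "A", "B", "C"), each = 3))

# blind = FALSE: the design (~ condition) informs the dispersion trend only
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)

# Keep the 500 most variable genes (arbitrary cutoff for illustration)
top <- head(order(apply(mat, 1, var), decreasing = TRUE), 500)

# Row z-scores: scale() centres and scales columns, so transpose twice
z <- t(scale(t(mat[top, ])))

# Hierarchical clustering of samples on Euclidean distance between z-score profiles
hc <- hclust(dist(t(z)))
plot(hc)
```

The group labels are never passed to hclust(), so the clustering itself does not see them.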

@swbarnes2-14086

The rlog/VST values are not used at all in assessing DE genes. They are provided for applications such as PCA plots or heatmaps.

In general, it is preferable to keep all your samples in a single object, and use contrasts to specify what subgroups you want to compare.
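A minimal sketch of that single-object approach, assuming simulated data and the group names from the question (Treat, A, B, C); the object names are illustrative only.

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for real data (assumption: 1000 genes x 12 samples)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
# Hypothetical grouping: one factor with four levels, three samples each
dds$condition <- factor(rep(c("Treat", "A", "B", "C"), each = 3),
                        levels = c("Treat", "A", "B", "C"))

# Fit the model once on all samples (design is ~ condition)
dds <- DESeq(dds)

# Pull out each comparison with a contrast: c(factor, numerator, denominator)
res_A <- results(dds, contrast = c("condition", "A", "Treat"))
res_B <- results(dds, contrast = c("condition", "B", "Treat"))
res_C <- results(dds, contrast = c("condition", "C", "Treat"))

summary(res_A)
```

Each results() call reuses the same fitted model, so dispersion estimates draw on all samples rather than only the two groups being compared.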