DESeq2 with high range of within-group variability
Tash. • 0
Hi there,

I'm struggling to decide whether I should split up my groups or use the contrast argument of the results function to extract the comparisons after fitting the model (as explained in the vignette). As you can see in the plot linked below, in the LI-P-L and LI-NP groups, there are a few individuals that don't cluster within their groups.

PCA Sample Type

Biologically, I'm interested in comparing:

1. LI-P-L vs LI-NP
2. LI-P-L vs HV
3. LI-P-L vs LI-P-NL
4. LI-P-NL vs HV


Based on the plot, do you think it makes sense to split these into 4 separate count matrices (one per comparison), or simply fit all groups together and use the contrast argument?

Thanks so much for your help!

I can't see the plot, can you?

Hi Michael,

I can see it. It downloads when I click on the link. I'll try again here:

PCA

@mikelove
I still can't see this file, it's an unrecognized format on my machine.

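If you are worried about too much within-group heterogeneity, just use the split-dataset approach: subset the data to the two groups in each comparison and run DESeq() on each subset separately.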
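
For reference, here is a minimal sketch of the two options side by side. It runs on simulated counts, not the poster's data: the object names (cts, coldata, condition) and the sample layout of 4 per group are assumptions, and the group labels use underscores in place of hyphens because DESeq2 recommends factor levels made only of letters, numbers, '_' and '.'.

```r
library(DESeq2)

set.seed(1)

# Hypothetical stand-in data: 1000 genes x 16 samples, 4 samples per group.
groups  <- c("HV", "LI_NP", "LI_P_L", "LI_P_NL")
coldata <- data.frame(
  condition = factor(rep(groups, each = 4), levels = groups),
  row.names = paste0("sample", 1:16)
)
n_genes <- 1000
mu      <- 2^runif(n_genes, 4, 10)          # per-gene mean count
disp    <- 4 / mu + 0.1                     # assumed dispersion-mean trend
cts <- matrix(
  rnbinom(n_genes * 16, mu = rep(mu, times = 16), size = rep(1 / disp, times = 16)),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)

# Option A: fit all four groups once, then pull out one comparison
# with the contrast argument of results().
dds <- DESeqDataSetFromMatrix(cts, coldata, design = ~ condition)
dds <- DESeq(dds)
res_full <- results(dds, contrast = c("condition", "LI_P_L", "LI_NP"))

# Option B: split-dataset approach. Keep only the two groups being
# compared, drop the unused factor levels, and refit from raw counts.
keep     <- coldata$condition %in% c("LI_P_L", "LI_NP")
sub_data <- droplevels(coldata[keep, , drop = FALSE])
dds_sub  <- DESeqDataSetFromMatrix(cts[, keep], sub_data, design = ~ condition)
dds_sub  <- DESeq(dds_sub)
res_split <- results(dds_sub, contrast = c("condition", "LI_P_L", "LI_NP"))
```

In option A the dispersion estimates draw on all 16 samples, so highly variable groups influence every comparison. In option B each comparison only sees the variability of its own two groups, at the cost of fewer samples per fit. The other three comparisons would follow the same pattern with different level names.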

Hi again Michael,

So sorry, don't know why that doesn't work. Here is an image link instead: PCA

Another option would be to estimate a batch variable using SVA or RUV (see the workflow for example code), and then use it as a blocking variable in the design, e.g. ~ sv1 + condition. For this approach, use the full dataset both to estimate the batch variable and for the DESeq() analysis.
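
A rough sketch of the SVA route, loosely following the pattern in the RNA-seq workflow. Everything here is illustrative: the counts are simulated with a made-up hidden batch effect, the names cts, coldata and sv1 are placeholders, and estimating a single surrogate variable (n.sv = 1) is an assumption that would need checking on real data.

```r
library(DESeq2)
library(sva)

set.seed(1)

# Hypothetical stand-in data: 1000 genes x 16 samples, 4 samples per group,
# with an unobserved two-level batch alternating across samples.
groups  <- c("HV", "LI_NP", "LI_P_L", "LI_P_NL")
coldata <- data.frame(
  condition = factor(rep(groups, each = 4), levels = groups),
  row.names = paste0("sample", 1:16)
)
n_genes   <- 1000
hidden    <- rep(c(0, 1), times = 8)           # hidden batch indicator
base_mu   <- 2^runif(n_genes, 4, 10)           # per-gene baseline mean
batch_lfc <- rnorm(n_genes, sd = 1)            # per-gene batch log2 fold change
mu_mat    <- base_mu * 2^outer(batch_lfc, hidden)
cts <- matrix(
  rnbinom(n_genes * 16, mu = mu_mat, size = 1 / (4 / mu_mat + 0.1)),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)

# Use the full dataset (all four groups) to estimate the surrogate variable.
dds <- DESeqDataSetFromMatrix(cts, coldata, design = ~ condition)
dds <- estimateSizeFactors(dds)
dat <- counts(dds, normalized = TRUE)
dat <- dat[rowMeans(dat) > 1, ]                  # drop very low-count genes

mod   <- model.matrix(~ condition, colData(dds)) # full model
mod0  <- model.matrix(~ 1, colData(dds))         # null model
svseq <- svaseq(dat, mod, mod0, n.sv = 1)        # n.sv = 1 is an assumption

# Add the surrogate variable as a blocking term and fit on the full data.
dds$sv1     <- as.matrix(svseq$sv)[, 1]
design(dds) <- ~ sv1 + condition
dds <- DESeq(dds)

# Extract one of the comparisons of interest from the single fit.
res <- results(dds, contrast = c("condition", "LI_P_L", "LI_NP"))
```

The remaining comparisons can be pulled from the same fitted object by changing the two level names in the contrast.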