Question: DESeq2 - selective calculation of differential gene expression
gravatar for LilithElina
16 months ago by
LilithElina0 wrote:


due to a strange error in DESeq my group is finally being forced to switch to DESeq2 (probably a good thing). There is one problem we are not sure how to solve, though:

In the old DESeq, we would normalise the data and then call nbinomTest() for each comparison we wanted to make. Now, the DESeq() function does all in one, which is nice up until you don't want to compare everything with everything else. We have a data frame with over 400 samples and only want to compare them to the wild type, not to each other. I can't find any possibility to do so, though, neither with DESeq() nor with nbinomWaldTest().

Is it in this case recommended to reduce the data set before using DESeq2, for each comparison, or should we try to run the DESeq() function once and then save it to extract only the relevant results?

ADD COMMENTlink modified 16 months ago by Michael Love19k • written 16 months ago by LilithElina0
gravatar for Michael Love
16 months ago by
Michael Love19k
United States
Michael Love19k wrote:

hi Lilith,

Yes, definitely a good thing. DESeq2 is more sensitive than DESeq, and the older version isn't being developed since ~2012/2013.

Can you explain a bit more, how do the 400 samples break down into groups? What are the sample sizes per group? Is there any other relevant structure to the samples?

ADD COMMENTlink written 16 months ago by Michael Love19k

Hey Michael,

I'm afraid they're either single or duplicates (depending on the project). We know replicates are important, but with 400 - 500 clinical isolates it just wasn't feasible. So the problem is that we don't really want to compare more than 400 samples against each other. In the old protocols, we always normalised over all samples at once and then did the comparisons we needed, but that doesn't seem possible any more now.


ADD REPLYlink written 16 months ago by LilithElina0

That's fine, but what does table(condition) look like?

How many WT do you have, which you are comparing against?

ADD REPLYlink written 16 months ago by Michael Love19k

The wild type is also just one sample, apparently, so the conditions df is basically just a df with the sample names as "conditions".

ADD REPLYlink written 16 months ago by LilithElina0

You want to compare each of 400 samples against a single WT sample? I just want to offer the best advice possible, but maybe it would be good if you could explain what kind of data this is? For how many of the 400 samples is there any replication, if only a duplicate? When you have duplicates, is it a technical replicate (just sequencing the same library) or is it a biological replicate? This is all relevant to what kind of advice I would give you. 

The way you can make the comparisons is to run DESeq() once, with WT as the reference level of the condition variable, and then extract results tables for each comparison:

res <- results(dds, contrast=c("condition", "X", "WT"))

where you would fill in "X" with whichever sample you want to compare to WT. 

ADD REPLYlink written 16 months ago by Michael Love19k

In the example that caused me to ask this question we have no replicates at all, not even for the WT. We did RNA-Seq of ~400 clinical isolates (bacteria, grown in the lab, not ex vivo samples) and we're looking at their general transcription profiles as well as differential expression in comparison to the lab-grown WT.

Running DESeq() on all the data and just extracting the results we need seems to work, thank you. Apparently my colleague only looks at a few samples at a time. Do you think using save() or saveRDS() is a good way to keep the DESeq() results available?

ADD REPLYlink written 16 months ago by LilithElina0
Yes, save() the dds object makes sense
ADD REPLYlink written 16 months ago by Michael Love19k

Thank you for your help!

ADD REPLYlink written 16 months ago by LilithElina0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 383 users visited in the last hour