DESeq2 - selective calculation of differential gene expression
1
0
Entering edit mode
@lilithelina-13162
Last seen 6.6 years ago
Germany

Hi,

due to a strange error in DESeq my group is finally being forced to switch to DESeq2 (probably a good thing). There is one problem we are not sure how to solve, though:

In the old DESeq, we would normalise the data and then call nbinomTest() for each comparison we wanted to make. Now, the DESeq() function does all in one, which is nice up until you don't want to compare everything with everything else. We have a data frame with over 400 samples and only want to compare them to the wild type, not to each other. I can't find any possibility to do so, though, neither with DESeq() nor with nbinomWaldTest().

Is it in this case recommended to reduce the data set before using DESeq2, for each comparison, or should we try to run the DESeq() function once and then save it to extract only the relevant results?

deseq2 deseq • 1.4k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 6 hours ago
United States

hi Lilith,

Yes, definitely a good thing. DESeq2 is more sensitive than DESeq, and the older version isn't being developed since ~2012/2013.

Can you explain a bit more, how do the 400 samples break down into groups? What are the sample sizes per group? Is there any other relevant structure to the samples?

ADD COMMENT
0
Entering edit mode

Hey Michael,

I'm afraid they're either single or duplicates (depending on the project). We know replicates are important, but with 400 - 500 clinical isolates it just wasn't feasible. So the problem is that we don't really want to compare more than 400 samples against each other. In the old protocols, we always normalised over all samples at once and then did the comparisons we needed, but that doesn't seem possible any more now.

Sarah

ADD REPLY
0
Entering edit mode

That's fine, but what does table(condition) look like?

How many WT do you have, which you are comparing against?

ADD REPLY
0
Entering edit mode

The wild type is also just one sample, apparently, so the conditions df is basically just a df with the sample names as "conditions".

ADD REPLY
0
Entering edit mode

You want to compare each of 400 samples against a single WT sample? I just want to offer the best advice possible, but maybe it would be good if you could explain what kind of data this is? For how many of the 400 samples is there any replication, if only a duplicate? When you have duplicates, is it a technical replicate (just sequencing the same library) or is it a biological replicate? This is all relevant to what kind of advice I would give you. 

The way you can make the comparisons is to run DESeq() once, with WT as the reference level of the condition variable, and then extract results tables for each comparison:

res <- results(dds, contrast=c("condition", "X", "WT"))

where you would fill in "X" with whichever sample you want to compare to WT. 

ADD REPLY
0
Entering edit mode

In the example that caused me to ask this question we have no replicates at all, not even for the WT. We did RNA-Seq of ~400 clinical isolates (bacteria, grown in the lab, not ex vivo samples) and we're looking at their general transcription profiles as well as differential expression in comparison to the lab-grown WT.

Running DESeq() on all the data and just extracting the results we need seems to work, thank you. Apparently my colleague only looks at a few samples at a time. Do you think using save() or saveRDS() is a good way to keep the DESeq() results available?

ADD REPLY
0
Entering edit mode
Yes, save() the dds object makes sense
ADD REPLY
0
Entering edit mode

Thank you for your help!

ADD REPLY

Login before adding your answer.

Traffic: 607 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6