Deseq2 Multiple condition issue
1
1
Entering edit mode
@jose-meseguer-19503
Last seen 5.3 years ago
Edinburgh

Hello!

I have a RNA seq dataset with one cell type and 8 different conditions/treatments in triplicates. I would like to compare one of the condition vs the other seven to see expression changes over the different treatments.

I thought I could use the solution offered in this post (https://support.bioconductor.org/p/86347/)

I designed my dds as follows:

ddsAll <- DESeqDataSetFromMatrix(
  countData = cts,
  colData = coldata,
  design = ~ condition   )

It works well when I compare treatment 1 vs treatment 2 but when I try to do something like:

res1  <- results (ddsAll, contrast = list ( c("g1"), c('g2','g3','g4','g5','g6','g7')) ,    listValues=c(1, -1/7))

it gives me this error:

Error in checkContrast(contrast, resNames) : 
  all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'

I tried to see whether someone had similar issues but I could not find anything. Worst case scenario I can do one comparison at a time but it would be great if I could do the analysis to all the samples at once.

Many thanks!

Jose

deseq2 • 4.4k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 15 hours ago
United States

When you type resultsNames(dds), do you see g1? You should put the exact character string that you see in resultsNames(dds), as the error message is indicating.

ADD COMMENT
0
Entering edit mode

Hi Michael,

Many thanks for your time! Sorry it was a silly mistake, it did worked but resultsNames(dds) gives me 'condition_g2_vs_g1' rather than comparing g1 vs the rest, which I am not sure it is not that informative.

I think I did not understood the function and use of :

res1 <- results (ddsAll, contrast = list ( c("g1"), c('g2','g3','g4','g5','g6','g7')) , listValues=c(1, -1/7))

Ideally I would like to compare changes of the different treatments (g2-g7) vs g1 individually. so the previous method does work for me.

I am planning to use LRT test rather than Wald tests on my dds and then cluster by treatments. dds_lrt <- DESeq(dds, test="LRT", reduced = ~ 1)

Thanks again for your response.

Jose

ADD REPLY
0
Entering edit mode

(note that you should have -1/6 not -1/7 if you want to average over 6 coefficients)

If you want to average the difference between g2 compared to g1, ..., g7 compared to g1, you would do:

results(dds, contrast=list(c("condition_g2_vs_g1",...,"condition_g7_vs_g1")), listValues=1/7)

If you want to repeat this process for each level of condition, you would have to change the factor levels. A way to avoid this would be to use a design of ~condition + 0. Then the factor levels will be g1-g7 and you could use code like you originally had.

ADD REPLY
0
Entering edit mode

The second suggestion seems a good approach, hopefully it works nicely!

Thanks a lot Michael!

ADD REPLY
0
Entering edit mode

Hello Everyone and Michael,

In using Deseq2 to test the associated microbial communities to different soil fractions.

For my experiment I have 4 sampling points, paired triplicates.

For each sampling point I have the microbial community associated to bulk soil (1st condition), and 4 different soil particle sizes.

I want to see which microbes are increase in bulk soil compared to all the other soil fractions togheter.

I thought I have two option: 1)I could create a dummy variable specifically called "bulk soil" or "soil particle size", 2) Or run the experiment with all the different soil fractions as a tested condition and then use: results(dds, contrast=list(c("fraction_1_vs_soil",...,"fraction_4_vs_soil")), listValues=1/4)

Which of the two solutions it is more correct, in your opinion?

When using the second option, if I run the command I get the following error: "Error in results(diagdds, contrast = list(c( length(listValues) == 2 & is.numeric(listValues) is not TRUE"

I believe I need to set: list c(1/4,-1/4), correct?

Thanks a lot in advance!!!

Gabriele

ADD REPLY
0
Entering edit mode

From the top: I'm not sure DESeq2 is appropriate for microbiome analysis, I'm not convinced that there are housekeeping taxa for example to be used for scaling, or if the NB is appropriate for the bimodal data that is seen sometimes.

I prefer the solution above that accounts for the differences across condition groups, and then averaging the coefficients. For listValues, you can put c(1/4,-1/4). The denominator value just won't be used.

ADD REPLY
0
Entering edit mode

Thank you!! For microbiome data, it is widely used now... interesting to know that it might not be appropriate from the creator :).

For the housekeeping genes (or taxa in this case) my understanding is that they need to be genes that are present in all samples, correct? How many of them it is considered a safe number? Thanks again for all the great work!

Gabriele

ADD REPLY
0
Entering edit mode

I don't know what is safe for microbiome data. In RNA-seq we often have thousands of genes with small changes.

ADD REPLY
0
Entering edit mode

Hi Micheal,

Is there an error here in your listValues? Would expect it to be more:

listValues = c(1, -1/6)

Considering it is 1 group vs. 6 on Jose's post

Thanks for the valuable help on here

Matt

ADD REPLY
1
Entering edit mode

Yes you're right, thanks I will correct above.

ADD REPLY
0
Entering edit mode

Oh that was not my code but Jose's actually. I will add to my comment above that his code should by 1/6 not 1/7.

ADD REPLY

Login before adding your answer.

Traffic: 943 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6