Hello!
I have a RNA seq dataset with one cell type and 8 different conditions/treatments in triplicates. I would like to compare one of the condition vs the other seven to see expression changes over the different treatments.
I thought I could use the solution offered in this post (https://support.bioconductor.org/p/86347/)
I designed my dds as follows:
ddsAll <- DESeqDataSetFromMatrix(
countData = cts,
colData = coldata,
design = ~ condition )
It works well when I compare treatment 1 vs treatment 2 but when I try to do something like:
res1 <- results (ddsAll, contrast = list ( c("g1"), c('g2','g3','g4','g5','g6','g7')) , listValues=c(1, -1/7))
it gives me this error:
Error in checkContrast(contrast, resNames) :
all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'
I tried to see whether someone had similar issues but I could not find anything. Worst case scenario I can do one comparison at a time but it would be great if I could do the analysis to all the samples at once.
Many thanks!
Jose
Hi Michael,
Many thanks for your time! Sorry it was a silly mistake, it did worked but resultsNames(dds) gives me 'condition_g2_vs_g1' rather than comparing g1 vs the rest, which I am not sure it is not that informative.
I think I did not understood the function and use of :
res1 <- results (ddsAll, contrast = list ( c("g1"), c('g2','g3','g4','g5','g6','g7')) , listValues=c(1, -1/7))
Ideally I would like to compare changes of the different treatments (g2-g7) vs g1 individually. so the previous method does work for me.
I am planning to use LRT test rather than Wald tests on my dds and then cluster by treatments. dds_lrt <- DESeq(dds, test="LRT", reduced = ~ 1)
Thanks again for your response.
Jose
(note that you should have -1/6 not -1/7 if you want to average over 6 coefficients)
If you want to average the difference between g2 compared to g1, ..., g7 compared to g1, you would do:
If you want to repeat this process for each level of condition, you would have to change the factor levels. A way to avoid this would be to use a design of
~condition + 0
. Then the factor levels will be g1-g7 and you could use code like you originally had.The second suggestion seems a good approach, hopefully it works nicely!
Thanks a lot Michael!
Hello Everyone and Michael,
In using Deseq2 to test the associated microbial communities to different soil fractions.
For my experiment I have 4 sampling points, paired triplicates.
For each sampling point I have the microbial community associated to bulk soil (1st condition), and 4 different soil particle sizes.
I want to see which microbes are increase in bulk soil compared to all the other soil fractions togheter.
I thought I have two option: 1)I could create a dummy variable specifically called "bulk soil" or "soil particle size", 2) Or run the experiment with all the different soil fractions as a tested condition and then use: results(dds, contrast=list(c("fraction_1_vs_soil",...,"fraction_4_vs_soil")), listValues=1/4)
Which of the two solutions it is more correct, in your opinion?
When using the second option, if I run the command I get the following error: "Error in results(diagdds, contrast = list(c( length(listValues) == 2 & is.numeric(listValues) is not TRUE"
I believe I need to set: list c(1/4,-1/4), correct?
Thanks a lot in advance!!!
Gabriele
From the top: I'm not sure DESeq2 is appropriate for microbiome analysis, I'm not convinced that there are housekeeping taxa for example to be used for scaling, or if the NB is appropriate for the bimodal data that is seen sometimes.
I prefer the solution above that accounts for the differences across condition groups, and then averaging the coefficients. For listValues, you can put
c(1/4,-1/4)
. The denominator value just won't be used.Thank you!! For microbiome data, it is widely used now... interesting to know that it might not be appropriate from the creator :).
For the housekeeping genes (or taxa in this case) my understanding is that they need to be genes that are present in all samples, correct? How many of them it is considered a safe number? Thanks again for all the great work!
Gabriele
I don't know what is safe for microbiome data. In RNA-seq we often have thousands of genes with small changes.
Hi Micheal,
Is there an error here in your listValues? Would expect it to be more:
Considering it is 1 group vs. 6 on Jose's post
Thanks for the valuable help on here
Matt
Yes you're right, thanks I will correct above.
Oh that was not my code but Jose's actually. I will add to my comment above that his code should by 1/6 not 1/7.