Hello,
New to Deseq2 analysis and am trying to understand how to interpret results. This may be a very basic question with an obvious answer - so my apologies in advance.
I defined the contrasts (5 treatments) and ran the data through Deseq2. In the results, I am finding for a number of feature/contrast combinations that I have a significant p-value, but the log2 fold change is reported as zero. How is that possible? How do I interpret this result?
Hello,
Thank you for the prompt reply and for your help with this.
I used this code (repeated it for each of the contrasts):
resC1vC2 = results(diagdds, contrast=c("Forage","CSG1","CSG2"), cooksCutoff = FALSE) alpha = 0.05 sigC1vC2 = resC1vC2[which(resC1vC2$padj < alpha), ] sigC1vC2 = cbind(as(sigC1vC2, "data.frame"), as(taxtable(rarefied)[rownames(sigC1vC2), ], "matrix"))
The table that I showed was a collated table that I created with all of the contrasts in one table. All the rows except for the first row (and column headers) are for the same feature, just with the different contrasts - I thought that was why the statistic was the same, but maybe I am incorrect and it should be different for each contrast? Here is a picture of the actual results output (written to .csv and opened in excel) for one of the contrasts (the C1vC2 contrast that the above code was used to generate):
This output still is showing the zeros in the log2 fold change for two of the features that were identified as significant at p < 0.05.
Can you just show the output from results without the extra code?
E.g. just:
Except writing out the name of the feature in place of
"17446..."
I'm guessing that somewhere in the post-processing of the results table there may be an issue.
Hello,
I tried to run the code you asked for. Obviously the first line was no problem as I have been using similar code. But when I tried to run the second line (with the feature ID in place of the "17446...."), I got an error:
Error: subscript contains invalid names
Not sure what that means. I tried a different feature just to check, but got the same error regardless of which feature ID I am using.
But I did write the results (no extra code!) to a .csv, sorted the table by the adjusted p-value, and found the same output for those zero log2 fold change features that I showed you before. Here is a picture of the results from that first line of code:
So, based on that I would say that it is not an issue with the post-processing?
Can you post your version numbers: sessionInfo()?
I'm not convinced we can rule out that output to CSV and input into another program is the issue. You may be right but I want to be sure that it's coming from results() and not downstream. It should be trivial to just identify this row of the results data.frame and print it in R.
You can also do this:
Sorry for the delay in replying - was out of the office for a few days.
I did get the code line res_test ["17446f8f4b06ac237006826641e2012f",] to run.
Here is the output:
log2 fold change (MLE): Forage CSG1 vs CSG2 LRT p-value: '~ Forage' vs '~ 1' DataFrame with 1 row and 6 columns baseMean <numeric> 17446f8f4b06ac237006826641e2012f 21.0307 log2FoldChange <numeric> 17446f8f4b06ac237006826641e2012f 0 lfcSE <numeric> 17446f8f4b06ac237006826641e2012f 2.22198 stat <numeric> 17446f8f4b06ac237006826641e2012f 32.8491 pvalue <numeric> 17446f8f4b06ac237006826641e2012f 1.28255e-06 padj <numeric> 17446f8f4b06ac237006826641e2012f 0.000228935
This is still showing a Log2FoldChange of 0 for this feature.
Thoughts?
Note this is a likelihood ratio test, so it's possible to have a LFC between two groups of 0, but the LRT is assessing whether any of the groups show a difference.
This is why I wanted to see the code and the simple output from
res
, because either would have indicated that you've performed an LRT.If you want to compare groups, you should specify
test="Wald"
.Got it. Everything in the outputs looks good now. Thank you so much for all of your assistance in figuring out what I was doing wrong and taking the time to explain this to me.