Dear DESeq2 community,
I am trying to use DESeq2 to detect differentially expressed genes (DEGs) in my RNA-seq data, because it can handle multi-factor designs, although not all of my experiments are multi-factor. Let me first describe my datasets.
For my first and second datasets, there is only one factor with 2 levels (the experiments are unbalanced: one treatment level has 4 biological replicates and the other has 5). So what I need is just the pairwise comparison for each of them. I read in the count tables (raw counts obtained from htseq-count) with
STR_dds <- DESeqDataSetFromMatrix(countData = STR_count, colData = STR_colData, design = ~ trt )
PFC_dds <- DESeqDataSetFromMatrix(countData = PFC_count, colData = PFC_colData, design = ~ trt )
and then run the differential expression analysis with
STR_dds <- DESeq(STR_dds)
PFC_dds <- DESeq(PFC_dds)
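After that, this is roughly how I pull out each pairwise comparison at an FDR of 0.05 (just a sketch; summary() only prints the numbers of up- and down-regulated genes):
library(DESeq2)                       # package providing DESeq(), results(), summary()
STR_res <- results(STR_dds, alpha = 0.05)   # two-group comparison at FDR (alpha) 0.05
PFC_res <- results(PFC_dds, alpha = 0.05)
summary(STR_res)                      # counts of genes up/down at the chosen FDR
summary(PFC_res)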
In the meantime, I also ran Cuffdiff on the same data. However, the results from the two pipelines (Cuffdiff and DESeq2) are very different, both at an FDR of 0.05. For one dataset, Cuffdiff detects 325 DEGs while DESeq2 calls only 6 genes differentially expressed. For the other dataset, there are 25 DEGs from Cuffdiff but only 1 from DESeq2. It looks like either Cufflinks is too liberal and makes false positive calls, or HTSeq/DESeq2 is too conservative and makes false negative calls. It is hard for me to draw a conclusion from such different results.
So I am trying to look at the expression level of each gene and check the correlation between the Cuffdiff and DESeq2 results. However, Cuffdiff (Cufflinks) reports RPKM (my data are single-end), while DESeq2 reports baseMean. So my question is: how does DESeq2 calculate the baseMean for each gene, and is there any relationship to RPKM? As I understand it, baseMean is just the average of the normalized count values taken over all samples, i.e. normalized only by library size and not by gene length, right? Could I get FPKM from DESeq2? If not, I can calculate it manually from the gene counts and lengths, but I would like to know whether DESeq2 can produce it directly; my guess is no, but I don't know the exact reason.
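To make the comparison concrete, here is a small sketch of what I have in mind: baseMean as the average of size-factor-normalized counts, and a manual FPKM using a hypothetical gene_lengths vector (gene lengths in bp, in the same order as the rows of the count table) that I would supply myself:
# baseMean as DESeq2 reports it: average of normalized counts over all samples
norm_counts <- counts(STR_dds, normalized = TRUE)  # raw counts divided by size factors
base_mean   <- rowMeans(norm_counts)               # should match the baseMean column
# manual FPKM-like values; gene_lengths is a hypothetical vector I would build
# from my annotation (it is not provided by DESeq2 in this sketch)
lib_sizes <- colSums(counts(STR_dds))              # raw library size per sample
fpkm_mat  <- t(t(counts(STR_dds)) / lib_sizes) * 1e9 / gene_lengths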
My third dataset has a multi-factor design. There are 2 factors in my experiment, genotype and condition, with 2 levels each, and I fit a model with an interaction. Here is the code I used to read the data in:
GT_Con_colData <- data.frame(row.names = colnames(GT_Con_countData), GT = c(rep('WT', 11), rep('NEO', 11)), Con = c(rep('MS', 5), rep('NMS', 6), rep('MS', 6), rep('NMS', 5)))
GT_Con_dds <- DESeqDataSetFromMatrix(countData = GT_Con_countData, colData = GT_Con_colData, design = ~ GT + Con + GT:Con)
Then I ran the test with
GT_Con_dds <- DESeq(GT_Con_dds)
GT_Con_dds_res <- results(GT_Con_dds)
The result shows nothing significant. How should I interpret this, and what is the next step given these non-significant results? Does it mean there is no interaction effect in my data? For this test, my full model is GT+Con+GT:Con (H1) and my reduced model is GT+Con (H0). Since we fail to reject H0, we would conclude that the interaction effect is not significant for this dataset. Is my understanding correct? If so, the reduced model is enough to fit my data. Could I just drop the interaction from my model, so that the full model becomes GT+Con? Does the order of GT and Con matter, since the vignette says the variable of interest should be put at the end of the formula? In that case, what is the reduced model? Also, could you show me what code I should use if I am interested in all of the contrasts, given this non-significant interaction effect? (A sketch of what I have in mind follows below.)
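Here is the sketch I mentioned of what I think I would run: the LRT for the interaction against the reduced model, and, if the interaction is dropped, a refit with the additive model and the main-effect contrasts (the level names are the ones from my colData above):
# likelihood ratio test for the interaction:
# full model ~ GT + Con + GT:Con vs reduced model ~ GT + Con
GT_Con_dds      <- DESeq(GT_Con_dds, test = "LRT", reduced = ~ GT + Con)
interaction_res <- results(GT_Con_dds)
# if the interaction is not significant, refit the additive model and
# pull out each main effect separately with Wald contrasts
design(GT_Con_dds) <- ~ GT + Con
GT_Con_dds <- DESeq(GT_Con_dds)
res_Con <- results(GT_Con_dds, contrast = c("Con", "MS", "NMS"))
res_GT  <- results(GT_Con_dds, contrast = c("GT", "NEO", "WT"))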
I hope my questions are clear in this long email (I did do some research online and in the DESeq2 community; it really helped, but I am still confused on these points). Thank you so much!
Best,
Xiaofei
Hi Michael,
Thank you so much for responding to my questions so quickly! I have read the vignette again and done more research for this project, but I still have some questions about DESeq2 (I will 'Add comment' here for clarifications and also ask some new questions).
"This is not so surprising. There are dozens of differences in the statistical procedures of these software. Furthermore p-values are tail probabilities which are highly sensitive to changes in models. My recommendation would be to look at plots of example genes to get a better sense of what is in these sets."
What do you mean by 'example genes'? Do you mean the DEGs detected by Cuffdiff? Also, you recommend looking at plots: what kind of plots do you mean here?
"This sounds like you are referring to a likelihood ratio test, but you performed a Wald test. They typically produce similar results, but you should be clear that you performed a Wald test, as this is the default for DESeq() followed by results(). There is more information on this in the vignette section on likelihood ratio tests and in the R help page for
?DESeq
."For likelihood ratio test and Wald test, both of them are to do statistical inference and they can both be used to test whether an effect exists or not. By default for DESeq() followed by restuls(), I performed a Wald test. But, I remember LR test can be easily extended to multiple parameters while Wald test is not appropriate for multiple parameters, correct? So, if there are multiple levels for a treatment factor in an experiment, my understanding is that we have to change the default performance for DESeq2, correct?
"What do you mean by 'example genes'? Do you mean the DEG detected from Cuffdiff? Also, you recommended me to look at the plots, what are kind of plots here?"
I meant genes which you are curious about, for example those not significant in one or the other method. The plotCounts function is used in the vignette.
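For example, a minimal plotCounts call using the objects from the first dataset above (the gene id here is just a placeholder):
# plot normalized counts for one gene of interest, grouped by treatment
# ("ENSG00000000001" is a placeholder id; trt is the factor used in the design)
plotCounts(STR_dds, gene = "ENSG00000000001", intgroup = "trt")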
"But, I remember LR test can be easily extended to multiple parameters while Wald test is not appropriate for multiple parameters, correct? So, if there are multiple levels for a treatment factor in an experiment, my understanding is that we have to change the default performance for DESeq2, correct?"
This is not correct. The default method can handle all kinds of designs. In the vignette, for example, the default DESeq method is used in a multifactor design with sequencing type and condition. Having multiple levels for a factor is also handled by the default method, for example in the Examples in ?results. You might have read an older post about v1.2, where in the devel branch (v1.3) we added behavior to make fold changes from factors with 3 or more levels symmetric with respect to the base level. But the default methods have always been designed with both simple and complex designs in mind.
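For illustration, a sketch of pairwise Wald contrasts, assuming a hypothetical dds whose design contains a condition factor with three levels A, B and C; each results() call tests a single pairwise comparison:
# default Wald pipeline with a three-level factor (dds and levels are hypothetical)
dds <- DESeq(dds)
res_B_vs_A <- results(dds, contrast = c("condition", "B", "A"))
res_C_vs_A <- results(dds, contrast = c("condition", "C", "A"))
res_C_vs_B <- results(dds, contrast = c("condition", "C", "B"))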
I double-checked my notes from Categorical Data Analysis, a statistics class, and I see that the Wald test is not for multiple parameters. For example, for
H0: β1 = β2 = 0
H1: at least one of β1 and β2 is not 0,
the Wald test is not appropriate, because it cannot test multiple parameters at once.
Anyway, my understanding of how this applies to DESeq2 may be wrong; we can ignore it, and I can follow how DESeq2 works based on your explanation.
Thanks a lot for replying to me so quickly, and on the weekend too! :)
Oh, I see now what you meant by multiple parameters. Math is always best for resolving these things :) Yes, the Wald test cannot test the null of multiple coefficients being equal to zero, while the LRT can.
So my understanding is that if a factor has multiple levels (>2), testing the significance of its main effect is similar to this test (H0: β1 = β2 = 0 vs. H1: at least one of β1 and β2 is not 0). That is why I think I need to change the default Wald test to the LRT when I have a factor with more than 2 levels. Could you tell me where my understanding goes wrong when I apply it to DESeq2?
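In other words, what I think I would need for an overall "does any level differ" test of a factor with more than 2 levels, assuming the same hypothetical three-level condition factor as in the sketch above:
# LRT for the overall effect of a multi-level factor:
# full model ~ condition vs intercept-only reduced model,
# i.e. H0: all condition coefficients are zero
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
res_lrt <- results(dds_lrt)   # one p-value per gene for the whole factor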