Hi,
Can someone explain why the number of differentially expressed genes at FDR<10% changes when I change the order of the variables in the design formula?
Option 1: condition given at the end of the design formula - 54 genes at FDR<10%
> dds=DESeqDataSetFromMatrix(countData=countData,colData=coldata,design=~Race+RIN_Acsg+GCpercent+condition)
> dds=DESeq(dds)
> ctrlBP=results(dds,contrast=c("condition","Bipolar","Control"))
> ctrlBP=ctrlBP[order(ctrlBP$padj),]
> summary(ctrlBP)
out of 21228 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 30, 0.14%
LFC < 0 (down) : 24, 0.11%
outliers [1] : 0, 0%
low counts [2] : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
Option 2: condition given at the beginning of the design formula - 24 genes at FDR<10%
> dds=DESeqDataSetFromMatrix(countData=countData,colData=coldata,design=~condition+Race+RIN_Acsg+GCpercent)
> dds=DESeq(dds)
> ctrlBP=results(dds,contrast=c("condition","Bipolar","Control"))
> ctrlBP=ctrlBP[order(ctrlBP$padj),]
> summary(ctrlBP)
out of 21228 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 13, 0.061%
LFC < 0 (down) : 11, 0.052%
outliers [1] : 0, 0%
low counts [2] : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
Thanks,
Nirmala
Hi Mike,
When I ran your example, the result was TRUE. I am not sure why I am getting different answers on my data. Any suggestions would be helpful.
Thanks,
Nirmala
Can you rerun your two code chunks above in a fresh R session to make sure you are actually getting different p-values with the different variable orderings?
Thanks for your help, Mike. In a new R session they are the same. Interesting!
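For anyone who wants to repeat this kind of check, here is a minimal sketch of comparing p-values across two orderings of the design formula. It is not the example referred to earlier in the thread. It assumes DESeq2 is installed and uses simulated data from makeExampleDESeqDataSet(); the `batch` column and the object names (dds1, dds2, res1, res2) are made up for illustration. Run it in a fresh R session so that no objects from an earlier fit are reused.

```r
library(DESeq2)

set.seed(1)

# Simulated counts: 500 genes, 12 samples, 'condition' with levels A and B
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical extra covariate, only here to give the design a second term
dds$batch <- factor(rep(c("x", "y"), times = 6))

# Same data, two orderings of the same terms
dds1 <- dds
design(dds1) <- ~ batch + condition
dds2 <- dds
design(dds2) <- ~ condition + batch

dds1 <- DESeq(dds1, quiet = TRUE)
dds2 <- DESeq(dds2, quiet = TRUE)

# An explicit contrast names the comparison, so it does not rely on term order
res1 <- results(dds1, contrast = c("condition", "B", "A"))
res2 <- results(dds2, contrast = c("condition", "B", "A"))

# Compare the raw p-values gene by gene (all.equal allows tiny numeric noise)
print(all.equal(res1$pvalue, res2$pvalue))
```

If this reports a difference on your own data, check that both fits started from the same count matrix and colData before looking at the design itself.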