DESeq2 vs edgeR results analysis
adR

Hi all, thank you for this platform. May I ask for a few minutes of your time? I used both DESeq2 and edgeR to analyze my RNA-seq data, and I found more significant genes with DESeq2 than with edgeR. The difference is about 1000, which seems high to me. The code I used is posted below; please point out my mistake in case I missed something. My variable (Sample) is continuous.

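## edgeR
x <- DGEList(counts = muscle, group = Sample)
design <- model.matrix(~Sample)
# estimate dispersions on the DGEList created above
x <- estimateDisp(x, design = design, robust = TRUE)
fit <- glmQLFit(x, design = design)
# glmQLFit only fits the model; glmQLFTest produces the $table of p-values
QL <- glmQLFTest(fit, coef = 2)   # coef = 2 is the Sample slope
table(p.adjust(QL$table$PValue, method = "BH") < 0.05) #### 5928 genes
### DESeq2
dds <- DESeqDataSetFromMatrix(countData = countData,
                              colData = colData,
                              design = ~ Sample)
dds <- DESeq(dds, fitType = "mean")
resultsNames(dds)
# store results under a new name so the Sample covariate is not overwritten
res <- results(dds)
sum(res$padj < 0.05, na.rm = TRUE) #### 6042 genes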

Thank you so much! Best, Amare

swbarnes2

They are different algorithms, so they are going to return different answers. Without knowing how many genes overlap between those two sets, I'd say both programs returned essentially the same result: about 6000 genes.
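To check the overlap mentioned above, the two sets of significant gene IDs can be compared with base R set operations. This is only a minimal sketch: the gene IDs are made-up placeholders, and the names `edger_sig` and `deseq2_sig` are hypothetical. In a real analysis they would be taken from the row names of each package's result table after filtering on the adjusted p-value.

```r
# Hypothetical gene IDs called significant (adjusted p < 0.05) by each tool.
# Placeholder values for illustration only, not data from this thread.
edger_sig  <- c("geneA", "geneB", "geneC", "geneD")
deseq2_sig <- c("geneA", "geneB", "geneC", "geneD", "geneE")

# Genes called by both tools, and genes unique to each tool
shared      <- intersect(edger_sig, deseq2_sig)
only_edger  <- setdiff(edger_sig, deseq2_sig)
only_deseq2 <- setdiff(deseq2_sig, edger_sig)

# TRUE when every edgeR call is also a DESeq2 call
edger_is_subset <- all(edger_sig %in% deseq2_sig)

# Jaccard index: shared genes divided by all genes called by either tool
jaccard <- length(shared) / length(union(edger_sig, deseq2_sig))

print(c(shared = length(shared),
        only_edgeR = length(only_edger),
        only_DESeq2 = length(only_deseq2)))
print(edger_is_subset)
print(jaccard)
```

A Jaccard index close to 1 means the two gene lists are nearly identical, which is more informative than comparing the two totals alone.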


I'm going to echo "swbarnes2". This has been asked and answered, even recently, on the support site. The methods are different, and it's a lot less interesting or surprising when you note that a method might call a gene DE because adj p = 0.04, and another method might call a gene not DE because adj p = 0.06.

Our recommendation on this site (here and in many previous threads) is to pick one tool and use it. It's not a good idea to cycle through various methods on the very dataset you intend to analyze. You can certainly do this on another dataset in order to choose which method to use, or just look at the dozens of papers comparing methods systematically on simulated and real datasets.


Thank you so much for your reply. All of the DE genes (adjusted p < 0.05) I found in the edgeR analysis are also present in the DESeq2 result (adjusted p < 0.05). Thanks!

@gordon-smyth

I answered your question two weeks ago: Design edger with one or more continues variables

I am astonished that you find a 2% difference in the number of DE genes to be large or surprising, especially considering that the edgeR QL method is specifically designed to offer more rigorous error rate control (be more conservative) than negative binomial DE pipelines based on likelihood ratio tests or Wald tests. The difference in DE genes is about 100, not 1000 as you say in your question. To me the results from the two packages seem remarkably consistent.
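For reference, the arithmetic behind the "about 100" and "2%" figures can be reproduced from the two counts reported in the question. The variable names here are only illustrative.

```r
# Counts of significant genes reported in the question
n_edger  <- 5928
n_deseq2 <- 6042

# Absolute difference in the number of DE genes
diff_genes <- n_deseq2 - n_edger

# Difference relative to the edgeR count, in percent
pct_diff <- 100 * diff_genes / n_edger

print(diff_genes)
print(round(pct_diff, 1))
```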


I really thank you so much!
Now it is corrected and as you said the difference is almost 100-150. My problem is now solved! Best!
