Question

DESeq and EdgeR log fold differences


Ioannis Vlachos ▴ 40

@ioannis-vlachos-5634

Last seen 11.2 years ago

Hello everyone,

I thought of conducting a parallel DE analysis with edgeR and DESeq, using a dataset that I have been working on lately.

The dataset has two conditions with two biological replicates each. Let's say: Wild Type, Wild Type, Knock Out, Knock Out. It's a smallRNA-Seq dataset, mapped to miRNAs.

I have tried various analyses using both programs, and I have noticed very large differences in fold changes for some miRNAs between the two programs, even when using "RLE" for edgeR normalization.

Example:

DESeq code:

    countDataSet = newCountDataSet(DATA, condition)
    countDataSet = estimateSizeFactors(countDataSet)
    countDataSet = estimateDispersions(countDataSet)
    difexp = nbinomTest(countDataSet, "WildType", "KnockOut")

One of the results is:

    id   baseMean     baseMeanA    baseMeanB    foldChange  log2FoldChange  pval         padj
    100  623.8597966  349.3576527  898.3619406  2.57146776  1.362592066     0.001303353  0.182310802

And the size factors for DESeq are:

    sizeFactors(countDataSet)
    KO1        KO2    WT1     WT2
    1.2969960  1.052  0.8850  0.84442

OK. So far so good. edgeR now:

    dge <- DGEList(counts=DATA, group=condition)
    dge <- calcNormFactors(dge)
    dge <- estimateCommonDisp(dge, verbose=TRUE)
    dge <- estimateTagwiseDisp(dge, verbose=TRUE)
    et <- exactTest(dge)

Which results in:

         logFC   logCPM    PValue    FDR
    100  -2.750  5.814103  9.40E-09  1.20E-05

With:

    dge$samples
         group  lib.size  norm.factors
    WT1  1      2796302   0.9922204
    WT2  1      2610244   0.9928183
    KO1  2      3999488   1.0248098
    KO2  2      3349646   0.9905555

So the logFC is 1.3 in DESeq and 2.75 in edgeR (in absolute value). These results remain practically the same even when using:

    dge <- calcNormFactors(dge.RLE, method="RLE")

         logFC      logCPM    PValue        FDR
    210  -2.775952  5.823856  8.047631e-09  1.030902e-05

    group     lib.size  norm.factors
    KnockOut  3999488   1.0147156
    KnockOut  3349646   0.9830163
    WildType  2796302   0.9903844
    WildType  2610244   1.0122579

This entry has the following raw tag counts:

    KO   KO   WT   WT
    131  123  287  195

Any thoughts on why I get a logFC of 1.3 vs 2.7?

Thanks a lot,
Best Regards,
Ioannis

Normalization edgeR DESeq • 2.7k views

updated 12.9 years ago by Gordon Smyth 53k • written 12.9 years ago by Ioannis Vlachos ▴ 40

score 0 · Answer 1 · 2012-11-29

Hi Ioanni,

Quick thought:

On Thu, Nov 29, 2012 at 5:06 PM, Ioannis Vlachos <iv at on.gr> wrote:
> Hello everyone,
>
> I thought of conducting a parallel DE analysis with EdgeR and DESeq using a
> dataset that I have been working on lately.
>
> The dataset has two conditions with two biological replicates each.
>
> Let's say: Wild Type, Wild Type, Knock Out, Knock Out.
>
> It's a smallRNA-Seq dataset, mapped to miRNAs.
>
> I have tried various analyses using both programs and I have noticed this.
>
> There are very large differences in fold changes for some miRNAs between the
> two programs, even when using "RLE" for EdgeR normalization.
[snip]

Have you tried doing a scatterplot of the DESeq logFC vs the edgeR logFC for the same genes? Is the range systematically compressed in one comparison vs. the other? Maybe it's a function of the `baseMean` values?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
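The comparison suggested above could be set up roughly as follows. This is a minimal sketch, not code from the thread: `compare_lfc` is a hypothetical helper, and the two small data frames contain invented values that only make the example runnable. In a real session the first argument would presumably be the `difexp` data frame returned by `nbinomTest` and the second the `et$table` data frame from `exactTest`, assuming the row names of the latter are the miRNA IDs.

```r
# Hypothetical helper: align per-miRNA log2 fold changes from the two
# packages by feature ID rather than by row position.
compare_lfc <- function(deseq_res, edger_tab) {
  deseq <- data.frame(id = as.character(deseq_res$id),
                      base_mean = deseq_res$baseMean,
                      lfc_deseq = deseq_res$log2FoldChange)
  edger <- data.frame(id = rownames(edger_tab),
                      lfc_edger = edger_tab$logFC)
  merge(deseq, edger, by = "id")  # keeps only IDs present in both tables
}

# Toy inputs with INVENTED values, in deliberately different row orders.
deseq_res <- data.frame(id = c("a", "b", "c"),
                        baseMean = c(600, 50, 1200),
                        log2FoldChange = c(1.4, -0.8, 0.2))
edger_tab <- data.frame(logFC = c(0.3, -1.5, -0.9),
                        row.names = c("c", "a", "b"))

both <- compare_lfc(deseq_res, edger_tab)
print(both)

# Scatterplot of one package against the other, with y = x and y = -x guides.
if (interactive()) {
  plot(both$lfc_deseq, both$lfc_edger,
       xlab = "DESeq log2FoldChange", ylab = "edgeR logFC")
  abline(0, 1, lty = 2)   # agreement
  abline(0, -1, lty = 3)  # same magnitude, opposite sign convention
}
```

Points falling along the y = -x guide would suggest that the two calls use opposite reference groups, while a systematic trend against `base_mean` would point to an abundance-dependent effect.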

score 0 · Answer 2 · 2012-11-30

Dear Ioannis,

Questions like this require reproducible code examples that other people can run for themselves to confirm the results. See the posting guide.

As it is, the inconsistencies in the three sets of output you give (samples in different orders, groups with different names, same miR with different IDs, logFC of different signs, etc) do not give a reader any confidence that the analyses are actually comparable.

According to the DESeq output you give, this particular miR is much lower in conditionA (wildtype) than in conditionB (knockout), but the raw data you give show very clearly that the opposite is true. It may be that you have simply given different data to the two packages or else have aligned the results incorrectly.

Best wishes
Gordon

---------------- original message -----------------
[BioC] DESeq and EdgeR log fold differences
Ioannis Vlachos iv at on.gr
Thu Nov 29 23:06:17 CET 2012

[snip]
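The direction of change implied by the posted raw counts can be checked with a few lines of base R. This is only a rough sketch: it assumes the four counts are listed in the order KO1, KO2, WT1, WT2, uses the library sizes from the posted `dge$samples` output, and ignores the normalization factors, dispersion estimates and other adjustments that DESeq and edgeR apply. It is therefore not the calculation either package performs.

```r
# Raw counts for the miRNA as posted; the sample order (KO1, KO2, WT1, WT2)
# is an ASSUMPTION based on the order of the column labels in the question.
counts <- c(KO1 = 131, KO2 = 123, WT1 = 287, WT2 = 195)

# Library sizes copied from the dge$samples output in the question.
lib_size <- c(KO1 = 3999488, KO2 = 3349646, WT1 = 2796302, WT2 = 2610244)

group <- factor(c("KO", "KO", "WT", "WT"))

# Counts per million, ignoring the TMM/RLE normalization factors.
cpm <- counts / lib_size * 1e6

# Mean CPM per group and a naive log2 ratio of knockout over wild type.
group_mean <- tapply(cpm, group, mean)
naive_lfc <- as.numeric(log2(group_mean["KO"] / group_mean["WT"]))

print(round(cpm, 1))
print(naive_lfc)
```

A negative value here means the miRNA is lower in the knockout samples than in wild type, which is the direction the raw counts suggest and the opposite of the posted `baseMeanA`/`baseMeanB` values.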