DESeq2 question regarding log2FC
1
0
Entering edit mode
@kuenne-carsten-6658
Last seen 9.6 years ago
Hi, I just compared the same dataset using DESeq 1 and DESeq 2. Strikingly, while the baseMeans are the same for the same gene, the log2FoldChange is actually different!? There must be an error in the R script I am using or how can that difference be explained? DESEQ1 id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval padj "real" log2FC comp587382_c0_seq1 64.669 13.909 115.429 8.300 3.053 0.000006 0.004475 3.052927 comp364180_c1_seq1 501.804 177.331 826.277 4.660 2.220 0.000004 0.002997 2.220183 DESEQ2 id baseMean baseMeanA baseMeanB log2FoldChange lfcSE pval padj "real" log2FC comp587382_c0_seq1 64.669 13.909 115.429 2.427 0.350 0.000000 0.000000 3.052927 comp364180_c1_seq1 501.804 177.331 826.277 1.924 0.299 0.000000 0.000000 2.220183 [] library(DESeq2) data = read.table("counts.matrix", header=T, row.names=1, com='') col_ordering = c(1,2,3,4) rnaseqMatrix = data[,col_ordering] rnaseqMatrix = round(rnaseqMatrix) rnaseqMatrix = rnaseqMatrix[rowSums(rnaseqMatrix)>=2,] conditions = data.frame(conditions=factor(c(rep("nha", 2), rep("nhc", 2)))) rownames(conditions) = colnames(rnaseqMatrix) ddsFullCountTable <- DESeqDataSetFromMatrix( countData = rnaseqMatrix, colData = conditions, design = ~ conditions) dds = DESeq(ddsFullCountTable) res = results(dds) baseMeanA <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$condition == "nha"]) baseMeanB <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$condition == "nhc"]) res = cbind(baseMeanA, baseMeanB, as.data.frame(res)) res = cbind(id=rownames(res), as.data.frame(res)) write.table(as.data.frame(res[order(res$pvalue),]), file='counts.matrix.nha_vs_nhc.DESeq2.DE_results', sep=' ', quote=FALSE, row.names=FALSE) [] Regards, Carsten [[alternative HTML version deleted]]
DESeq DESeq • 2.5k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 2 hours ago
United States
hi Carsten, See Figure 1: MA-plot in the vignette which describes the fold change shrinkage: vignette("DESeq2") and also check out preprint describing the methods (emphasis added on first 5 words). *Moderated estimation of fold change* and dispersion for RNA-Seq data with DESeq2 http://dx.doi.org/10.1101/002832 Mike On Wed, Jul 23, 2014 at 8:22 AM, Kuenne, Carsten <carsten.kuenne at="" mpi-bn.mpg.de=""> wrote: > Hi, > > I just compared the same dataset using DESeq 1 and DESeq 2. Strikingly, while the baseMeans are the same for the same gene, the log2FoldChange is actually different!? There must be an error in the R script I am using or how can that difference be explained? > > DESEQ1 > > > > > > > > > > > > id > > baseMean > > baseMeanA > > baseMeanB > > foldChange > > log2FoldChange > > pval > > padj > > "real" log2FC > > comp587382_c0_seq1 > > 64.669 > > 13.909 > > 115.429 > > 8.300 > > 3.053 > > 0.000006 > > 0.004475 > > 3.052927 > > comp364180_c1_seq1 > > 501.804 > > 177.331 > > 826.277 > > 4.660 > > 2.220 > > 0.000004 > > 0.002997 > > 2.220183 > > > > > > > > > > > > > DESEQ2 > > > > > > > > > > > > id > > baseMean > > baseMeanA > > baseMeanB > > log2FoldChange > > lfcSE > > pval > > padj > > "real" log2FC > > comp587382_c0_seq1 > > 64.669 > > 13.909 > > 115.429 > > 2.427 > > 0.350 > > 0.000000 > > 0.000000 > > 3.052927 > > comp364180_c1_seq1 > > 501.804 > > 177.331 > > 826.277 > > 1.924 > > 0.299 > > 0.000000 > > 0.000000 > > 2.220183 > > > > > [] > library(DESeq2) > data = read.table("counts.matrix", header=T, row.names=1, com='') > col_ordering = c(1,2,3,4) > rnaseqMatrix = data[,col_ordering] > rnaseqMatrix = round(rnaseqMatrix) > rnaseqMatrix = rnaseqMatrix[rowSums(rnaseqMatrix)>=2,] > conditions = data.frame(conditions=factor(c(rep("nha", 2), rep("nhc", 2)))) > rownames(conditions) = colnames(rnaseqMatrix) > ddsFullCountTable <- DESeqDataSetFromMatrix( > countData = rnaseqMatrix, > colData = conditions, > design = ~ conditions) > dds = DESeq(ddsFullCountTable) > res = results(dds) > baseMeanA <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$condition == "nha"]) > baseMeanB <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$condition == "nhc"]) > res = cbind(baseMeanA, baseMeanB, as.data.frame(res)) > res = cbind(id=rownames(res), as.data.frame(res)) > write.table(as.data.frame(res[order(res$pvalue),]), file='counts.matrix.nha_vs_nhc.DESeq2.DE_results', sep=' ', quote=FALSE, row.names=FALSE) > [] > > Regards, > Carsten > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT

Login before adding your answer.

Traffic: 758 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6