Hi
I'm trying to work with DESeq2, i already worked with edgeR and DESeq1 with relatively sucess.
They give me the same results as logFC (+ or -) but, when i work with DESeq2, i have different results. Every gene who are positive for edgeR and DESeq1 are negative for DESeq2 and vice-versa.
DESeq2 v. 1.14.1
Rstudio v.1.0.136
I'll post my pipeline, thanks for your help!
bckCountTable <- read.table("DF.txt", header=TRUE, row.names=1)
head(bckCountTable)
plot(bckCountTable)
samples <- data.frame(row.names=c("CTL1", "CTL2", "CTL3", "HT1","HT2","HT3"), condition=as.factor(c(rep("R",3),rep("IR",3))))
samples
bckCDS <- DESeqDataSetFromMatrix(countData = bckCountTable, colData=samples, design=~condition)
bckCDS <- bckCDS[ rowSums(counts(bckCDS)) > 1, ]
bckCDS_1 <- DESeq(bckCDS)
res <- results(bckCDS_1, alpha=0.05, betaPrior = F, contrast=c("condition", "CTL", "HT"), pAdjustMethod = "BH")
rld <- rlogTransformation(bckCDS)
print(plotPCA(rld, intgroup=c('condition')))
head(assay(rld))
hist(assay(rld))
plotDispEsts(bckCDS_1)
bckCDS_1 <- estimateSizeFactors(bckCDS_1)
head(res)
table(res$padj<0.05)
resdata <- merge(as.data.frame(res), as.data.frame(counts(bckCDS_1, normalized=TRUE)), by="row.names", sort=FALSE)
names(resdata)[1] <- "Gene"
head(resdata)
X = as.data.frame(res)
plotSparsity(X, normalized=T)
boxplot(bckCountTable)
summary(res)
plotMA(res, ylim=c(-15,15))
topGene <- rownames(res)[which.min(res$pvalue)]
with(res[topGene, ], {
points(baseMean, log2FoldChange, col="dodgerblue", cex=2, lwd=2)
text(baseMean, log2FoldChange, topGene, pos=2, col="dodgerblue")
})
hist(X)
mat = assay(rld)[head(order(res$padj),30),]
mat = mat - rowMeans(mat)
pheatmap(mat)
simple <- ggplot(X, aes(x = log2FoldChange, y = stat)) + geom_point(size = 3, alpha = 0.7, na.rm = T) + theme_bw(base_size = 16) + geom_text_repel(aes(log2FoldChange, stat, label = rownames(bck_res)), colour="black", segment.colour="grey")
simple
In your
contrast=c("condition", "CTL", "HT")
you're saying that you're looking at fold changes relative to "HT" - the second element in the character vector is the numerator, and the third is the level that will be denominator (or 'be subtracted' on the log scale). Guessing that "CTL" means control, and that normally we want fold-changes relative to control, it looks like have got the order of those two levels the unintended way round, so just settingcontrast=c("condition", "HT", "CTL")
should get you back to the more intuitive interpretation that you've already got in edgeR and DESeq1?Thank you Gavin!!
My summary before was
out of 20088 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up) : 17, 0.085%
LFC < 0 (down) : 12, 0.06%
outliers [1] : 48, 0.24%
low counts [2] : 7006, 35%
(mean count < 9)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
NOW is
out of 20088 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up) : 12, 0.06%
LFC < 0 (down) : 17, 0.085%
outliers [1] : 48, 0.24%
low counts [2] : 7006, 35%
(mean count < 9)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
The truth is in minimal details! haha
Really thank for your time.