Entering edit mode
When changing the order of samples (ensuring that colData and countData are in the same order), this influences the DE results. Very subtly (something like in the 9th digit) - but why does it happen?
Using the pasilla dataset and the code from the vignette :
options(digits = 15)
library(DESeq2)
library(pasilla)
pasCts <- system.file("extdata",
"pasilla_gene_counts.tsv",
package="pasilla", mustWork=TRUE)
pasAnno <- system.file("extdata",
"pasilla_sample_annotation.csv",
package="pasilla", mustWork=TRUE)
cts <- as.matrix(read.csv(pasCts,sep="\t",row.names="gene_id"))
coldata <- read.csv(pasAnno, row.names=1)
coldata <- coldata[,c("condition","type")]
coldata$condition <- factor(coldata$condition)
coldata$type <- factor(coldata$type)
rownames(coldata) <- sub("fb", "", rownames(coldata))
cts <- cts[, rownames(coldata)]
dds <- DESeqDataSetFromMatrix(countData = cts,
colData = coldata,
design = ~ condition)
dds <- DESeq(dds)
res <- results(dds)
print(head(res$log2FoldChange),digits=15)
[1] -1.02604541037965413 -0.00215142369260044
[3] 0.49673556850473838 1.88276170249200669
[5] 0.24002523000310516 0.10479911223675623
When I randomly shuffle coldata and ensure that the counts is in the same order as count data, the logfold changes (very slightly). Why?
coldata2 <- coldata[sample(1:nrow(coldata),replace=F),]
setequal(rownames(coldata2), colnames(cts))
cts2 <- cts[,rownames(coldata2)]
dds2 <- DESeqDataSetFromMatrix(countData = cts2,
colData = coldata2,
design = ~ condition)
dds2 <- DESeq(dds2)
res2 <- results(dds2)
print(head(res2$log2FoldChange),digits=15)
[1] -1.02604541037965524 -0.00215142640531776
[3] 0.49673554526085623 1.88276152384409690
[5] 0.24002523019401523 0.10479911227720901