I am completely new to using R and DeSeq2 for RNA-seq analysis. My experimental set up is such that I have 3 biological conditions and 2 replicates per condition. I used featurescount to count transcript per genomic interval. I then plugged this into the following script:
library(DESeq2)
countdata <- read.table("featurescountmatrix.txt", header=TRUE, row.names=1)
countdata <- countdata[ ,6:ncol(countdata)]
colnames(countdata) <- gsub(".[sb]am$", "", colnames(countdata))
countdata <- as.matrix(countdata)
(condition <- factor(c(rep("cond1", 2), rep("cond2", 2),rep("cond3", 2))))
(coldata <- data.frame(row.names=colnames(countdata), condition))
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~condition)
dds <- DESeq(dds)
res <- results(dds,contrast=c("cond1","cond2","cond3"))
table(res$padj<0.05)
res <- res[order(res$padj), ]
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE)
names(resdata)[1] <- "Gene"
head(resdata)
write.csv(resdata, file="cond123diffexpr-results.csv")
I do not understand, whether from the code above, the p-value is calculated based on the statistical difference between cond1 and cond3 or all three conditions??