Dear all,
I am trying to use DESeq2 to analyze my data. I have 2 conditions (wild type and mutant), and 3 samples for each condition. My samples are paired: wt1 with mutant1, wt2 with mutant2, wt3 with mutant3.
I am using R version 3.2.0 (2015-04-16) and DESeq2_1.8.1.The first script I wrote starts as followed:
d <- read.delim(infiles) m <- c(rep("MU",3),rep("WT",3)) cd <- data.frame(id=colnames(d), line=m, pair=c(1,2,3,1,2,3)) cd$line <- relevel(cd$line, ref="WT") dds <- DESeqDataSetFromMatrix(countData = d, colData = cd, design = ~ pair + line) dds <- DESeq(dds) res <- results(dds) sig <- res[which(res$padj<0.01),]
It ran and gave me a list of differentially expressed genes. However, reading the DESeq2 manual, I realized that "pair" should be a factor, rather than a numeric. So I reran the previous script, with a modified 3rd line:
cd <- data.frame(id=colnames(d), line=m, pair=factor(c(1,2,3,1,2,3)))
With this factor, about 150 genes which were previously DE are not DE anymore. I don't really understand what is causing this difference. Could you tell me if my design is correct?
Thank you Simon. I was not especially surprised to get different results, but I wanted to know the reason behind (and you explained it in your long answer).
For the threshold, we usually use 0.05 in our analyses. The script posted above is truncated, it creates several output files with different cut-offs.