I ran DESeq2 version 2.12.2 with the following code:
Meta<-read.csv("BAT_Meta.csv",header=TRUE,row.names = 1) Raw<-read.csv("BAT_Raw.csv",header=TRUE,row.names = 1) dds<-DESeqDataSetFromMatrix(countData = Raw,colData = Meta,design = ~condition) dds$condition<-factor(dds$condition, levels=c("old","young")) dds<-DESeq(dds) res<-results(dds) resOrdered<-res[order(res$padj),] summary(res) write.csv(as.data.frame(resOrdered),file="BAT_DGE.csv")
summary(res) gives the following:
out of 21981 with nonzero total read count adjusted p-value < 0.1 LFC > 0 (up) : 0, 0% LFC < 0 (down) : 0, 0% outliers [1] : 2247, 10% low counts [2] : 0, 0% (mean count < 0) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
Note that I expect a large number of outliers because this was only sequenced to 3M reads/sample. When I view the baseMean column of res, it appears genes with low mean counts are not assigned NA as they should be. I've run the same code on 16 datasets, and 3 of them have this problem. I got about 19.5k genes assigned padj in this dataset, whereas in a dataset that works, only about 12k genes were assigned a padj. Here is the summary(res) of a working dataset:
out of 22097 with nonzero total read count adjusted p-value < 0.1 LFC > 0 (up) : 352, 1.6% LFC < 0 (down) : 127, 0.57% outliers [1] : 2102, 9.5% low counts [2] : 7990, 36% (mean count < 4) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
I've even copied the counts from the dataset that doesn't work over the counts of a dataset that does work, keeping everything else constant, and I still get this problem. It appears that there is something about the counts data itself which prevents independent filtering. If I pre-filter with the code:
dds <- dds[ rowSums(counts(dds)) > 1, ]
summary(res) gives:
out of 21094 with nonzero total read count adjusted p-value < 0.1 LFC > 0 (up) : 0, 0% LFC < 0 (down) : 0, 0% outliers [1] : 2246, 11% low counts [2] : 0, 0% (mean count < 0) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
I've read through the DESeq2 vignette and have searched for others with this problem, but no luck. Any ideas?