DESeq2 output number of genes padj <0.1 is 0?
hs.lansdell ▴ 20
@hslansdell-14246

Morning! I've just run DESeq2 on my RNA-seq data with a dichotomous outcome, and the results say I have absolutely no differentially expressed genes...

My input is a count matrix with samples in columns and genes in rows, i.e:

          XXX1  XXX2  XXX3
Gene 1
Gene 2
Gene 3

My sample information table:

          condition
XXX1      y
XXX2      y
XXX3      n

My code:

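library(DESeq2)

# Count matrix: genes in rows, samples in columns
data<-read.csv("Input.csv", header=TRUE, row.names = 1, stringsAsFactors = FALSE)
# Drop the first character of each column name (presumably the "X" that read.csv prepends to names starting with a digit)
colnames(data) <- substring(colnames(data), 2)

colData<-read.csv("Condition.csv",header = TRUE, row.names = 1)
#Double check names match up between Sample and data matrix
all(rownames(colData)==colnames(data))

dds<-DESeqDataSetFromMatrix(countData = data, colData = colData, design= ~condition)

# Make "no" the reference level; the levels must match the labels in Condition.csv exactly
dds$condition <- factor(dds$condition, levels = c("no","yes"))

dds<-DESeq(dds)
res<-results(dds)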
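Two input checks can rule out bookkeeping problems before looking at the statistics. The sketch below uses base R only and made-up toy tables; `counts`, `sample_info` and the labels in them are placeholders, not the data from this post. It shows how to put the sample table in the same order as the count columns, and what happens when the factor levels do not match the labels stored in the table.

```r
# Toy stand-ins for Input.csv and Condition.csv (hypothetical values).
counts <- matrix(c(10L, 0L, 5L, 12L, 3L, 7L, 8L, 1L, 6L), nrow = 3,
                 dimnames = list(c("Gene1", "Gene2", "Gene3"),
                                 c("XXX1", "XXX2", "XXX3")))
sample_info <- data.frame(condition = c("n", "y", "y"),
                          row.names = c("XXX3", "XXX1", "XXX2"))

# 1. Same samples in both tables, then reorder the rows to follow the columns.
stopifnot(setequal(rownames(sample_info), colnames(counts)))
sample_info <- sample_info[colnames(counts), , drop = FALSE]
names_match <- all(rownames(sample_info) == colnames(counts))

# 2. Levels that do not match the stored labels turn into NA without a warning.
bad  <- factor(sample_info$condition, levels = c("no", "yes"))
good <- factor(sample_info$condition, levels = c("n", "y"))
n_missing_bad <- sum(is.na(bad))   # every label is lost here
group_sizes   <- table(good)       # samples per group with matching levels
```

If `table(dds$condition)` on the real object shows the expected group sizes and no missing values, the factor step is fine and the cause lies elsewhere.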

My results:

summary(res)

out of 20338 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 0, 0% 
LFC < 0 (down)   : 0, 0% 
outliers [1]     : 0, 0% 
low counts [2]   : 0, 0% 
(mean count < 0)

sum(res$padj < 0.1, na.rm=TRUE)
[1] 0

Since it doesn't follow to have 0 differentially expressed genes, I'm not sure what I've done wrong.

Thanks!
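For context on the padj column: results() adjusts p-values with the Benjamini-Hochberg method by default, and that adjustment can leave nothing below 0.1 even when many raw p-values are small. The base R sketch below illustrates this under one assumed scenario, p-values drawn uniformly as they would be if no gene truly differed between groups. The simulated values are not the poster's results.

```r
set.seed(1)
n_genes <- 20338                  # number of tested genes reported in the post
p_null  <- runif(n_genes)         # assumption: no gene truly differs

# Benjamini-Hochberg adjustment, the same method name DESeq2 uses by default.
padj <- p.adjust(p_null, method = "BH")

n_raw  <- sum(p_null < 0.1)       # about 10% of genes pass by chance alone
n_padj <- sum(padj < 0.1)         # the adjusted count can never exceed n_raw
```

Comparing a histogram of `res$pvalue` with a flat one like `hist(p_null)` is a quick way to see whether the real data carry any signal before adjustment.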

deseq2 adjusted pvalue • 1.7k views
@mikelove

"Since it doesn't follow to have 0 differentially expressed genes, I'm not sure what I've done wrong."

I presume by this that you expect there to be some differentially expressed genes when comparing the two Y samples to the one N sample.

The reason you cannot detect DE here (presuming there are true differences) is due to the sample size.

2 vs 1 is actually the absolute minimum for the software to be able to estimate variance. A sample size of 3 vs 3 I would consider a practical minimum to have some power to detect large effect sizes, and that only works if there is limited biological variability.
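To get a rough feel for how power grows with group size, here is a small base R sketch. It is a stand-in only: it uses the two-sample t-test power formula from `power.t.test`, not DESeq2's negative binomial model, and the effect size (a difference of one standard deviation) and the group sizes are assumptions chosen for illustration.

```r
# Assumed scenario: true group difference equal to 1 standard deviation.
n_per_group <- c(3, 5, 10, 20, 50)

power <- sapply(n_per_group, function(n) {
  power.t.test(n = n, delta = 1, sd = 1, sig.level = 0.05)$power
})

# Power for a single test, before any multiple-testing correction.
power_table <- data.frame(n_per_group, power = round(power, 2))
print(power_table)
```

Correcting for thousands of genes makes the per-gene threshold stricter than 0.05, so the real power at small sample sizes is lower still.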

Here is a paper which explores sensitivity as a function of sample size for RNA-seq:

https://www.ncbi.nlm.nih.gov/pubmed/27022035


So, that was just to show how my files are arranged. My input has 183 samples and 20338 genes. 


And how do you know that the two groups have any differences in gene expression?
