Question: DESeq2 output number of genes padj <0.1 is 0?
7 months ago by
hs.lansdell10 wrote:

Morning! I've just run DESeq2 on my RNA-seq data with a dichotomous outcome, and the results say I have absolutely no differentially expressed genes...

My input is a count matrix with samples in columns and genes in rows, i.e.:

        XXX1  XXX2  XXX3
Gene 1
Gene 2
Gene 3

My sample information table:

      condition
XXX1  y
XXX2  y
XXX3  n

My code:

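library(DESeq2)

# Drop the first character of each column name (e.g. a prefix added on import)
colnames(data) <- substring(colnames(data), 2)

# Double-check that sample names match between colData and the count matrix
all(rownames(colData) == colnames(data))

dds <- DESeqDataSetFromMatrix(countData = data, colData = colData, design = ~ condition)

# Set the reference level; these levels must match the values in colData$condition
dds$condition <- factor(dds$condition, levels = c("no", "yes"))

dds <- DESeq(dds)
res <- results(dds)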

My results:

summary(res)

out of 20338 with nonzero total read count
LFC > 0 (up)     : 0, 0%
LFC < 0 (down)   : 0, 0%
outliers [1]     : 0, 0%
low counts [2]   : 0, 0%
(mean count < 0)

[1] 0

Since it doesn't follow to have 0 differentially expressed genes, I'm not sure what I've done wrong.

Thanks!

7 months ago by
Michael Love18k (United States) wrote:

"Since it doesn't follow to have 0 differentially expressed genes, I'm not sure what I've done wrong."

I presume by this that you expect there to be some differentially expressed genes comparing the two Y samples to the one N sample.

The reason you cannot detect DE here (presuming there are true differences) is due to the sample size.

2 vs 1 is actually the absolute minimum for the software to be able to estimate variance. I would consider a sample size of 3 vs 3 a practical minimum to have some power to detect large effect sizes, and that only works if there is limited biological variability.

Here is a paper which explores sensitivity as a function of sample size for RNA-seq:

https://www.ncbi.nlm.nih.gov/pubmed/27022035
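
To get a feel for the sample-size point, here is a minimal simulation sketch. It assumes the DESeq2 package is installed and uses its makeExampleDESeqDataSet() simulator. The helper name count_hits, the number of genes, the betaSD value and the seed are illustrative choices, not anything taken from the data in this thread.

```r
library(DESeq2)

# Hypothetical helper: simulate counts for m samples split into two groups,
# run the standard DESeq2 pipeline, and count genes with padj < 0.1.
count_hits <- function(m, n_genes = 1000, beta_sd = 1) {
  # betaSD > 0 gives the simulated genes non-zero true log2 fold changes
  dds <- makeExampleDESeqDataSet(n = n_genes, m = m, betaSD = beta_sd)
  dds <- DESeq(dds, quiet = TRUE)
  res <- results(dds)
  sum(res$padj < 0.1, na.rm = TRUE)
}

set.seed(1)
hits_small <- count_hits(m = 3)   # 2 vs 1 samples
hits_large <- count_hits(m = 12)  # 6 vs 6 samples

cat("2 vs 1:", hits_small, "genes with padj < 0.1\n")
cat("6 vs 6:", hits_large, "genes with padj < 0.1\n")
```

The exact counts depend on the seed and the simulation settings, so treat the comparison as qualitative only.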

So, that was just to show how my files are arranged. My input has 183 samples and 20338 genes.
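
With 183 samples, sample size is unlikely to be the explanation, so the setup itself is worth checking. One thing to rule out, assuming the real condition column is coded y/n as in the toy table above while the code sets the levels to "no"/"yes": factor() turns every value that is not among the supplied levels into NA. I can't say that this is what happened here. The base R sketch below, with made-up values, only shows the behaviour.

```r
# Toy condition column coded as in the example table (made-up values)
condition <- c("y", "y", "n")

# Levels that do not match the values present: every entry becomes NA
mismatched <- factor(condition, levels = c("no", "yes"))
print(table(mismatched, useNA = "ifany"))

# Levels that match the values present: "n" becomes the reference level
matched <- factor(condition, levels = c("n", "y"))
print(table(matched, useNA = "ifany"))
```

On the real object, running table(dds$condition, useNA = "ifany") before calling DESeq() would show whether both groups are present and how many samples each has.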