Question: DESeq2 output number of genes padj <0.1 is 0?
19 days ago, hs.lansdell wrote:

Morning! I've just run DESeq2 on my RNA-seq data with a dichotomous outcome, and the results suggest I have absolutely no differentially expressed genes...

My input is a count matrix with samples in columns and genes in rows, i.e.:

        XXX1  XXX2  XXX3
Gene 1
Gene 2
Gene 3

My sample information table:

      condition
XXX1  y
XXX2  y
XXX3  n

My code:

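library(DESeq2)

# Count matrix: genes in rows, samples in columns
data<-read.csv("Input.csv", header=TRUE, row.names = 1, stringsAsFactors = FALSE)
# Strip the first character of each column name (presumably the "X" that
# read.csv prepends to names starting with a digit)
colnames(data) <- substring(colnames(data), 2)

colData<-read.csv("Condition.csv",header = TRUE, row.names = 1)
#Double check names match up between Sample and data matrix
all(rownames(colData)==colnames(data))

dds<-DESeqDataSetFromMatrix(countData = data, colData = colData, design= ~condition)

# Set the reference level; these levels must match the values in Condition.csv exactly
dds$condition <- factor(dds$condition, levels = c("no","yes"))

dds<-DESeq(dds)
res<-results(dds)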
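
One thing worth checking in code like the above: the sample table shown codes the condition as y/n, while the factor() call uses "no"/"yes". If the real Condition.csv used y/n, factor() would silently turn every value into NA. A minimal base-R sketch of that behaviour, using a made-up three-sample vector (an assumption for illustration, not the actual data):

```r
# Hypothetical condition column, coded as in the sample table above
condition <- c("y", "y", "n")

# Levels that do not match the coded values: every entry becomes NA
mismatched <- factor(condition, levels = c("no", "yes"))

# Levels that match the coded values: nothing is lost, "n" is the reference
matched <- factor(condition, levels = c("n", "y"))

# Quick checks to run before building the DESeqDataSet
any_lost <- anyNA(mismatched)
print(any_lost)
print(table(matched, useNA = "ifany"))
```

Running the same two checks on colData$condition before the DESeq() call would confirm that both groups are present and that no samples were dropped to NA.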

My results:

summary(res)

out of 20338 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 0, 0% 
LFC < 0 (down)   : 0, 0% 
outliers [1]     : 0, 0% 
low counts [2]   : 0, 0% 
(mean count < 0)

sum(res$padj < 0.1, na.rm=TRUE)
[1] 0

Since it doesn't follow to have 0 differentially expressed genes, I'm not sure what I've done wrong.

Thanks!

19 days ago, Michael Love (United States) wrote:

"Since it doesn't follow to have 0 differentially expressed genes, I'm not sure what I've done wrong."

I presume by this that you expect there to be some differentially expressed genes when comparing the two Y samples to the one N sample.

The reason you cannot detect DE here (presuming there are true differences) is due to the sample size.

2 vs 1 is actually the absolute minimum for the software to be able to estimate variance. I would consider a sample size of 3 vs 3 a practical minimum to have some power to detect large effect sizes, and that only works if there is limited biological variability.

Here is a paper which explores sensitivity as a function of sample size for RNA-seq:

https://www.ncbi.nlm.nih.gov/pubmed/27022035
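
To make the sample-size point concrete, here is a rough base-R simulation. It is only a sketch under stated assumptions: it draws negative binomial counts for a single gene with a true 2-fold change and runs a Welch t-test on log2 counts, which is not what DESeq2 does internally. The helper name power_at_n and all parameter values (mean 100, dispersion 1/size = 0.2, 500 simulated genes) are made up for illustration.

```r
set.seed(1)

# Hypothetical helper: fraction of simulated genes with a true fold change
# that reach p < 0.05 in a Welch t-test on log2(count + 1)
power_at_n <- function(n_per_group, n_genes = 500, mu = 100, fold = 2, size = 5) {
  pvals <- replicate(n_genes, {
    a <- rnbinom(n_per_group, mu = mu, size = size)         # group "no"
    b <- rnbinom(n_per_group, mu = mu * fold, size = size)  # group "yes"
    t.test(log2(a + 1), log2(b + 1))$p.value
  })
  mean(pvals < 0.05)
}

group_sizes <- c(3, 5, 10, 30)
power <- sapply(group_sizes, power_at_n)
names(power) <- paste0(group_sizes, " vs ", group_sizes)
print(round(power, 2))
```

The estimated detection rate should climb steeply with the number of samples per group, which is the qualitative pattern described above; the exact numbers depend entirely on the assumed mean, dispersion and fold change.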


So, that was just to show how my files are arranged. My input has 183 samples and 20338 genes. 

(reply by hs.lansdell, 19 days ago)

And how do you know that the two groups have any differences in gene expression?

(reply by Michael Love, 19 days ago)
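
As a rough illustration of why that question matters: when the two groups truly do not differ, a result with zero (or almost zero) genes at padj < 0.1 is what a correctly run analysis should produce. The sketch below uses DESeq2's makeExampleDESeqDataSet() to simulate such a dataset; the sizes (1000 genes, 12 samples) and the object names dds_null and res_null are arbitrary choices for illustration, not a model of the 183-sample dataset.

```r
library(DESeq2)

set.seed(1)

# Simulated data: 1000 genes, 12 samples split into conditions A and B.
# betaSD = 0 means no true log fold change between the conditions.
dds_null <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 0)
dds_null <- DESeq(dds_null)
res_null <- results(dds_null)

# Number of genes called at padj < 0.1; expected to be at or near zero here
n_sig <- sum(res_null$padj < 0.1, na.rm = TRUE)
print(n_sig)

# With no true signal, the raw p-values should look roughly uniform
hist(res_null$pvalue, breaks = 20,
     main = "p-values with no true differences", xlab = "p-value")
```

Drawing the same histogram from res$pvalue on the real data is a quick diagnostic: a roughly flat histogram is consistent with little or no differential expression, while a spike near zero suggests signal that may be lost at the multiple-testing step.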