Different p-value using input files with pairwise and multiple condition design
Louis Kok • 0
@louis-kok-17311
Last seen 5.6 years ago
Singapore

Hi,

I have run a DE analysis using two different input files: one contains multiple conditions, and the other contains only the pairwise comparison between treatment and control. The p-values differ between the two runs.

For multiple conditions, the input file is as below:

condition       replicate       batch
Ctrl_A  1       1
T1_A    1       1
T2_A    1       1
Ctrl_B  1       1
T1_B    1       1
T2_B    1       1
Ctrl_C  1       1
T1_C    1       1
T2_C    1       1
Ctrl_D  1       1
T1_D    1       1
T2_D    1       1
Ctrl_A  2       2
T1_A    2       2
T2_A    2       2
Ctrl_B  2       2
T1_B    2       2
T2_B    2       2
Ctrl_C  2       2
T1_C    2       2
T2_C    2       2
Ctrl_D  2       2
T1_D    2       2
T2_D    2       2
Ctrl_A  3       3
T1_A    3       3
T2_A    3       3
Ctrl_B  3       3
T1_B    3       3
T2_B    3       3
Ctrl_C  3       3
T1_C    3       3
T2_C    3       3
Ctrl_D  3       3
T1_D    3       3
T2_D    3       3

I would like to compare only T1_D and Ctrl_D. The code is as below:

directory="./"
datList=read.table("multiple.input",header=TRUE)
sampleTable=data.frame(datList)

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                       directory = directory,
                                       design= ~ batch + condition)

ddsHTSeq$condition <- factor(ddsHTSeq$condition)
#dds <- estimateSizeFactors(ddsHTSeq)
#nc <- counts(dds, normalized=TRUE)
#filter <- rowSums(nc >= 1) >= 1
#dds <- dds[filter,]
dds <- DESeq(ddsHTSeq)
res <- results(dds, contrast=c("condition", "T1_D", "Ctrl_D"))
resOrdered <- res[order(res$pvalue),]
resSig <- subset(resOrdered, padj < 0.1)

 

For pairwise conditions (T1_D vs. Ctrl_D), the input file is as below:

condition       replicate       batch
Ctrl_D  1       1
T1_D    1       1
Ctrl_D  2       2
T1_D    2       2
Ctrl_D  3       3
T1_D    3       3

 

The code is as below:

directory="./"
datList=read.table("pairwise.input",header=TRUE)
sampleTable=data.frame(datList)

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                       directory = directory,
                                       design= ~ batch + condition)

ddsHTSeq$condition <- relevel(ddsHTSeq$condition, ref = "Ctrl_D")
dds <- DESeq(ddsHTSeq)
res <- results(dds)
resOrdered <- res[order(res$pvalue),]
resSig <- subset(resOrdered, padj < 0.1)

The number of significant genes differs between the two analyses, because the p-values and adjusted p-values change depending on whether the multiple-condition input or the pairwise input is used. Is there an error in the code? Thanks a lot.

deseq2 • 2.1k views
@mikelove
Last seen 31 minutes ago
United States

This is expected, and it's covered by one of the FAQs in the vignette.
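
A minimal sketch of why the two runs can disagree, under stated assumptions: simulated counts from `makeExampleDESeqDataSet` stand in for the HTSeq files (which are not available here), and the names `dds_all`, `dds_pair`, `res_all` and `res_pair` are made up for illustration. As I understand the vignette FAQ on analysing multiple groups together versus in pairs, DESeq2 estimates one dispersion per gene from all samples in the object, so the same T1_D vs. Ctrl_D contrast is tested against different dispersion estimates depending on which samples were loaded.

```r
library(DESeq2)

set.seed(1)

# Hypothetical stand-in for the HTSeq count files: 1000 simulated genes, 36 samples.
dds_all <- makeExampleDESeqDataSet(n = 1000, m = 36)

# Rebuild the sample layout from the question: 12 conditions x 3 batches.
conds <- as.vector(outer(c("Ctrl", "T1", "T2"), c("A", "B", "C", "D"), paste, sep = "_"))
dds_all$condition <- factor(rep(conds, times = 3))
dds_all$batch <- factor(rep(1:3, each = length(conds)))
design(dds_all) <- ~ batch + condition

# Pairwise subset (6 samples), taken before any model fitting.
dds_pair <- dds_all[, dds_all$condition %in% c("Ctrl_D", "T1_D")]
dds_pair$condition <- relevel(droplevels(dds_pair$condition), ref = "Ctrl_D")

# Fit each object separately; dispersions are estimated from the samples present.
dds_all <- DESeq(dds_all)
dds_pair <- DESeq(dds_pair)

res_all <- results(dds_all, contrast = c("condition", "T1_D", "Ctrl_D"))
res_pair <- results(dds_pair, contrast = c("condition", "T1_D", "Ctrl_D"))

# Same contrast, but the per-gene dispersions (and hence p-values) differ.
summary(dispersions(dds_all) / dispersions(dds_pair))
cor(res_all$pvalue, res_pair$pvalue, use = "complete.obs")
```

On real data, comparing `dispersions()` between the two fitted objects in the same way should show whether the difference in significant genes comes from the dispersion estimates rather than from a coding error.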


Thanks Michael. 
