DESeq2 contrast gives different results for the same comparison than the standard analysis
Letícia ▴ 10
@leticia-14144
Brazil/Goiânia/Universidade Federal de …

Hi, 

I tried to analyze RNA-seq data using DESeq2.

I have two treatment conditions, 1 mg/mL and 2 mg/mL, and a control at 0 mg/mL.

 

When I tried to identify differentially expressed genes between 1 mg/mL and 0 mg/mL using the standard analysis (only those two groups loaded), I observed:

out of 12632 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)     : 2036, 16%
LFC < 0 (down)   : 1681, 13%
outliers [1]     : 1, 0.0079%
low counts [2]   : 1462, 12%
(mean count < 2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

For the standard analysis I used this code:

library(DESeq2)
# 'directory' (the folder containing the HTSeq count files) must already be defined
sampleFiles <- c("file1.counts", "file2.counts", "file3.counts", "file4.counts", "file6.counts")
sampleCondition <- c("0", "0", "0", "1", "1")
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)
# Make "0" the reference level so that log fold changes are 1 vs 0
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition, levels = c("0", "1"))
dds <- DESeq(ddsHTSeq)
alpha <- 0.05
res0.05 <- results(dds, alpha = alpha)

 

When I tried to identify differentially expressed genes between 1 mg/mL and 0 mg/mL using a contrast (with all three groups loaded), I observed:

out of 12730 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)     : 2163, 17%
LFC < 0 (down)   : 1822, 14%
outliers [1]     : 1, 0.0079%
low counts [2]   : 1474, 12%
(mean count < 2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

For the contrast analysis I used this code:

library(DESeq2)
# 'directory' (the folder containing the HTSeq count files) must already be defined
sampleFiles <- c("file1.counts", "file2.counts", "file3.counts", "file4.counts", "file6.counts", "file7.counts", "file8.counts")
sampleCondition <- c("0", "0", "0", "1", "1", "2", "2")
sampleName <- c("CdC1", "CdC2", "CdC3", "Cd1_1", "Cd1_3", "Cd2_1", "Cd2_2")
sampleTable <- data.frame(sampleName = sampleName, fileName = sampleFiles, condition = sampleCondition)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)
# Make "0" the reference level; all three groups stay in the model
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition, levels = c("0", "1", "2"))
dds <- DESeq(ddsHTSeq)
alpha <- 0.05
# Extract the 1 vs 0 comparison from the three-group fit
res <- results(dds, contrast = c("condition", "1", "0"), alpha = alpha)

 

I don't understand why the numbers of differentially expressed genes differ between the two analyses if the comparison is the same, as I think it is.

Can someone explain why they differ, or what I am doing wrong?

 

*I am using the latest version of RStudio and DESeq2.

 

Thank you very much.

Letícia

rnaseq deseq2 • 1.3k views
@mikelove
United States

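In the second analysis you added more samples (the 2 mg/mL group), and these samples were used to estimate the dispersion, in addition to the first two groups of samples. The dispersion estimates therefore differ between the two fits, and so do the p-values for the same 1 vs 0 comparison.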

We actually have a FAQ about this in the DESeq2 vignette. 

Our recommendation in that FAQ is to use all the samples from all the groups, and then use the 'contrast' argument of results() to compare groups (i.e. your second analysis here, which has a little more power to detect differences).
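To make the point concrete, here is a minimal sketch on simulated counts. Everything in it is an assumption for illustration only: the count matrix, the gene and sample names, and the negative binomial parameters are made up and are not the poster's data. It fits the model once on all seven samples and extracts pairwise comparisons with 'contrast', then fits the two-group subset separately and compares the dispersion estimates of the two fits.

```r
library(DESeq2)

set.seed(1)
n_genes <- 1000
condition <- factor(c("0", "0", "0", "1", "1", "2", "2"),
                    levels = c("0", "1", "2"))
n_samples <- length(condition)

# Hypothetical data: negative binomial counts with gene-specific means
# and no true differences between groups
gene_mean <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(gene_mean, times = n_samples), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)
coldata <- data.frame(condition = condition, row.names = colnames(counts))

# Recommended approach: one fit on all samples, then one contrast per comparison
dds_all <- DESeq(DESeqDataSetFromMatrix(counts, coldata, design = ~ condition))
res_1_vs_0 <- results(dds_all, contrast = c("condition", "1", "0"), alpha = 0.05)
res_2_vs_0 <- results(dds_all, contrast = c("condition", "2", "0"), alpha = 0.05)

# Alternative: fit only the 0 and 1 mg/mL samples
keep <- coldata$condition %in% c("0", "1")
coldata_sub <- droplevels(coldata[keep, , drop = FALSE])
dds_sub <- DESeq(DESeqDataSetFromMatrix(counts[, keep], coldata_sub,
                                        design = ~ condition))
res_sub <- results(dds_sub, alpha = 0.05)

# The per-gene dispersion estimates differ between the two fits,
# which is why the same 1 vs 0 comparison gives different p-values
summary(dispersions(dds_all) - dispersions(dds_sub))
```

Calling summary(res_1_vs_0) and summary(res_sub) on real data would give two different tables like the ones in the question. The number of genes "with nonzero total read count" can also change, because a gene with zero counts in the five-sample subset may have reads in the added samples.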


Got it. Thank you for the quick answer!
