Question

Understanding Contrasts when analyzing DESeq2 results

0

Entering edit mode

Amit Indap ▴ 10

@amit-indap-3958

Last seen 2.4 years ago

United States

Dear Bioconductor/DESeq2 community,

I am trying to analyze RNAseq data with DESseq2 and have a question about contrasts and interactions. I have RNAseq data on untreated and treated cells collected at 4 different time points. I have quantified my transcripts using salmon. Initially, I was performing DESeq2 analysis at each time point separately. For example, here is the sample file for testing for differential expression at 2hr:

sample,condition,replicate 1,mock.2hr,1 2,mock.2hr,2 3,mock.2hr,3 4,infected.2hr,1 5,infected.2hr,2 6,infected.2hr,3

Here is my code snippet

# set the factor levels
dds$condition <- factor(dds$condition, levels = c("mock.2hr","infected.2hr"))

#get rid of low count data
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]



#perform the dfe analysis based on negative binomial model and get the results
dds <- DESeq(dds)
res <- results(dds)

But then after reading the documentation on contrasts I re-ran my analysis with the following sample list:

sample,condition,replicate 1,mock.2hr,1 2,mock.2hr,2 3,mock.2hr,3 4,infected.2hr,1 5,infected.2hr,2 6,infected.2hr,3 7,mock.4hr,1 8,mock.4hr,2 9,mock.4hr,3 10,infected.4hr,1 11,infected.4hr,2 12,infected.4hr,3 13,mock.6hr,1 14,mock.6hr,2 15,mock.6hr,3 16,infected.6hr,1 17,infected.6hr,2 18,infected.6hr,3 19,mock.8hr,1 20,mock.8hr,2 21,mock.8hr,3 22,infected.8hr,1 23,infected.8hr,2 24,infected.8hr,3

Similar code snippet

dds <- DESeqDataSetFromTximport(txi, colData = samples, design = ~ condition ) dds$condition <- factor(dds$condition, levels = c("mock.2hr", "infected.2hr", "mock.4hr", "infected.4hr", "mock.6hr", "infected.6hr", "mock.8hr", "infected.8hr"))

#get rid of low count data keep <- rowSums(counts(dds)) >= 10 dds <- dds[keep,]

#perform the dfe analysis based on negative binomial model and get the results dds <- DESeq(dds) res <- results(dds)

resultsNames(dds)

results(dds, contrast=c("condition","mock.2hr","infected.2hr"))

However, when I compare results from the 1st set of results two the second set of results, I was expecting the same results, but they are slightly different

"transcript","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"

"tx1",175.15081857585,-0.0714999820256329,0.128181252816721,-0.557803738491045,0.576978409578714,0.807919847826457

"tx1",183.390698,0.13396380,0.1621473,0.82618589,0.4086986,0.6802349

I feel like I'm misunderstanding how contrasts work. Can someone help me clear up my confusion?

deseq2 extracting contrasts rnaseq interaction term • 7.8k views

ADD COMMENT • link updated 6.7 years ago by swbarnes2 ★ 1.4k • written 6.8 years ago by Amit Indap ▴ 10

score 1 · Answer 1 · 2018-07-02

I don't think you should expect those two methods to return the same answer...though they both return that the fold change, if any, is very small, and the p-value is so bad, you would conclude in both cases that there is no significant difference.

At the very least, the first way, all samples are being normalized based on whatever gene has the median expression between those 6 samples. The second way, the samples are being normalized by whatever gene is the median expression of all the samples. That's probably a different gene, leading to different normalization, and that might be slightly shifting the normalized counts around to turn a tiny change one way into a tiny change the other way.

score 0 · Answer 2 · 2018-06-28

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 8 days ago

United States

Can you look up the FAQ in the vignette? There is one explaining how you can obtain different results from including all groups or just two.

ADD COMMENT • link 6.8 years ago Michael Love 43k

0

Entering edit mode

Thanks for your reply, Michael. I forgot to check the FAQ. I read If I have multiple groups, should I run all together or split into pairs of groups? Is it because the dispersion parameter is different, then?