Filtering extraneous samples affects DESeq results. What is going on?
sxv • 0
@sxv-14831

This has probably been addressed before, but I am wondering why I am seeing different results for contrasting the same groups of samples.

Let's say I have 30 samples, sequenced in two batches, with 10 samples in each diagnosis group: normal, tumor, and invasive. The objective is to obtain a list of genes that are differentially expressed between the 'normal' and 'tumor' groups.

Case 1: Run DESeq on the whole dataset and contrast diagnosis groups of interest:
dds <- DESeqDataSetFromMatrix(countData = reads, colData = design, design = ~ batch + diagnosis)
dds <- DESeq(dds, parallel=TRUE)
res <- results(dds, contrast=c("diagnosis", "normal", "tumor"))

Case 2: Subset the sample list before running DESeq then get contrast:
reads_subset <- reads[, which(design$diagnosis != 'invasive')]
design_subset <- design[design[, "diagnosis"] != 'invasive', ]
dds <- DESeqDataSetFromMatrix(countData = reads_subset, colData = design_subset, design = ~ batch + diagnosis)
dds <- DESeq(dds, parallel=TRUE)
res <- results(dds, contrast=c("diagnosis", "normal", "tumor"))

I thought this would lead to the same results, but it doesn't. Can anyone help explain what the difference is here?

Thanks,

deseq deseq2 • 886 views
@mikelove

This is in fact asked often enough that it is covered in the Frequently Asked Questions (FAQ) section of the vignette.
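As a rough illustration of why the two cases can differ: DESeq2 estimates one dispersion per gene from all samples in the object, so the 'invasive' samples contribute to the variability estimate used for the normal vs tumor contrast in Case 1 but not in Case 2. The sketch below is only an analogy in base R, not DESeq2 itself: it uses a pooled within-group variance in place of a negative binomial dispersion, and the numbers and the `pooled_var` helper are made up for illustration.

```r
# Toy analogy (base R only, NOT DESeq2): a pooled within-group variance
# changes when a third group is included in the estimate.
# All values below are hypothetical, chosen so the arithmetic is easy to check.
normal   <- c(4, 5, 6, 5, 5)
tumor    <- c(7, 8, 9, 8, 8)
invasive <- c(2, 14, 8, 20, 6)  # assumed to be a much more variable group

# Hypothetical helper: pooled variance = summed within-group squared
# deviations divided by (total samples - number of groups).
pooled_var <- function(...) {
  groups <- list(...)
  ss <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))
  df <- sum(sapply(groups, length)) - length(groups)
  ss / df
}

v_all <- pooled_var(normal, tumor, invasive)  # analogous to Case 1
v_sub <- pooled_var(normal, tumor)            # analogous to Case 2

print(c(all_groups = v_all, subset_only = v_sub))
```

Here the noisy third group inflates the pooled estimate, which would make a normal vs tumor test less sensitive. When all groups have similar within-group variability, fitting them together generally gives a more precise estimate instead, so which approach is preferable depends on the data.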

Sure is. And now I am remembering that I read it before. Thanks again for your patience and light speed.
