Question

Filtering extraneous samples affects DESeq results. What is going on?


sxv • 0

@sxv-14831

Last seen 5.7 years ago

This has probably been addressed before, but I am wondering why I am seeing different results for contrasting the same groups of samples.

Let's say I have 30 samples, sequenced in two batches, with 10 samples in each diagnosis group: normal, tumor, and invasive. The objective is to obtain a list of genes that are differentially expressed between the 'normal' and 'tumor' groups.

Case 1: Run DESeq on the whole dataset and contrast diagnosis groups of interest:
dds <- DESeqDataSetFromMatrix(countData = reads, colData = design, design = ~ batch + diagnosis)
dds <- DESeq(dds, parallel=TRUE)
res <- results(dds, contrast=c("diagnosis", "normal", "tumor"))

Case 2: Subset the sample list before running DESeq then get contrast:
# Drop the 'invasive' samples from both the count matrix and the sample table
readssubset <- reads[, design$diagnosis != "invasive"]
design_subset <- design[design$diagnosis != "invasive", ]
dds <- DESeqDataSetFromMatrix(countData = readssubset, colData = design_subset, design = ~ batch + diagnosis)
dds <- DESeq(dds, parallel=TRUE)
res <- results(dds, contrast=c("diagnosis", "normal", "tumor"))

I thought the two approaches would give the same results, but they don't. Can anyone help explain the difference?

Thanks,

deseq deseq2 • 1.1k views

ADD COMMENT • link updated 5.7 years ago by Michael Love 43k • written 5.7 years ago by sxv • 0

score 1 · Answer 1 · 2019-07-31


Michael Love 43k

@mikelove

Last seen 5 days ago

United States

This is in fact asked a lot, so much so that it is a Frequently Asked Question (FAQ) addressed in the DESeq2 vignette.
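For readers who do not have the vignette open: the short explanation is that DESeq2 estimates size factors and a single dispersion per gene from all samples in the object, so adding or removing the 'invasive' samples changes those estimates even for a normal vs tumor contrast. The sketch below illustrates this on simulated counts from `makeExampleDESeqDataSet`. It is not the poster's data: the `diagnosis` labels are assigned arbitrarily and the batch term is left out to keep the example small.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 500 genes x 18 samples (hypothetical data)
dds <- makeExampleDESeqDataSet(n = 500, m = 18)
# Assumption: three arbitrary groups of 6 samples, standing in for the diagnoses
dds$diagnosis <- factor(rep(c("normal", "tumor", "invasive"), each = 6))
design(dds) <- ~ diagnosis

# Case 1: fit the model on all three groups
dds_all <- DESeq(dds, quiet = TRUE)

# Case 2: drop the 'invasive' samples first, then fit
dds_sub <- dds[, dds$diagnosis != "invasive"]
dds_sub$diagnosis <- droplevels(dds_sub$diagnosis)
dds_sub <- DESeq(dds_sub, quiet = TRUE)

# Size factors for the same 12 samples under each fit
keep <- colnames(dds_sub)
sf_all <- sizeFactors(dds_all)[keep]
sf_sub <- sizeFactors(dds_sub)
print(cbind(all = sf_all, subset = sf_sub))

# Per-gene dispersion estimates under each fit
disp_diff <- dispersions(dds_all) - dispersions(dds_sub)
print(summary(disp_diff))

# The same contrast therefore gives different p-values
res_all <- results(dds_all, contrast = c("diagnosis", "normal", "tumor"))
res_sub <- results(dds_sub, contrast = c("diagnosis", "normal", "tumor"))
print(head(cbind(pvalue_all = res_all$pvalue, pvalue_sub = res_sub$pvalue)))
```

The differences are usually modest when all groups have similar within-group variability, and larger when the extra group is much more variable than the two being compared.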

ADD COMMENT • link 5.7 years ago Michael Love 43k


Sure is. And now I am remembering that I read it before. Thanks again for your patience and light speed.

ADD REPLY • link 5.7 years ago sxv • 0