Question

DESeq2: Including known contaminant in the design model does not change PCA plot

nilssonp386 • 0

@nilssonp386-14846

I have a data set of 51 samples across 4 conditions, and I want to visualise the similarity between the groups. I have identified a known blood contamination that affects 7 of the samples, and have added a column named "contamination" to the sample data, with the labels "yes" or "no".

However, when I include this term in the design formula, the PCA plot does not change. It looks the same as without the term, and the 7 contaminated samples are still outliers relative to the other samples of the same condition.

Code:

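library(DESeq2)
library(ggplot2)

# Build the dataset with contamination as a covariate alongside condition
d.deseq <- DESeqDataSetFromMatrix(countData = raw_counts,
                                  colData = sample_data,
                                  design = ~ contamination + condition)

# Variance-stabilizing transformation; blind = FALSE lets the design inform dispersion estimation
vsd <- vst(d.deseq, blind = FALSE)

# PCA coordinates and percent variance explained per component
pcaData <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))

p <- ggplot(pcaData, ...)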

Thanks a lot for any help with troubleshooting, or other suggestions for how to deal with the contamination.

deseq2 • 676 views

Updated 5.3 years ago by Michael Love • written 5.3 years ago by nilssonp386

score 0 · Answer 1 · 2019-01-23

Michael Love 41k

@mikelove

This is a frequently asked question, answered in the DESeq2 vignette:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-after-vst-are-there-still-batches-in-the-pca-plot

Thank you, and sorry I missed that. Does this take the treatment condition into account when correcting the batch effect, or does the design have to be provided to the `removeBatchEffect` function? I've looked into the limma documentation, but its examples start from different input objects.

Reply by nilssonp386, 5.3 years ago

You would just provide the batch variable to that function, not the full design.

Reply by Michael Love, 5.3 years ago
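
For readers following along, here is a minimal sketch of the approach discussed in this thread: applying limma's `removeBatchEffect` to the variance-stabilized values before drawing the PCA, with only the contamination variable supplied as the batch. It is not the poster's actual analysis. The simulated dataset from `makeExampleDESeqDataSet` and the invented `contamination` labels are stand-ins for `raw_counts` and `sample_data`, and `vsd_corrected` is a hypothetical name chosen for illustration.

```r
library(DESeq2)
library(limma)

# Stand-in data (assumption): 12 simulated samples in two conditions,
# replacing the poster's raw_counts / sample_data.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Invented contamination labels (assumption), not confounded with condition
dds$contamination <- factor(rep(c("no", "no", "yes"), times = 4))
design(dds) <- ~ contamination + condition

# The transformation uses the design for dispersion estimation only;
# it does not remove the contamination signal from the values.
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)

# Remove the contamination effect from the transformed values,
# passing only the nuisance variable as `batch`.
vsd_corrected <- vsd
assay(vsd_corrected) <- limma::removeBatchEffect(
  assay(vsd),
  batch = vsd$contamination
)

# PCA on the corrected values, coloured by condition as in the question
pcaData <- plotPCA(vsd_corrected, intgroup = "condition", returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
```

The corrected matrix is meant for visualization. Any differential expression testing would still be run on the raw counts, with `contamination` kept in the design. limma's `removeBatchEffect` also has a `design` argument for conditions to be preserved; the sketch leaves it out, in line with the reply above.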