Question: DESeq2: Including known contaminant in the design model does not change PCA plot
25 days ago by
nilssonp3860 wrote:

I have a data set of 51 samples over 4 different conditions, and I want to visualise the similarity between the groups. I have already identified a known blood contamination which affects 7 of the samples, and have added a column named "contamination", with the labels "yes" or "no".

However, including this term in the design formula does not change the PCA plot. It looks the same as without the term, and the 7 contaminated samples remain outliers relative to the other samples of the same condition.

Code:

d.deseq <- DESeqDataSetFromMatrix(countData = raw_counts,
                                  colData = sample_data,
                                  design = ~ contamination + condition)

vsd <- vst(d.deseq, blind=FALSE)

pcaData <- plotPCA(vsd, intgroup=c("condition"), returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))



Thanks a lot for any help with troubleshooting, or other suggestions on how to deal with the contamination.

Answer: DESeq2: Including known contaminant in the design model does not change PCA plot
25 days ago by
Michael Love21k wrote:

Thank you, sorry I missed that. Does this take the treatment condition into account when correcting for the batch effect, or does the design have to be passed to the `removeBatchEffect` function? I've looked into the limma documentation, but the examples there start from other input objects.
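
For reference, here is a minimal sketch of how the design can be passed to `limma::removeBatchEffect`, following the pattern shown in the DESeq2 vignette. As far as I understand it, `vst()` with `blind=FALSE` only uses the design when estimating dispersions and does not remove the contamination effect from the transformed values, which would explain the unchanged PCA. `removeBatchEffect` does not know about the treatment condition by default (its `design` argument defaults to an intercept only), so the condition has to be supplied as a design matrix if it should be preserved. The simulated data set and the name `vsd_corrected` are assumptions made only so the example runs on its own; with real data, start from the existing `vsd` object.

```r
library(DESeq2)
library(limma)

# Hypothetical stand-in data so the sketch runs on its own;
# with real data, use the `vsd` object created from your own counts.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)
dds$contamination <- factor(rep(c("no", "yes"), times = 6))
design(dds) <- ~ contamination + condition
vsd <- vst(dds, blind = FALSE)

# Work on a copy so the uncorrected values stay available.
vsd_corrected <- vsd
mat <- assay(vsd_corrected)

# Design matrix for the effect to preserve (condition only, no contamination).
mm <- model.matrix(~ condition, data = as.data.frame(colData(vsd_corrected)))

# Remove the contamination effect; `design` tells limma what to keep.
assay(vsd_corrected) <- limma::removeBatchEffect(
  mat,
  batch  = vsd_corrected$contamination,
  design = mm
)

# Same PCA call as in the question, now on the corrected values.
pcaData <- plotPCA(vsd_corrected, intgroup = "condition", returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
```

The corrected matrix is meant for visualisation only. For differential expression testing, the raw counts with `design = ~ contamination + condition` would still be the input to `DESeq()`.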