DESeq2: Including known contaminant in the design model does not change PCA plot
@nilssonp386-14846

I have a data set of 51 samples across 4 conditions, and I want to visualise the similarity between the groups. I have identified a known blood contamination that affects 7 of the samples, and have added a column named "contamination" with the labels "yes" or "no".

However, when I include this term in the design, the PCA plot does not change. It looks the same as without the term, and the 7 samples are still outliers relative to the other samples of the same condition.

Code:

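library(DESeq2)
library(ggplot2)

# Build the dataset with contamination as a covariate in the design
d.deseq <- DESeqDataSetFromMatrix(countData = raw_counts,
                                  colData = sample_data,
                                  design = ~ contamination + condition)

# Variance-stabilising transformation; blind=FALSE uses the design when estimating dispersions
vsd <- vst(d.deseq, blind=FALSE)

# PCA coordinates and percent variance explained per component
pcaData <- plotPCA(vsd, intgroup=c("condition"), returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))

p <- ggplot(pcaData, ...)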

Thanks a lot for any help with troubleshooting and/or other suggestions on how to deal with the contamination.

deseq2 • 676 views

Thank you, sorry I missed that. Does this take the treatment condition into account when correcting the batch effect, or does the design have to be provided to the `removeBatchEffect` function? I've looked into the limma documentation, but it starts from other input objects.


You would just provide the batch variable to that function, not the full design.
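
For readers following along, here is a minimal sketch of what that could look like. It assumes the contamination column plays the role of the batch variable. Since the original `raw_counts` and `sample_data` are not available, it uses simulated counts from `makeExampleDESeqDataSet` as a stand-in, and the object name `vsd_corrected` and the yes/no assignment are hypothetical. With `blind=FALSE` the design only informs dispersion estimation in `vst`, so the contamination signal stays in the transformed values until it is removed explicitly.

```r
library(DESeq2)
library(limma)

# Simulated stand-in for the real data (assumption): 12 samples, conditions A and B
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical contamination labels, not confounded with condition
dds$contamination <- factor(rep(c("yes", "yes", "no", "no", "no", "no"), 2))
design(dds) <- ~ contamination + condition

# Transform as in the question
vsd <- vst(dds, blind = FALSE)

# Work on a copy so the uncorrected values remain available for comparison
vsd_corrected <- vsd

# Pass only the batch variable, as suggested in the reply above
assay(vsd_corrected) <- limma::removeBatchEffect(assay(vsd),
                                                 batch = vsd$contamination)

# PCA on the corrected values, coloured by condition
pcaData <- plotPCA(vsd_corrected, intgroup = "condition", returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
```

The corrected matrix is meant for visualisation only; the counts and the design used for differential testing stay unchanged. limma's `removeBatchEffect` also has its own `design` argument for describing conditions to preserve, which is a model matrix and not the DESeq2 design formula; whether it is needed here depends on how balanced contamination is across conditions.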
