Differing results from plotMDS and PCAtools
1
0
Entering edit mode
Fabian • 0
@25565e27
Last seen 8 months ago
Austria

Hi, I am trying out some packages for PCA in combination with the reads-to-genes edgeR-workflow for my RNA-seq analysis. I used the directions given from this post: Trying to do a PCA using edgeR with expression data

Group is equivalent to the Biomarkerquartiles_Q variable.

When I use the plotMDS function with the following code:

keep <- filterByExpr(y, group = y$samples$group)

y <- y[keep, , 
       keep.lib.sizes = F]

y <- calcNormFactors(y)

pch <- c(0,1,2,15,16,17)
colors <- rep(c("darkgreen", "red", "blue"), 2)
plotMDS(y, col=colors[factor(targets$Biomarkerquartiles_Q)],
        pch=pch[factor(targets$Biomarkerquartiles_Q)],
        gene.selection="common")

legend("topleft", legend=levels(factor(targets$Biomarkerquartiles)),
       pch=pch, col=colors, ncol=2)

I get this plot:

plotMDS_Output

And although there is no clear clustering of the quartiles, it is pretty clear that there is some clustering going on.

When I use the PCAtools package with the following code:

p <- pca(cpm(y, log = T),
         metadata = y$samples,
         removeVar = 0.1)
PCA.plot <- biplot(p,
    lab = paste0(p$metadata$group),
    colby = "group"
)

I get the following plot: PCAtools

I know both packages use different approaches but is it expected that the result can differ so much? Intriguingly, the x-axis and y-axis descriptions of both approaches seem to report comparable amounts of variations explained by the PCs.

Additionally, where could I begin to learn how to interpret the results from the plotMDS output?

edgeR PCAtools • 1.2k views
ADD COMMENT
0
Entering edit mode

I looked into some other variables and noticed that sex is probably the root cause of this batch effect:

pch <- c(0,1,2,15,16,17)colors <- rep(c(#"darkgreen",
                "red", "blue"), 2)
plotMDS(y, col=colors[factor(targets$Male)],
        pch=pch[factor(targets$Male)],
        gene.selection="common")

legend("topleft", legend=levels(factor(targets$Male, labels = c("Female", "Male"))),
       pch=pch, col=colors, ncol=2)

.enter image description here

Would it be enough to include it into my design matrix of the experiment and adjust for it or do I have to use limma::removeBatchEffect?

ADD REPLY
1
Entering edit mode

The batch effect is not perfectly aligned with sex, with 3 samples not agreeing. I suggest you investigate the two male samples in the top right and the one female in the lower left. Either these samples are annotated incorrectly, or the batch effect has a different cause that is confounded with gender.

Yes, batch effects are always included in the design matrix. removeBatchEffect is only for graphical exploration (as the help page for removeBatchEffect will tell you).

ADD REPLY
2
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

The reason why the two plots are different is that the first plot is a PCA plot while the second is a biplot. Quite different plots.

The PCA plot from edgeR plotMDS is showing very clearly that your data has a massive batch effect, which is something that you need to investigate. The biplot doesn't seem to be showing any useful information.

Any batch effect should be included in the design matrix.

The results from plotMDS seem pretty clear. You can see examples of MDS plots and their interpretation in the limma and edgeR case studies. As the documentation explains, distances on the MDS plot correspond to log2 leading fold-changes.

ADD COMMENT
0
Entering edit mode

I really like the visualizations and functionality of the PCAtools package.

I looked through this vignette PCAtools: everything Principal Component Analysis and could only find the biplot function for my purpose, which as you said is not the same. Is there some kind of modification of the biplot function to get a PCA plot?

ADD REPLY
0
Entering edit mode

Sorry, I can't help you with the PCAtools package.

ADD REPLY

Login before adding your answer.

Traffic: 457 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6