Question

Apply the plotPCA function in DESeq2 package to quantitative MS-based data

0

an.anand233 ▴ 30

@ananand233-9844

Last seen 7.8 years ago

Hi guys,

Has anyone succeeded in applying the plotPCA function to group samples from mass spectrometry (MS)-based data, such as quantitative proteomics? I like the hierarchical clustering and PCA plotting functions implemented in DESeq2, but their inputs appear to be DESeqTransform objects, which are produced by transforming count-based data. Can someone suggest how to apply these functions to other types of data? Thanks.

All the best,

Zhu

deseq2 • 2.4k views

ADD COMMENT • link updated 7.9 years ago by chris86 ▴ 420 • written 7.9 years ago by an.anand233 ▴ 30

score 1 · Accepted Answer · 2016-05-31

1

chris86 ▴ 420

@chris86-8408

Last seen 4.4 years ago

UCL, United Kingdom

Why bother using the DESeq2 functions? It is better to use the prcomp function in R together with ggplot2 for your PCA plot, as it is more customisable (see the example below). Likewise, for clustering you don't have to use DESeq2: read the NMF aheatmap manual, for example, or use one of the other options. These are easier to work with than the built-in DESeq2 functions, which are wrapped up in so much other code that they are harder to customise.

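library(ggplot2)       # plotting
library(RColorBrewer)  # brewer.pal() colour palettes

# Objects assumed to exist already:
#   data2   numeric matrix, samples in rows and features (e.g. proteins) in columns
#           (use t(data2) if your samples are in columns)
#   names   vector of group labels, one per sample
#   des4$ID sample identifiers, one per sample
pca1 <- prcomp(data2)
scores <- data.frame(pca1$x)
scores$type <- factor(names)
row.names(scores) <- des4$ID

# One colour per group, drawn in random order from Set3 (assumes at most 9 groups)
myColors <- sample(brewer.pal(9, "Set3"))[seq_along(levels(scores$type))]
names(myColors) <- levels(scores$type)
colScale <- scale_colour_manual(name = "type", values = myColors)

ggplot(data = scores, aes(x = PC1, y = PC2, colour = type, label = rownames(scores))) +
  geom_point(size = 5) +
  theme_bw() +
  theme(axis.title = element_text(size = 14, face = "bold"),
        axis.text = element_text(size = 14),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        legend.title = element_blank(),
        legend.text = element_text(size = 14)) +
  colScale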
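
For the clustering route mentioned above, a minimal sketch using only base R might look like the following. The matrix `mat` is simulated stand-in data (features in rows, samples in columns), and the choice of Euclidean distance with complete linkage is an assumption for illustration, not a recommendation from this thread.

```r
# Simulated stand-in for a log-transformed intensity matrix:
# 100 features (rows) x 6 samples (columns). Purely illustrative data.
set.seed(1)
mat <- matrix(rnorm(100 * 6), nrow = 100,
              dimnames = list(NULL, paste0("s", 1:6)))
# Shift the first 30 features in samples s4-s6 to mimic a group effect
mat[1:30, 4:6] <- mat[1:30, 4:6] + 3

# dist() computes distances between rows, so transpose to compare samples
sampleDists <- dist(t(mat))

# Hierarchical clustering of the samples and the resulting dendrogram
hc <- hclust(sampleDists, method = "complete")
plot(hc, main = "Sample clustering", xlab = "", sub = "")

# Heatmap of the sample-to-sample distance matrix
heatmap(as.matrix(sampleDists), symm = TRUE)

# Cut the tree into two groups to get a cluster label per sample
clusters <- cutree(hc, k = 2)
```

With real data, `mat` would be replaced by your own transformed intensity matrix, and `clusters` can be compared against the known sample groups.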

ADD COMMENT • link 7.9 years ago chris86 ▴ 420

2

The source code for plotPCA is simple and easy to customize; we say as much in the help page ?plotPCA. It is commented to explain what we are doing in each step.

Here's the source:

https://github.com/Bioconductor-mirror/DESeq2/blob/master/R/plots.R#L162-L201

ADD REPLY • link 7.9 years ago Michael Love 41k

0

That's just a more complicated version of what I posted. It is better to go to the original packages yourself and do it that way. That is just my view anyway.

ADD REPLY • link 7.9 years ago chris86 ▴ 420

1

Fair enough. That's just like, your opinion, man :)

I'll just say that selecting rows by highest variance makes a big difference: it helps to "bring into focus" the sample clusters.

Also, I find that annotating the axes with percent variance is useful in assessing what is being shown. It answers the question: does the (PC1, PC2) projection show basically all of the sample-to-sample variability, or does it show very little of the total variance because the scree plot is fairly flat?
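
The two points above (keeping only the most variable rows, and labelling the axes with percent variance) can be sketched in base R without any DESeq2 objects. This is an illustration under stated assumptions, not the plotPCA source itself: `mat` is simulated data and the cut-off `ntop <- 50` is an arbitrary value chosen for this example.

```r
# Simulated stand-in for a log-transformed intensity matrix:
# 200 features (rows) x 6 samples (columns). Purely illustrative data.
set.seed(1)
mat <- matrix(rnorm(200 * 6), nrow = 200,
              dimnames = list(paste0("prot", 1:200), paste0("s", 1:6)))
# Shift the first 20 features in samples s4-s6 to mimic a group effect
mat[1:20, 4:6] <- mat[1:20, 4:6] + 3

ntop <- 50  # arbitrary cut-off for this sketch

# Rank features by their variance across samples and keep the top 'ntop'
rv <- apply(mat, 1, var)
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# prcomp() expects samples in rows, hence the transpose
pca <- prcomp(t(mat[select, ]))

# Proportion of the total variance explained by each component
percentVar <- pca$sdev^2 / sum(pca$sdev^2)

# PC1 vs PC2 with percent variance on the axis labels
plot(pca$x[, 1], pca$x[, 2], pch = 19,
     xlab = sprintf("PC1: %.0f%% variance", 100 * percentVar[1]),
     ylab = sprintf("PC2: %.0f%% variance", 100 * percentVar[2]))

# Scree plot: a flat profile means PC1 and PC2 show little of the total variance
barplot(percentVar, names.arg = paste0("PC", seq_along(percentVar)),
        ylab = "Proportion of variance")
```

Note that `percentVar` here is relative to the selected rows only, so it describes the variance among the `ntop` most variable features rather than the full matrix.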

ADD REPLY • link 7.9 years ago Michael Love 41k