Question: DESeq2 vs limma/voom: PCA completely different. How to proceed?
4.2 years ago by
Stephen Turner • 260
Stephen Turner • 260 wrote:
Short story: I ran a PCA on a matrix of counts from an RNA-seq experiment and then using the voom-tramsformed data, and they're completely different. I imagine my results will be very different using DESeq2 vs limma, and I am wondering how to proceed. Here are the two PCA biplots: http://imgur.com/a/DBRfE More background: I have an RNA-seq experiment with 5 conditions: WT (n=4), and mutants A, B, C, and D (all n=3). I aligned the reads with STAR, counted reads mapping to genes using HTSeq-count. I imported the count data into DESeq2 and processed using the functions described in the vignette, DESeqDataSetFromHTSeqCount() and DESeq(). I performed a PCA on the transposed normalized counts table from the DESeq Data Set (dds) object (note the outlier, which is no longer outlying in the voom PCA plot below): pca <- as.data.frame(prcomp(t(na.omit(counts(dds, normalized=TRUE))))$x) col <- 1:5 groups <- colData(dds)$condition plot(PC2~PC1, data=pca, bg=col[groups], pch=21, main="DESeq2") Here's that PCA biplot: http://i.imgur.com/P5hhE9Q.png I then used limma/voom to transform the data, and performed another PCA, which looked completely different. design <- model.matrix(~0+colData(dds)$condition) v <- voom(counts=counts(dds, normalized=TRUE), design=design) pca <- as.data.frame(prcomp(t(na.omit(v$E)))$x) pca(PC2~PC1, data=pca, bp=col[groups], pch=21, main="voom") Here's that PCA (notice the outlier C1 is no longer outlying): http://i.imgur.com/oKlnWh3.png Based on this PCA I imagine my results will look very different using DESeq2 or limma on the voom'd data. I understand that this is due to the PCA on the voom transformation is only considering the normalized log2 expression values, and not considering the inverse variance weights. Now I'm just not sure how to proceed with downstream analysis (or quality control with the voom transformation, in general). Any guidance is highly appreciated. Thanks, Stephen ... > sessionInfo() R version 3.0.0 (2013-04-03) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale:  en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages:  parallel stats graphics grDevices utils datasets methods base other attached packages:  limma_3.16.1 calibrate_1.7.1 MASS_7.3-26  DESeq2_1.0.8 RcppArmadillo_0.3.800.1 Rcpp_0.10.3  lattice_0.20-15 Biobase_2.20.0 GenomicRanges_1.12.2  IRanges_1.18.0 BiocGenerics_0.6.0 BiocInstaller_1.10.0 loaded via a namespace (and not attached):  annotate_1.38.0 AnnotationDbi_1.22.2 DBI_0.2-5  genefilter_1.42.0 grid_3.0.0 locfit_1.5-9  RColorBrewer_1.0-5 RSQLite_0.11.3 splines_3.0.0  stats4_3.0.0 survival_2.37-4 tools_3.0.0  XML_3.95-0.2 xtable_1.7-1 -------------- next part -------------- A non-text attachment was scrubbed... Name: bioc-pca-deseq.png Type: image/png Size: 31145 bytes Desc: not available URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20130423="" 99cf6269="" attachment.png=""> -------------- next part -------------- A non-text attachment was scrubbed... Name: bioc-pca-voom.png Type: image/png Size: 30142 bytes Desc: not available URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20130423="" 99cf6269="" attachment-0001.png="">
ADD COMMENT • link •