Question: Quantiles from virtualArray example without batch effect removal are all the same
4.6 years ago, Daniel Emden wrote:
Hi,

for my diploma thesis I need to merge several microarray datasets from different platforms. I came across the virtualArray Bioconductor package, which claims to do exactly what I need. The example from the package vignette runs just fine. I computed the virtual array without batch effect removal as follows, same as in the paper: http://www.biomedcentral.com/1471-2105/14/75

library(GEOquery)  # provides getGEO()

# get sample data from the paper
GSE23402 <- getGEO("GSE23402")
GSE26428 <- getGEO("GSE26428")
GSE28688 <- getGEO("GSE28688")

# extract ExpSets and reduce data
GSE23402 <- GSE23402[[1]][,1:24]
GSE26428 <- GSE26428[[1]]
GSE28688 <- GSE28688[[1]]

# merge via virtualArray
library(virtualArray)
virtArrays <- list()
virtArrays[["wBatchEffects"]] <- virtualArrayExpressionSets(all_expression_sets=c('GSE23402', 'GSE26428', 'GSE28688'), removeBatcheffect=FALSE)

# get ExpSet
virtArray <- virtArrays[["wBatchEffects"]]

# quantiles before merge
quantile(exprs(GSE23402))
quantile(exprs(GSE26428))
quantile(exprs(GSE28688))

# quantiles after merge
ind1 <- which(pData(virtArray)$Batch=="GSE23402")
ind2 <- which(pData(virtArray)$Batch=="GSE26428")
ind3 <- which(pData(virtArray)$Batch=="GSE28688")
quantile(exprs(virtArray)[,ind1[1:3]])
quantile(exprs(virtArray)[,ind2[1:3]])
quantile(exprs(virtArray)[,ind3[1:3]])

# output
# before merge
> quantile(exprs(GSE23402))
       0%       25%       50%       75%      100%
 3.330558  4.518535  5.883376  8.140574 14.777982
> quantile(exprs(GSE26428))
        0%        25%        50%        75%       100%
 0.8432676  2.4635710  5.6495232  8.1862528 14.8623320
> quantile(exprs(GSE28688))
       0%       25%       50%       75%      100%
 4.821043  5.613557  5.935004  7.473984 15.387470

# after merge
> quantile(exprs(virtArray)[,ind1])
       0%       25%       50%       75%      100%
 3.744348  5.217698  6.638584  8.421955 14.758915
> quantile(exprs(virtArray)[,ind2])
       0%       25%       50%       75%      100%
 3.744348  5.217698  6.638584  8.421955 14.758915
> quantile(exprs(virtArray)[,ind3])
       0%       25%       50%       75%      100%
 3.744348  5.217698  6.638584  8.421955 14.758915

As you can see, the quantiles differ considerably between the datasets before the merge. After the merge they are all the same. Is that correct? Where is my mistake? To me this looks very strange. I get the same result with batch effect removal. The values in exprs(virtArray) are very different, so how is it possible that the quantiles/boxplots are the same? As far as I know, the data from getGEO are already normalized. Is the virtualArrayExpressionSets function performing a second normalization?

Thanks,
Daniel Emden

--
> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] hgug4112a.db_2.9.0    hgu133plus2.db_2.9.0  org.Hs.eg.db_2.9.0    RSQLite_0.11.2        DBI_0.2-5
 [6] AnnotationDbi_1.22.1  BiocParallel_0.2.0    virtualArray_1.4.0    preprocessCore_1.22.0 plyr_1.8
[11] GEOquery_2.26.1       Biobase_2.20.0        BiocGenerics_0.6.0

loaded via a namespace (and not attached):
 [1] affy_1.38.0          affyio_1.28.0        affyPLM_1.36.0       BiocInstaller_1.10.0 Biostrings_2.28.0
 [6] codetools_0.2-8      foreach_1.4.0        gcrma_2.32.0         grid_3.0.0           IRanges_1.18.0
[11] iterators_1.0.6      lattice_0.20-15      outliers_0.14        quadprog_1.5-4       RCurl_1.95-4.1
[16] reshape2_1.2.2       splines_3.0.0        stats4_3.0.0         stringr_0.6.2        tools_3.0.0
[21] tseries_0.10-30      XML_3.96-1.1         zlibbioc_1.6.0       zoo_1.7-10
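
Note on the symptom described above: identical quantiles across samples whose individual values still differ is what a quantile normalization step produces. Whether virtualArrayExpressionSets applies one is an assumption here, not something the post confirms. The sketch below uses made-up random data and a hypothetical helper, quantile_normalize(), written in base R only to illustrate the effect; it is not the package's code.

```r
# Illustration only: simulated data, not the GEO datasets from the post.
set.seed(42)
raw <- cbind(
  A = rnorm(1000, mean = 6, sd = 2),
  B = rnorm(1000, mean = 5, sd = 3),
  C = rexp(1000) + 5
)

# Hypothetical helper (an assumption for this sketch, not a virtualArray function).
# Each column is mapped onto a shared target distribution: the mean of the
# sorted columns, assigned back according to each value's rank in its column.
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")  # rank of each value per column
  target <- rowMeans(apply(m, 2, sort))                # shared reference distribution
  out    <- apply(ranks, 2, function(r) target[r])     # replace values by target values
  dimnames(out) <- dimnames(m)
  out
}

normalized <- quantile_normalize(raw)

# Per-column quantiles before and after
print(apply(raw, 2, quantile))
print(apply(normalized, 2, quantile))

# Individual values still differ between columns
print(head(normalized))
```

After this kind of normalization every column contains the same set of values in a different order, so quantile() and boxplots agree across columns even though the entries in any given row differ. Comparing the merged matrix against the per-dataset input in the same way would be one way to check whether this explains the output in the post.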