Error in RankProd package
1
0
Entering edit mode
Marcus Davy ▴ 680
@marcus-davy-374
Last seen 7.4 years ago
Error in RankProd package, specifically related to the RP and topGene functions and the way RP, handles missing values for an entire row of observations for a gene. Filtering out rows of missing values will get around this. The help(RP) documentation provides the following information about NA handling; na.rm: if 'FALSE' (default), the NA value will not be used in computing rank. If 'TRUE', the missing values will be replaced by the gene-wise mean of the non-missing values. Gene with all values missing will be assigned "NA" So, in the case where an entire row of observations for a gene are NAs the rank product should be assigned NAs. To illustrate the errors I have generated a 10 by 4 matrix from a normal distribution with two rows of missing values; library(RankProd) set.seed(1) nrows <- 10 ncols <- 4 X <- round(matrix(rnorm(nrows * ncols), nc=ncols),2) X[2:3,] <- NA X RP.out <- RP(X, cl=rep(1,ncols), na.rm=TRUE) RP.out$RPrank RP.out$RPrank[2:3,] # I think the missing values should be ranked at the end # in one case and at the beginning in the other. This # can be controlled in the function 'rank' with the # na.last argument RP.out$Orirank[["class1 > class 2"]] # Additional space in list name RankProd1(X, logged=TRUE, num.class=1, rev.sorting=TRUE)$rank.all RP.out$Orirank[["class1 < class2"]] RankProd1(X, logged=TRUE, num.class=1, rev.sorting=FALSE)$rank.all RP.out$pval tG <- topGene(RP.out, cutoff = 0.1, method="pval", logged=FALSE, logbase=2, gene.names=rownames(X)) # cutoff=0.1 for illustration # First list from topGene is incorrect, filters genes with p-values > 0.1 >TG$Table1 # Outputs: gene.index RP/Rsum FC:(class1/class2) pfp P.value 4 4 2.5149 -0.6625 0.2025 0.0405 2 2 NA NaN 1.1089 0.9980 5 5 3.7417 0.1725 0.5050 0.2020 3 3 NA NaN 0.9980 0.9980 6 6 2.1147 -0.3325 0.1700 0.0170 # List ok TG$Table2 # A row of missing values should not be highly ranked in the first three genes >topGene(RP.out, logged=FALSE, num.gene=3, logbase=2, gene.names=rownames(X)) # Outputs; Table1: Genes called significant under class1 < class2 Table2: Genes called significant under class1 > class2$Table1 gene.index RP/Rsum FC:(class1/class2) pfp P.value 4 4 2.5149 -0.6625 0.2025 0.0405 2 2 NA NaN 1.1089 0.9980 5 5 3.7417 0.1725 0.5050 0.2020 \$Table2 gene.index RP/Rsum FC:(class1/class2) pfp P.value 1 1 1.6266 0.79 0.035 0.0035 7 7 5.1800 -0.02 0.555 0.3885 3 3 NA NaN 0.998 0.9980 > sessionInfo() R version 2.4.1 (2006-12-18) powerpc-apple-darwin8.8.0 locale: C attached base packages: [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" [7] "base" other attached packages: limma RankProd "2.9.8" "2.6.0" This was also tested on the BioC 2.0 development release of RankProd 2.7.0 and gives the same results. Marcus ______________________________________________________ The contents of this e-mail are privileged and/or confidenti...{{dropped}}
limma RankProd limma RankProd • 1.2k views
0
Entering edit mode
@gunther-honing-1868
Last seen 7.4 years ago
Dear list, currently I'm trying to process 35 cel-files of the Affymetrix Human U133 Plus 2.0 chip. I load the libraries and read the chips into an affybatch. Background correction works well but during normalisation I always receive: "Cannot allocate vector of size 352848...") I am using openlinux 10.2 with 2 GB RAM. Does anyone know a solution to this problem ? Using more RAM ? Gunther Code: library("affy") library("hgu133plus2probe") library("hgu133plus2cdf") chips <- ReadAffy() bg <- bg.correct(chips, method ="rma") norm <- normalize(bg, method ="quantiles")
0
Entering edit mode
Hi Gunther, Gunther H?ning wrote: > Dear list, > > currently I'm trying to process 35 cel-files of the Affymetrix Human U133 > Plus 2.0 chip. > > I load the libraries and read the chips into an affybatch. > Background correction works well but during normalisation I always receive: > "Cannot allocate vector of size 352848...") > > I am using openlinux 10.2 with 2 GB RAM. > Does anyone know a solution to this problem ? > Using more RAM ? Unless you specifically need to do things stepwise, a simple eset <- rma(chips) should suffice. This should work fine with that much RAM. Otherwise, yes. More RAM. Best, Jim > > Gunther > > Code: > > library("affy") > library("hgu133plus2probe") > library("hgu133plus2cdf") > chips <- ReadAffy() > bg <- bg.correct(chips, method ="rma") > norm <- normalize(bg, method ="quantiles") > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
0
Entering edit mode
James is correct. You could also try justRMA() in a fresh R session, preferably with minimal number of other programmes running in the background. Or ask a friend/collaborator who has access to a machine with more RAM or buy more RAM. Regards, Adai James W. MacDonald wrote: > Hi Gunther, > > Gunther H?ning wrote: >> Dear list, >> >> currently I'm trying to process 35 cel-files of the Affymetrix Human U133 >> Plus 2.0 chip. >> >> I load the libraries and read the chips into an affybatch. >> Background correction works well but during normalisation I always receive: >> "Cannot allocate vector of size 352848...") >> >> I am using openlinux 10.2 with 2 GB RAM. >> Does anyone know a solution to this problem ? >> Using more RAM ? > > Unless you specifically need to do things stepwise, a simple > > eset <- rma(chips) > > should suffice. This should work fine with that much RAM. > > Otherwise, yes. More RAM. > > Best, > > Jim > > >> Gunther >> >> Code: >> >> library("affy") >> library("hgu133plus2probe") >> library("hgu133plus2cdf") >> chips <- ReadAffy() >> bg <- bg.correct(chips, method ="rma") >> norm <- normalize(bg, method ="quantiles") >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > >