Error in RankProd package

0

Entering edit mode

Marcus Davy ▴ 680

@marcus-davy-374

Last seen 9.6 years ago

Error in RankProd package, specifically related to the RP and topGene functions and the way RP, handles missing values for an entire row of observations for a gene. Filtering out rows of missing values will get around this. The help(RP) documentation provides the following information about NA handling; na.rm: if 'FALSE' (default), the NA value will not be used in computing rank. If 'TRUE', the missing values will be replaced by the gene-wise mean of the non-missing values. Gene with all values missing will be assigned "NA" So, in the case where an entire row of observations for a gene are NAs the rank product should be assigned NAs. To illustrate the errors I have generated a 10 by 4 matrix from a normal distribution with two rows of missing values; library(RankProd) set.seed(1) nrows <- 10 ncols <- 4 X <- round(matrix(rnorm(nrows * ncols), nc=ncols),2) X[2:3,] <- NA X RP.out <- RP(X, cl=rep(1,ncols), na.rm=TRUE) RP.out$RPrank RP.out$RPrank[2:3,] # I think the missing values should be ranked at the end # in one case and at the beginning in the other. This # can be controlled in the function 'rank' with the # na.last argument RP.out$Orirank[["class1 > class 2"]] # Additional space in list name RankProd1(X, logged=TRUE, num.class=1, rev.sorting=TRUE)$rank.all RP.out$Orirank[["class1 < class2"]] RankProd1(X, logged=TRUE, num.class=1, rev.sorting=FALSE)$rank.all RP.out$pval tG <- topGene(RP.out, cutoff = 0.1, method="pval", logged=FALSE, logbase=2, gene.names=rownames(X)) # cutoff=0.1 for illustration # First list from topGene is incorrect, filters genes with p-values > 0.1 >TG$Table1 # Outputs: gene.index RP/Rsum FC:(class1/class2) pfp P.value 4 4 2.5149 -0.6625 0.2025 0.0405 2 2 NA NaN 1.1089 0.9980 5 5 3.7417 0.1725 0.5050 0.2020 3 3 NA NaN 0.9980 0.9980 6 6 2.1147 -0.3325 0.1700 0.0170 # List ok TG$Table2 # A row of missing values should not be highly ranked in the first three genes >topGene(RP.out, logged=FALSE, num.gene=3, logbase=2, gene.names=rownames(X)) # Outputs; Table1: Genes called significant under class1 < class2 Table2: Genes called significant under class1 > class2 $Table1 gene.index RP/Rsum FC:(class1/class2) pfp P.value 4 4 2.5149 -0.6625 0.2025 0.0405 2 2 NA NaN 1.1089 0.9980 5 5 3.7417 0.1725 0.5050 0.2020 $Table2 gene.index RP/Rsum FC:(class1/class2) pfp P.value 1 1 1.6266 0.79 0.035 0.0035 7 7 5.1800 -0.02 0.555 0.3885 3 3 NA NaN 0.998 0.9980 > sessionInfo() R version 2.4.1 (2006-12-18) powerpc-apple-darwin8.8.0 locale: C attached base packages: [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" [7] "base" other attached packages: limma RankProd "2.9.8" "2.6.0" This was also tested on the BioC 2.0 development release of RankProd 2.7.0 and gives the same results. Marcus ______________________________________________________ The contents of this e-mail are privileged and/or confidenti...{{dropped}}

limma RankProd limma RankProd • 1.8k views

ADD COMMENT • link updated 17.1 years ago by Gunther Höning ▴ 100 • written 17.1 years ago by Marcus Davy ▴ 680

0

Entering edit mode

Gunther Höning ▴ 100

@gunther-honing-1868

Last seen 9.6 years ago

Dear list, currently I'm trying to process 35 cel-files of the Affymetrix Human U133 Plus 2.0 chip. I load the libraries and read the chips into an affybatch. Background correction works well but during normalisation I always receive: "Cannot allocate vector of size 352848...") I am using openlinux 10.2 with 2 GB RAM. Does anyone know a solution to this problem ? Using more RAM ? Gunther Code: library("affy") library("hgu133plus2probe") library("hgu133plus2cdf") chips <- ReadAffy() bg <- bg.correct(chips, method ="rma") norm <- normalize(bg, method ="quantiles")

ADD COMMENT • link 17.1 years ago Gunther Höning ▴ 100

0

Entering edit mode

Hi Gunther, Gunther H?ning wrote: > Dear list, > > currently I'm trying to process 35 cel-files of the Affymetrix Human U133 > Plus 2.0 chip. > > I load the libraries and read the chips into an affybatch. > Background correction works well but during normalisation I always receive: > "Cannot allocate vector of size 352848...") > > I am using openlinux 10.2 with 2 GB RAM. > Does anyone know a solution to this problem ? > Using more RAM ? Unless you specifically need to do things stepwise, a simple eset <- rma(chips) should suffice. This should work fine with that much RAM. Otherwise, yes. More RAM. Best, Jim > > Gunther > > Code: > > library("affy") > library("hgu133plus2probe") > library("hgu133plus2cdf") > chips <- ReadAffy() > bg <- bg.correct(chips, method ="rma") > norm <- normalize(bg, method ="quantiles") > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.

ADD REPLY • link 17.1 years ago James W. MacDonald 65k

0

Entering edit mode

James is correct. You could also try justRMA() in a fresh R session, preferably with minimal number of other programmes running in the background. Or ask a friend/collaborator who has access to a machine with more RAM or buy more RAM. Regards, Adai James W. MacDonald wrote: > Hi Gunther, > > Gunther H?ning wrote: >> Dear list, >> >> currently I'm trying to process 35 cel-files of the Affymetrix Human U133 >> Plus 2.0 chip. >> >> I load the libraries and read the chips into an affybatch. >> Background correction works well but during normalisation I always receive: >> "Cannot allocate vector of size 352848...") >> >> I am using openlinux 10.2 with 2 GB RAM. >> Does anyone know a solution to this problem ? >> Using more RAM ? > > Unless you specifically need to do things stepwise, a simple > > eset <- rma(chips) > > should suffice. This should work fine with that much RAM. > > Otherwise, yes. More RAM. > > Best, > > Jim > > >> Gunther >> >> Code: >> >> library("affy") >> library("hgu133plus2probe") >> library("hgu133plus2cdf") >> chips <- ReadAffy() >> bg <- bg.correct(chips, method ="rma") >> norm <- normalize(bg, method ="quantiles") >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > >

ADD REPLY • link 17.1 years ago Adaikalavan Ramasamy ★ 1.8k

Login before adding your answer.