Error when using impute to get missing values
@david-westergaard-5186
Hello,

I am currently working on the dataset from ArrayExpress, http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-1560. It has a lot of missing values filled by null, and I am trying to fill these in using impute.knn. However, when I try to do so, I get a lot of errors:

    *** caught segfault ***
    address 0x3f94a7e775b5bc59, cause 'memory not mapped'

as well as

    2000000007d68000-2000000007d78000 r-xp 00000000 08:03 235217090 /lib/libgcc_s.so.1

which causes R to crash.

Sample code looks like:

    # Read table, which contains two rows of headers
    Data <- read.table(file=file,header=FALSE,stringsAsFactors=FALSE,sep="\t",skip=2,na.string='null')
    hl <- readLines(file,2)
    hl <- strsplit(hl, '\t')
    names(Data) <- sub('_$', '', paste(hl[[1]], hl[[2]], sep="_"))
    # Select only those columns which have the actual preprocessed value,
    x <- c(1,grep("C57_T40_.*AGILENT_VALUE",names(Data),perl=TRUE))
    signals <- Data[,x]
    hest <- as.matrix(signals[,-1])
    # Error occurs at this step.
    hest2 <- impute.knn(hest)

Any help as to why this happens is greatly appreciated.

    > sessionInfo()
    R version 2.14.1 (2011-12-22)
    Platform: ia64-unknown-linux-gnu (64-bit)

    locale:
    [1] C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] impute_1.28.0

Best Regards,
David Westergaard

Tim Triche @tim-triche-3561

This is a recurring problem with impute. I've tried tracing it and eventually hit a dead end; if the bug is reproducible (it isn't always, for me), running R as a gdb subprocess might help debug it. If you want, I can give it a shot, time permitting; assuming this happens with this experiment every time, send me a script to reproduce it (as in, retrieve the data, put it in a matrix, and try imputing it) and I will see what I can do.
Or ask the maintainer, Balasubramanian Narasimhan (help(package='impute') for email address), who may be able to do it faster. Or he may be busier. Could go either way :-)
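Before sending a reproduction script of the kind requested above, it may be worth confirming that the matrix handed to impute.knn is numeric and not dominated by missing values. The sketch below is only an input sanity check and does not diagnose the segfault itself. `check_matrix` is a hypothetical helper, not part of impute, and the `rowmax = 0.5` and `colmax = 0.8` thresholds are assumed to match the defaults listed in `?impute.knn`; please verify them against your installed version. The `toy` matrix is a made-up stand-in for `hest`.

```r
# Hypothetical helper (not part of the impute package): summarise a matrix
# before passing it to an imputation routine.
check_matrix <- function(m, rowmax = 0.5, colmax = 0.8) {
  stopifnot(is.matrix(m))
  missing <- is.na(m)
  list(
    mode         = storage.mode(m),               # "double" or "integer" expected
    dim          = dim(m),
    frac_missing = mean(missing),                 # overall fraction of NA cells
    rows_over    = sum(rowMeans(missing) > rowmax),  # rows above the row threshold
    cols_over    = sum(colMeans(missing) > colmax)   # columns above the column threshold
  )
}

# Toy stand-in for `hest` (assumption: replace with the real matrix).
toy <- matrix(c(1, NA, 3,
                NA, NA, 6,
                7, 8, 9,
                NA, NA, NA), nrow = 3)

report <- check_matrix(toy)
str(report)
```

A character storage mode would suggest that a non-numeric column survived the `grep` selection, since `as.matrix` on a data frame with mixed column types returns a character matrix.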
On 03/24/2012 10:07 AM, Tim Triche, Jr. wrote:
> This is a recurring problem with impute. I've tried tracing it and
> eventually hit a dead end; if the bug is reproducible (it isn't always, for
> me), running R as a gdb subprocess might help debugging it.

Also, on Linux at any rate, it's easy to run

    R -d valgrind -f script.R

and this usually points to the problem. valgrind is slow, so it pays to make the example minimal (save hest, and then the commands in script.R will load impute, load the data, and evaluate impute.knn).

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
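The minimal example described above could be prepared roughly as follows. This is a sketch under stated assumptions: the file name `hest.rda` is made up for illustration, and the random matrix is only a placeholder for the real `hest` from the failing session. The block itself only writes two files; running the generated script.R requires the impute package to be installed.

```r
# Placeholder for the real `hest` matrix (assumption: in practice, use the
# matrix from the session in which impute.knn crashes).
set.seed(1)
hest <- matrix(rnorm(200), nrow = 20)
hest[sample(length(hest), 20)] <- NA   # knock out 20 distinct cells

# Save the matrix so the script does not have to repeat the parsing steps.
# "hest.rda" is a hypothetical file name.
save(hest, file = "hest.rda")

# Write the three commands: load impute, load the data, evaluate impute.knn.
writeLines(c(
  "library(impute)",
  "load('hest.rda')",
  "hest2 <- impute.knn(hest)"
), "script.R")

# Then, from a shell, run the command given in the reply:
#   R -d valgrind -f script.R
```

Saving the already-parsed matrix keeps the slow valgrind run focused on impute.knn and makes the two files easy to send to someone else.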