rhdf5, dataframes, and variable length strings
Bernd Fischer ▴ 550
@bernd-fischer-5348
Last seen 7.3 years ago
Germany / Heidelberg / DKFZ
Dear John!

Thank you very much for reporting this bug. I can reproduce it on my computer, but it will take some time to fix. I will let you know once it is fixed.

Best,
Bernd

On 28.10.2013, at 22:14, John at embl-heidelberg.de wrote:

> Hi all.
>
> I am working with large data frames in R that contain a mix of numbers and variable-length strings. I've tried using the rhdf5 package to write and then read these, and I haven't been able to figure out how to use the package correctly. I'll include a toy data frame that causes R to segfault, at least on my machine. I would greatly appreciate either some pointers about what I'm doing wrong or another way to store my data.
>
> rndString <- function(n=1){rndString <- c(1:n);for(i in 1:n){rndString[i] <- paste(sample(c(0:9,letters,LETTERS),sample(c(3:20),1),replace=TRUE),collapse="")};return(rndString)}
> library(rhdf5)
> n <- 1000000
> d <- data.frame(id=seq(n),name=rndString(n),val=rnorm(n),stringsAsFactors=FALSE)
> h5createFile("test.h5")
> h5write(d,file="test.h5",name="d")
> dd <- h5read("test.h5",name="d")
>
> John Estrada
>
> -- output of sessionInfo():
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] rhdf5_2.6.0
>
> loaded via a namespace (and not attached):
> [1] zlibbioc_1.8.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
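For anyone who wants to try the reproduction at a smaller scale, the quoted script can be sketched in a more readable form as below. This is only an illustration, not the fix: the helper name `rnd_strings`, the reduced `n`, the fixed seed, and the temporary file path are assumptions that are not part of the original report, and the rhdf5 step runs only if the package is installed.

```r
# Hypothetical, more readable variant of the rndString() helper from the
# quoted message: n random alphanumeric strings of length 3 to 20.
rnd_strings <- function(n = 1) {
  chars <- c(0:9, letters, LETTERS)
  vapply(seq_len(n), function(i) {
    len <- sample(3:20, 1)
    paste(sample(chars, len, replace = TRUE), collapse = "")
  }, character(1))
}

set.seed(1)   # assumption: fixed seed so the toy data is repeatable
n <- 1000     # assumption: much smaller than the 1000000 rows in the report

# Same column layout as the reported data frame: numbers plus
# variable-length strings, with strings kept as character vectors.
d <- data.frame(id = seq_len(n),
                name = rnd_strings(n),
                val = rnorm(n),
                stringsAsFactors = FALSE)

# The write/read round trip from the report. The segfault was described for
# rhdf5 2.6.0 with 1000000 rows; this sketch does not claim to trigger it.
if (requireNamespace("rhdf5", quietly = TRUE)) {
  h5file <- tempfile(fileext = ".h5")   # assumption: temp file, not "test.h5"
  rhdf5::h5createFile(h5file)
  rhdf5::h5write(d, file = h5file, name = "d")
  dd <- rhdf5::h5read(h5file, name = "d")
}
```

Checking `str(d)` before the write step confirms that `name` is a character column with strings of differing lengths, which is the situation described in the report.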
