Question: Write gds in parallel
Asked 11 months ago by Vinicius Henrique da Silva (Brazil)

I am trying to write values to a GDS node in parallel, but I have not been able to figure out what I am doing wrong. This is what I tried:

library(SNPRelate)

# Load data
data(hapmap_geno)
# Create a gds file
snpgdsCreateGeno("test.gds", genmat = hapmap_geno$genotype,
    sample.id = hapmap_geno$sample.id, snp.id = hapmap_geno$snp.id,
    snp.chromosome = hapmap_geno$snp.chromosome,
    snp.position = hapmap_geno$snp.position,
    snp.allele = hapmap_geno$snp.allele, snpfirstdim=TRUE)

# Open the GDS file
(genofile <- snpgdsOpen("test.gds", allow.fork=TRUE, readonly=FALSE))

snps.included <- read.gdsn(index.gdsn(genofile, "snp.id"))
all.samples <- read.gdsn(index.gdsn(genofile, "sample.id"))

LRR.matrix <- matrix(0, nrow = length(snps.included),
                      ncol = length(all.samples))

nLRR <- gdsfmt::add.gdsn(genofile, "LRR", LRR.matrix, replace=TRUE)
gdsfmt::read.gdsn(nLRR)

lrr.import <- function(lo, genofile, snps.included, nLRR) {
    message(lo)
    # Simulated LRR values for one sample (column `lo` of the LRR node)
    LRR.x <- rnorm(length(snps.included))
    gdsfmt::write.gdsn(nLRR, LRR.x, start = c(1, lo),
                       count = c(length(snps.included), 1))
}
### Working (1 worker)
multicoreParam <- BiocParallel::MulticoreParam(workers = 1)
message("Start the parallel import of LRR/BAF values")
BiocParallel::bplapply(seq_along(all.samples), lrr.import, BPPARAM = multicoreParam,
                       genofile = genofile, snps.included = snps.included, nLRR = nLRR)

### NOT working (3 workers)
multicoreParam <- BiocParallel::MulticoreParam(workers = 3)
message("Start the parallel import of LRR/BAF values")
BiocParallel::bplapply(seq_along(all.samples), lrr.import, BPPARAM = multicoreParam,
                       genofile = genofile, snps.included = snps.included, nLRR = nLRR)

With three workers, the call fails with:

Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: Creating/Opening and writing the GDS file should be in the same process.

I would be grateful for some help.

Tags: biocparallel, snprelate, gds
Answer: Write gds in parallel
Answered 11 months ago by zhengx (United States)

The short answer: I have not yet implemented writing to a single GDS file from multiple processes at the same time.

The usual workaround is to create multiple independent GDS files, with each process writing to its own file. After all child processes have finished, the separate files are merged in the main process.
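
A minimal sketch of that workaround, under some assumptions: the matrix sizes, the temporary directory, the per-sample file names (`LRR_part_001.gds`, ...), and the helper `write.part` are all made up for illustration, and `rnorm()` stands in for real LRR values as in the question. Each worker creates, writes, and closes its own file, so creating and writing always happen in the same process; the main process then copies each part into one column of the final `LRR` node.

```r
library(gdsfmt)
library(BiocParallel)

# Hypothetical sizes for illustration only
n.snp  <- 100L
n.samp <- 6L
tmp.dir <- tempdir()

# Worker: create, write, and close a separate GDS file for one sample.
# Calls are namespace-qualified so the function also works with SnowParam.
write.part <- function(idx, n.snp, dir) {
    fn <- file.path(dir, sprintf("LRR_part_%03d.gds", idx))
    f <- gdsfmt::createfn.gds(fn)
    on.exit(gdsfmt::closefn.gds(f))
    gdsfmt::add.gdsn(f, "LRR", val = rnorm(n.snp))  # simulated values
    fn
}

# MulticoreParam mirrors the question; on Windows, SnowParam may be needed
param <- MulticoreParam(workers = 2)
part.files <- unlist(bplapply(seq_len(n.samp), write.part,
                              n.snp = n.snp, dir = tmp.dir,
                              BPPARAM = param))

# Main process: merge the parts into one SNP-by-sample node
out <- createfn.gds(file.path(tmp.dir, "merged.gds"))
nLRR <- add.gdsn(out, "LRR", storage = "float64",
                 valdim = c(n.snp, n.samp))
for (i in seq_along(part.files)) {
    f <- openfn.gds(part.files[i])
    x <- read.gdsn(index.gdsn(f, "LRR"))
    closefn.gds(f)
    # Same start/count pattern as in the question, but in a single process
    write.gdsn(nLRR, x, start = c(1, i), count = c(n.snp, 1))
}
merged <- read.gdsn(nLRR)
closefn.gds(out)

# Remove the intermediate per-sample files
unlink(part.files)
```

In a real import the merge target would be the existing genotype file opened for writing in the main process rather than a new `merged.gds`; only the workers need their own files.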

 
