Write gds in parallel
@vinicius-henrique-da-silva-6713
Last seen 18 months ago
Brazil

I am trying to write values to a GDS node in parallel, but I cannot figure out what I am doing wrong. I have tried the following:

# SNPRelate attaches gdsfmt, which provides read.gdsn() and index.gdsn()
library(SNPRelate)

# Load data
data(hapmap_geno)
# Create a gds file
snpgdsCreateGeno("test.gds", genmat = hapmap_geno$genotype,
    sample.id = hapmap_geno$sample.id, snp.id = hapmap_geno$snp.id,
    snp.chromosome = hapmap_geno$snp.chromosome,
    snp.position = hapmap_geno$snp.position,
    snp.allele = hapmap_geno$snp.allele, snpfirstdim=TRUE)

# Open the GDS file
(genofile <- snpgdsOpen("test.gds", allow.fork=TRUE, readonly=FALSE))

snps.included <- read.gdsn(index.gdsn(genofile, "snp.id"))
all.samples <- read.gdsn(index.gdsn(genofile, "sample.id"))

LRR.matrix <- matrix(0, nrow = length(snps.included),
                      ncol = length(all.samples))

nLRR <- gdsfmt::add.gdsn(genofile, "LRR", LRR.matrix, replace=TRUE)
gdsfmt::read.gdsn(nLRR)

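lrr.import <- function(lo, genofile, snps.included, nLRR) {
    message(lo)
    # Simulated LRR values for the sample in column `lo`
    LRR.x <- rnorm(length(snps.included))
    # Write one full column (all SNPs) at column index `lo`
    gdsfmt::write.gdsn(nLRR, LRR.x, start = c(1, lo),
                       count = c(length(snps.included), 1))
}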
### Working
multicoreParam <-  BiocParallel::MulticoreParam(workers = 1)
message("Start the parallel import of LRR/BAF values")
BiocParallel::bplapply(1:length(all.samples), lrr.import, BPPARAM = multicoreParam, genofile=genofile,
                           snps.included=snps.included, nLRR=nLRR)

### NOT Working
multicoreParam <-  BiocParallel::MulticoreParam(workers = 3)
message("Start the parallel import of LRR/BAF values")
BiocParallel::bplapply(1:length(all.samples), lrr.import, BPPARAM = multicoreParam, genofile=genofile,
                           snps.included=snps.included, nLRR=nLRR)

 

Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: Creating/Opening and writing the GDS file should be in the same process.

 

I would be grateful for some help.

gds SNPRelate BiocParallel • 539 views
zhengx ▴ 30
@zhengx-7950
Last seen 2.2 years ago
United States

The answer is simple: I have not yet implemented writing to a GDS file from multiple processes at the same time.

We usually create multiple independent GDS files, and each process writes its data to its own separate GDS file. After all child processes are done, the GDS files are merged in the main process.
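A minimal sketch of this pattern, under some assumptions of mine rather than anything fixed by gdsfmt: the matrix sizes are made up, the helper `write.part`, the file names `LRR_part_XX.gds` and the `columns` node are hypothetical, and the sample columns are split into contiguous chunks. Each worker creates, writes and closes its own file, so creating and writing always happen in the same process. `MulticoreParam` forks, so this assumes a Unix-like system.

```r
library(gdsfmt)

# Made-up sizes for illustration only
n.snp <- 100L
n.sample <- 12L

# Split the sample columns into contiguous chunks, one chunk per output file
idx <- seq_len(n.sample)
chunks <- split(idx, cut(idx, 3, labels = FALSE))

# Hypothetical worker: creates its own GDS file, writes its chunk, closes it
write.part <- function(i, chunks, n.snp, dir) {
    cols <- chunks[[i]]
    fn <- file.path(dir, sprintf("LRR_part_%02d.gds", i))
    f <- gdsfmt::createfn.gds(fn)
    on.exit(gdsfmt::closefn.gds(f))
    # Simulated LRR values, one column per sample in this chunk
    vals <- matrix(rnorm(n.snp * length(cols)), nrow = n.snp)
    gdsfmt::add.gdsn(f, "LRR", vals)
    # Remember which columns of the final matrix this file holds
    gdsfmt::add.gdsn(f, "columns", cols)
    fn
}

tmp.dir <- tempdir()
part.files <- BiocParallel::bplapply(seq_along(chunks), write.part,
    chunks = chunks, n.snp = n.snp, dir = tmp.dir,
    BPPARAM = BiocParallel::MulticoreParam(workers = 3))

# Merge in the main process: only this process touches the final file
main <- createfn.gds(file.path(tmp.dir, "merged.gds"))
nLRR <- add.gdsn(main, "LRR", matrix(0, nrow = n.snp, ncol = n.sample))
for (fn in unlist(part.files)) {
    part <- openfn.gds(fn)
    cols <- read.gdsn(index.gdsn(part, "columns"))
    vals <- read.gdsn(index.gdsn(part, "LRR"))
    # Chunks are contiguous, so one block write per part file is enough
    write.gdsn(nLRR, vals, start = c(1, min(cols)),
               count = c(n.snp, length(cols)))
    closefn.gds(part)
}
merged <- read.gdsn(nLRR)
closefn.gds(main)
```

Writing a chunk of columns per file, rather than one file per sample, keeps the number of temporary files small. The part files can be deleted once the merge has finished.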

 
