Question: GENESIS pcrelate() write gds file doesn't work
7 months ago by
o.giannakopoulou0 wrote:

Hello,

I'm trying to save the output of pcrelate in GDS format using write.to.gds=TRUE in my command, but it doesn't work. I'm getting the following error: Error in .local(gdsobj, ...) : unused argument (write.to.gds = TRUE)

I have installed GENESIS following the Bioconductor instructions:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GENESIS", version = "3.8")

However, when I run sessionInfo() in R 2.5.0, it shows that the attached package is GENESIS_2.12.4. I don't know if that makes any difference.

With write.to.gds=FALSE the command works fine, but I have quite a large dataset, so I would like to save the output as GDS.

Best regards, Olga

gds genesis pcrelate • 217 views
modified 7 months ago by Stephanie M. Gogarten720 • written 7 months ago by o.giannakopoulou0

Is that really R 2.5.0? That is 2007!

I just noticed that reply. It was a typo: I was using R 3.5.

Answer: GENESIS pcrelate() write gds file doesn't work
7 months ago by
University of Washington
Stephanie M. Gogarten720 wrote:

The write.to.gds argument is no longer an option for pcrelate, because the output format has changed. Previously, the N x N matrices of kinship and IBD sharing coefficients were returned (or written to GDS) in full. This information is now returned as a pairwise table, which can be transformed into a matrix (with options for sparsity) using the function pcrelateToMatrix. If you then want to save that matrix in a GDS file, you can use the mat2gds function, but that will result in a much larger file than the sparse format provided by the Matrix package.
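To illustrate how a pairwise table maps onto a sparse symmetric matrix, here is a minimal sketch using only the Matrix package. It is not the GENESIS implementation; in practice pcrelateToMatrix is the function to call. The toy data, the column names (ID1, ID2, kin, ID, f), and the choice to put the self-kinship (1 + f)/2 on the diagonal are assumptions made for this example.

```r
library(Matrix)

# Toy stand-ins for the two data.frames (made-up values, assumed column names)
kinBtwn <- data.frame(ID1 = c("A", "A", "B"),
                      ID2 = c("B", "C", "C"),
                      kin = c(0.25, 0.00, 0.125),
                      stringsAsFactors = FALSE)
kinSelf <- data.frame(ID = c("A", "B", "C"),
                      f  = c(0.00, 0.01, 0.00),
                      stringsAsFactors = FALSE)

ids <- kinSelf$ID
n   <- length(ids)

# Translate sample IDs into row/column positions
a <- match(kinBtwn$ID1, ids)
b <- match(kinBtwn$ID2, ids)

# Only non-zero pairs need to be stored in a sparse matrix
keep <- kinBtwn$kin != 0

# Build the upper triangle plus the diagonal; symmetric = TRUE mirrors it
kinmat <- sparseMatrix(
  i = c(pmin(a, b)[keep], seq_len(n)),
  j = c(pmax(a, b)[keep], seq_len(n)),
  x = c(kinBtwn$kin[keep], (1 + kinSelf$f) / 2),
  dims = c(n, n),
  dimnames = list(ids, ids),
  symmetric = TRUE
)
```

Because only the non-zero upper-triangle entries are stored, an object like this stays far smaller than a dense N x N matrix when most pairs are unrelated.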

We have been working with a dataset of ~100,000 samples, and found that the previous version of pcrelate was unable to handle that many samples in any reasonable amount of time (and writing incrementally to GDS files was part of the problem). The best solution for very large sample sizes seems to be to run sample blocks in parallel. Currently that is not documented (as we're still working it out), but the next release of GENESIS will have options for using the functions that are currently internal to pcrelate independently for best performance in large datasets.

Thank you, Stephanie, for the swift reply. I am a new user, and I think I found that option in an older GENESIS vignette. Just to confirm: the two pcrelate outputs in the current version are the data.frames kinBtwn and kinSelf, right?

Many thanks again for the help, Olga

Yes, that is correct.

Thanks, Stephanie, for the confirmation. I'm working on a cluster with running-time limits, so I'm having some problems setting up the pipeline, and I was wondering if you could give me any advice. After some attempts, I have managed to run the PC-Relate command on my dataset and I have saved the two data.frames (mypcrelate$kinBtwn and mypcrelate$kinSelf). I'm not sure how to call pcrelateToMatrix with these two files as input, though. I'm interested in using the PC-Relate output as input for PC-AiR. Since I'm not sure how to recreate the mypcrelate matrix, I tried to create a KINGmat from these files using "kingToMatrix", passing the two files as .kin0 and .kin. However, it didn't work; it gives the error "Error in FUN(X[[i]], ...) : Input is empty or only contains BOM or terminal control characters". Any advice would be more than welcome, since I am really keen on running the GENESIS pipeline on my non-European dataset, both for PCs and for relatedness estimation.

pcrelateToMatrix takes the entire output object of pcrelate (the list of two data.frames) as its first argument:

pcmat <- pcrelateToMatrix(mypcrelate)
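Since the whole object is what pcrelateToMatrix expects, one option on a cluster with time limits is to save the complete pcrelate result in one job and reload it in the next, rather than saving the two data.frames separately. The sketch below uses base R's saveRDS/readRDS on a toy list standing in for the real output; the toy values and the file name are assumptions for illustration only.

```r
# Toy stand-in for the object returned by pcrelate() (made-up values)
mypcrelate <- list(
  kinBtwn = data.frame(ID1 = "A", ID2 = "B", kin = 0.25),
  kinSelf = data.frame(ID = c("A", "B"), f = c(0, 0.01))
)

# Job 1: save the entire object, not its two pieces
rds_file <- tempfile(fileext = ".rds")  # hypothetical path; use a real one on the cluster
saveRDS(mypcrelate, rds_file)

# Job 2: reload it with its structure intact
restored <- readRDS(rds_file)

# With GENESIS loaded, the restored object can then be passed on as before:
# pcmat <- pcrelateToMatrix(restored)
```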


Thanks for the amazing support

For technical reasons I have to use R 3.4.1 and GENESIS_2.8.1 instead of the new version, so I have to adapt my script to the older version, but I am stuck. I have calculated the KING-robust estimates using the snpgdsIBDKING function from the SNPRelate package. However, when I try to create the KING matrix in this way:

KINGmat <- king2mat(file.kin0 = king$IBS0, file.kin = king$kinship, iids = iids)

It gives an "Error in read.table(file.kin0, header = TRUE) : 'file' must be a character string or connection"

Many thanks for the great help again

You only need to use king2mat to import results from the command-line version of KING (where file.kin0 and file.kin are the paths to text files output by that software). If you are using snpgdsIBDKING, you already have the matrix in king$kinship.

Many thanks for the help. I had used king$kinship, but initially it was not working because the colnames and rownames were different. I changed them and it is working fine now.
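The row and column name fix mentioned in the last reply might look like the sketch below. It uses a toy list that mimics the sample.id and kinship elements returned by snpgdsIBDKING; the values are made up, and it assumes the names should be the sample IDs in the same order as the matrix rows.

```r
# Toy stand-in for the list returned by SNPRelate::snpgdsIBDKING() (made-up values)
king <- list(
  sample.id = c("S1", "S2", "S3"),
  kinship = matrix(c(0.50, 0.25, 0.00,
                     0.25, 0.50, 0.12,
                     0.00, 0.12, 0.50),
                   nrow = 3, byrow = TRUE)
)

# The kinship matrix is used directly; no king2mat call is needed
KINGmat <- king$kinship

# Label rows and columns with the sample IDs so downstream functions can match samples
dimnames(KINGmat) <- list(king$sample.id, king$sample.id)
```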