Question: unable to convert data to snpMatrix using read.long
3 months ago by
menonm20 wrote:


I have a large dataset of 448 samples x 10053 SNPs that I am trying to read in as a snpMatrix. I converted the data to long format so it can be read in. This is what it currently looks like:

> head(forLD)
    V1         V2 V3 V4
1 snp1 ALT_ALT771  1  1
2 snp1 ALT_ALT772 NA  1
3 snp1 ALT_ALT773  0  1
4 snp1 ALT_ALT774  0  1
5 snp1 ALT_ALT775  2  1
6 snp1 ALT_ALT776  0  1                                                                                                                                                            

I have also tried replacing NA with 0, but it still doesn't work.

> samples<-unique(forLD[ ,2])

> ids<-unique(forLD[ ,1])

> geno1 <- read.long(forLD,ids,samples,fields=c(snp=1, sample=2, genotype=3, confidence=4),gcodes=c("0", "1", "2"),threshold=0.95)

Error in file(file[1], open = "rt") : invalid 'description' argument

Any idea what is going wrong here, and how I can convert this to a snpMatrix?

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] snpStats_1.20.0      Matrix_1.2-8         survival_2.40-1
[4] BiocInstaller_1.20.3 reshape_0.8.6        data.table_1.10.4

loaded via a namespace (and not attached):
[1] zlibbioc_1.16.0     plyr_1.8.4          parallel_3.2.3
[4] tools_3.2.3         Rcpp_0.12.9         splines_3.2.3
[7] grid_3.2.3          BiocGenerics_0.16.1 lattice_0.20-34




How did you create the forLD object? It looks like you read a file in and saved it as forLD, which appears to be a data.frame judging by the head() output above. The read.long function appears to take a file name, not a data.frame; the error comes from file(file[1], open = "rt"), which suggests it is trying to open its first argument as a path. You could try passing the file to read.long directly. If you have done something different, could you please provide more of the code?

written 12 weeks ago by shepherl ♦♦ 240
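
If forLD only exists as a data.frame in the session, one way to follow the suggestion above is to write it out to a plain text file and give read.long the path. The sketch below is only an illustration: forLD_toy and long_file are made-up names, the toy rows copy the head(forLD) output from the question, and the tab separator is an assumption. The read.long call is left as a comment because it is the call from the question with only the first argument changed; the remaining arguments and their order have not been verified and should be checked against ?read.long.

```r
# Toy stand-in for forLD (hypothetical name), copied from the head() output in the question
forLD_toy <- data.frame(
  V1 = rep("snp1", 6),
  V2 = paste0("ALT_ALT", 771:776),
  V3 = c(1, NA, 0, 0, 2, 0),
  V4 = 1
)

# Write the long-format table to a text file: no header, no row names, no quotes.
# The tab separator is an assumption; use whatever delimiter read.long is told to expect.
long_file <- tempfile(fileext = ".txt")
write.table(forLD_toy, file = long_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Check what actually ended up in the file
readLines(long_file, n = 2)

# Then pass the path, not the data.frame, as the first argument.
# Same call as in the question otherwise (unverified; see ?read.long):
# geno1 <- read.long(long_file, ids, samples,
#                    fields = c(snp = 1, sample = 2, genotype = 3, confidence = 4),
#                    gcodes = c("0", "1", "2"), threshold = 0.95)
```

Note that write.table writes missing genotypes as the string NA by default, so how read.long treats those rows depends on its genotype-code and no-call settings. If the original long-format file is still on disk, passing that file's path directly avoids the round trip.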
