Hi,
I have a large dataset of 448 samples * 10053 SNPs that I am trying to read in as a snpMatrix. I converted the data to long format so that it can be read in. This is how it currently looks:
> head(forLD)
V1 V2 V3 V4
1 snp1 ALT_ALT771 1 1
2 snp1 ALT_ALT772 NA 1
3 snp1 ALT_ALT773 0 1
4 snp1 ALT_ALT774 0 1
5 snp1 ALT_ALT775 2 1
6 snp1 ALT_ALT776 0 1
I have also tried replacing NA with 0, but it still doesn't work.
> samples<-unique(forLD[ ,2])
> ids<-unique(forLD[ ,1])
> geno1 <- read.long(forLD,ids,samples,fields=c(snp=1, sample=2, genotype=3, confidence=4),gcodes=c("0", "1", "2"),threshold=0.95)
Error in file(file[1], open = "rt") : invalid 'description' argument
Any idea what is going wrong here, and how I can convert this to a snpMatrix?
sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] snpStats_1.20.0 Matrix_1.2-8 survival_2.40-1
[4] BiocInstaller_1.20.3 reshape_0.8.6 data.table_1.10.4
loaded via a namespace (and not attached):
[1] zlibbioc_1.16.0 plyr_1.8.4 parallel_3.2.3
[4] tools_3.2.3 Rcpp_0.12.9 splines_3.2.3
[7] grid_3.2.3 BiocGenerics_0.16.1 lattice_0.20-34
How did you create the forLD object? It looks like you read a file in and saved it as forLD, which appears to be a data.frame judging from the output above. The read.long function appears to take a file, not a data.frame, so you could try reading the file in directly. If you have done something different, could you please provide more of the code?
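As a rough, untested sketch of that idea: if the original long-format file is no longer to hand, the data.frame could be written back out to a text file and the file name passed to read.long. The toy forLD below is a hypothetical stand-in for your data, long_file is just a temporary path, and the argument names (samples, snps, no.call) are my reading of ?read.long, so please check them against the help page for your snpStats version. I have left out threshold here because every confidence value in the toy data is 1.

```r
# Toy stand-in for forLD (hypothetical values, same four-column layout:
# SNP id, sample id, genotype, confidence).
forLD <- data.frame(
  V1 = rep(c("snp1", "snp2"), each = 3),
  V2 = rep(c("ALT_ALT771", "ALT_ALT772", "ALT_ALT773"), times = 2),
  V3 = c(1, NA, 0, 2, 0, 1),
  V4 = 1,
  stringsAsFactors = FALSE
)

samples <- as.character(unique(forLD$V2))
ids     <- as.character(unique(forLD$V1))

# Write the long-format table to a plain tab-separated file,
# with no header, row names or quotes.
long_file <- tempfile(fileext = ".txt")
write.table(forLD, file = long_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE, na = "NA")

# Pass the file name (not the data.frame) as the first argument.
# Assumption: "NA" in the genotype column is declared as the no-call code.
if (requireNamespace("snpStats", quietly = TRUE)) {
  geno1 <- snpStats::read.long(
    long_file,
    samples = samples,
    snps    = ids,
    fields  = c(snp = 1, sample = 2, genotype = 3, confidence = 4),
    gcodes  = c("0", "1", "2"),
    no.call = "NA"
  )
  print(geno1)
}
```

If you still have the file that forLD was read from, passing that path straight to read.long avoids the round trip through write.table.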