I am trying to use the crlmm package to import and analyze IDAT data from the Infinium CytoSNP-850K v1.2 BeadChip platform. Since there is no annotation package for it, I am using the 'nopackage' option and importing the annotation from an adapted manifest file. However, I seem to have an issue in the gender verification step (most probably with the YIndex), which blocks the import of the data. Here are the commands I am using:
library(crlmm)
library(ff)
samplesheet <- read.csv("samplesTest.csv", header=TRUE, as.is=TRUE)
anno <- read.csv("Manifest.850K.csv", header=TRUE, as.is=TRUE)
arrayNames <- file.path(datadir, unique(samplesheet[, "SentrixPosition"]))
arrayInfo <- list(barcode=NULL, position="SentrixPosition")
batch <- rep("1", nrow(samplesheet))
cnSet <- crlmmIllumina(sampleSheet=samplesheet, gender=samplesheet$gender,
                       arrayNames=arrayNames, arrayInfoColNames=arrayInfo,
                       highDensity=TRUE, call.method='krlmm', cdfName="nopackage",
                       batch=batch, anno=anno, genome="hg19", copynumber=TRUE,
                       nopackage.norm="loess", verbose=TRUE)
Here is the head of the annotation file:
chromosome,position,featureNames,isSnp,IlmnID,IlmnStrand,SNP,AddressA_ID,AlleleA_ProbeSeq,AddressB_ID,AlleleB_ProbeSeq,GenomeBuild,chr,MapInfo,Ploidy,Species,Source,SourceVersion,SourceStrand
1,157255396,rs1000073,TRUE,rs1000073-131_T_R_1893958059,TOP,[A/G],95804186,ACCTAGCACTATTTTCTAGTGCTCCATCTCTTAGCAGGGACTCTGTTCAG,,,37,1,157255396,diploid,Homo sapiens,dbSNP,131,BOT
1,114471189,rs1000528,TRUE,rs1000528-138_B_R_2276261971,BOT,[T/C],89693853,AAAGCCAAATGACTTCCCTTAAAAGGTACTTCAGCGCATTTTACACAAAT,,,37,1,114471189,diploid,Homo sapiens,dbSNP,138,TOP
4,190223523,rs10006955,TRUE,rs10006955-131_B_F_1893959315,BOT,[T/C],5731264,CCTGCCCCCCTCCACCCCGATCTTGGTCTAGTTTTAGCCATATCACTTGT,,,37,4,190223523,diploid,Homo sapiens,dbSNP,131,BOT
4,153765098,rs10007643,TRUE,rs10007643-138_B_F_2276262148,BOT,[T/C],98627830,ACACGGGATTTGTGCCCTCCCCTGACTTGTGGCCAGGAGGCTTCTACCAC,,,37,4,153765098,diploid,Homo sapiens,dbSNP,138,BOT
4,182218143,rs10018479,TRUE,rs10018479-138_T_R_2276262914,TOP,[A/C],64751192,ACTTCAGGCCAAAAAAGCACAGAGATACAAAAGACATGACAATATCCCTG,,,37,4,182218143,diploid,Homo sapiens,dbSNP,138,BOT
4,25119575,rs10018563,TRUE,rs10018563-138_B_F_2276262922,BOT,[T/C],20725290,GTGGTCATGGGCCAGCAGTGTGGGCACGCCCTAGGGATTTGCTAGAGATG,,,37,4,25119575,diploid,Homo sapiens,dbSNP,138,BOT
4,182048302,rs10020564,TRUE,rs10020564-138_B_F_2276263071,BOT,[T/C],95780854,GAGAGAATGCACCACAAGAACAAGCAAATTGAATGTAGTGACAAACAGAG,,,37,4,182048302,diploid,Homo sapiens,dbSNP,138,BOT
4,123095703,rs10021037,TRUE,rs10021037-138_T_R_2276263103,TOP,[A/G],81660971,GAAAGAACAGAGAAAGAGAAGGAACCTGTTATAGGAAGGAAAAAACAGCA,,,37,4,123095703,diploid,Homo sapiens,dbSNP,138,BOT
4,85281687,rs10021127,TRUE,rs10021127-138_T_R_2276263109,TOP,[A/C],33720106,GTTGGATATCTACTATGTGATTAAAAAAAACGCATATATAACCACAGGCA,,,37,4,85281687,diploid,Homo sapiens,dbSNP,138,BOT
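Since I suspect the Y index, here is a minimal sketch of how the chromosome coding in the annotation can be tabulated. The toy data frame `anno_toy` and the helper `count_by_chromosome` are hypothetical stand-ins; only the column names `chromosome` and `isSnp` come from my file, and I am only assuming that crlmm identifies chromosome Y markers through the `chromosome` column.

```r
# Hypothetical toy annotation using the same column names as my manifest
anno_toy <- data.frame(
  chromosome   = c("1", "4", "X", "Y", "Y", "XY"),
  featureNames = c("rs1", "rs2", "rs3", "rs4", "cnv5", "rs6"),
  isSnp        = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Count markers per chromosome label, split into SNP and non-SNP probes
count_by_chromosome <- function(anno) {
  table(chromosome = anno$chromosome, isSnp = anno$isSnp)
}

counts <- count_by_chromosome(anno_toy)
print(counts)

# Number of SNP markers labelled "Y"; a count of 0 or 1 would look suspicious
n_y_snps <- sum(anno_toy$chromosome == "Y" & anno_toy$isSnp)
print(n_y_snps)
```

Running the same two checks on the real `anno` object would show whether chromosome Y is coded as "Y", as a number, or in some other way, and how many Y SNPs there are.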
And this is the output of the analysis if I include the gender information:
Instantiate CNSet container.
path arg not set. Assuming files are in local directory, or that complete path is provided
Initializing container for genotyping and copy number estimation
Processing sample stratum 1 of 1
'path' arg not set. Assuming files are in local directory, or that complete path is provided in 'arrayNames'
Finished preprocessing.
Begin genotyping...
Start computing log-ratios
-- Processing segment 1 out of 3
-- Processing segment 2 out of 3
-- Processing segment 3 out of 3
Leaving out non-variant SNPs
Start calculating 3-cluster parameters
Done calculating 3-cluster parameters
Start calculating 2-cluster parameters
Done calculating 2-cluster parameters
Start calculating 1-cluster parameters
Done calculating 1-cluster parameters
Done calculating platform-specific coefficients to predict number of clusters
Start predicting number of clusters
Done predicting number of clusters
Start assigning calls
-- Processing segment 1 out of 2
-- Processing segment 2 out of 2
Done assigning calls
Start computing confidence scores
-- Processing segment 1 out of 2
-- Processing segment 2 out of 2
Done computing confidence scores
Start imputing gender
Start computing average log-intensities
-- Processing segment 1 out of 2
-- Processing segment 2 out of 2
Done computing average log-intensities
Error in apply(Sy, 1, function(x) { : dim(X) must have a positive length
Or, if I don't include the gender information:
.............as above
Start computing confidence scores
-- Processing segment 1 out of 2
-- Processing segment 2 out of 2
Done computing confidence scores
Start verifying SNPs on Chromosome Y
Error in callsChrY[, male] : incorrect number of dimensions
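Both messages are what base R reports when an object that is expected to be a matrix has become a plain vector. The sketch below reproduces them outside crlmm with a hypothetical toy matrix `calls_toy`; I have not confirmed that this is what actually happens inside crlmm, so treat it as an assumption.

```r
# Hypothetical toy genotype-call matrix: 3 markers x 2 samples
calls_toy <- matrix(1:6, nrow = 3,
                    dimnames = list(c("m1", "m2", "m3"), c("s1", "s2")))

# Selecting a single row drops the dimensions, leaving a plain vector
one_row <- calls_toy[1, ]
print(is.null(dim(one_row)))

# apply() on a vector fails because dim() is NULL
err_apply <- tryCatch(apply(one_row, 1, sum),
                      error = function(e) conditionMessage(e))

# Matrix-style indexing on a vector fails as well
err_index <- tryCatch(one_row[, 1],
                      error = function(e) conditionMessage(e))

print(err_apply)
print(err_index)

# drop = FALSE keeps the matrix structure, so apply() works again
kept <- calls_toy[1, , drop = FALSE]
row_sums <- apply(kept, 1, sum)
print(row_sums)
```

If the same mechanism is at work here, it would point to a subset that ends up with a single row or a single column, for instance the set of chromosome Y markers.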
Any idea what I am doing wrong?
Many thanks for your help,