Question

crlmm : copy number and genotyping of Illumina data

Abhishek Pratap ▴ 160

@abhishek-pratap-6167

Last seen 9.6 years ago

Hi Matt,

As it happens, this got put on the back burner earlier and I am getting back to it now. I am getting similar errors trying to do the CNV analysis on Illumina Omni5 idat files. I would be more than happy to share a subset of the data with you. Just wondering if you would have time now to help with this.

Depending on the subset of idat files I choose, I am getting different errors.

Example #1:

    leaving out novariant SNPs
    Start calculating Prior Means
    Error in calculatePriorValues(M, numSNP, verbose) :
      could not find function "makeCluster"

Example #2:

    Instantiate CNSet container.
    Initializing container for genotyping and copy number estimation
    Processing sample stratum 1 of 1
    Quantile normalizing 12 arrays, one at a time.
      |==========================================| 100%
    Calibrating 12 arrays.
      |==================                        |  25%
    Error in chol.default(crossprod(sweep(matS, 1, z[, 1], FUN = "*"), matS)) :
      the leading minor of order 1 is not positive definite
    In addition: Warning messages:
    1: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
      Chips are not of the same type. Skipping 6431_8116121004_R03C01_Grn.idat and 6431_8116121004_R03C01_Red.idat
    2: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
      Chips are not of the same type. Skipping 6440_8116121004_R04C01_Grn.idat and 6440_8116121004_R04C01_Red.idat
    3: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
      Chips are not of the same type. Skipping 6449_8116121005_R01C01_Grn.idat and 6449_8116121005_R01C01_Red.idat

Thanks!
-Abhi

On Tue, Feb 18, 2014 at 8:28 PM, Matt Ritchie <mritchie@wehi.edu.au> wrote:

> Dear Abhi,
>
> You're using the appropriate function. Judging by the warning message, it
> looks like one of your samples is not like the others (i.e. sample
> 6431_8116121004 is not Omni5 version 1b).
>
> Maybe try leaving that one out and re-running? If that doesn't help,
> perhaps you could put the idat files for this test set of 5 arrays online
> so that I can take a closer look.
>
> Best wishes,
>
> Matt
>
> ----- Original Message -----
> From: "Abhishek Pratap" <apratap@sagebase.org>
> To: bioconductor@r-project.org
> Sent: Tuesday, 18 February, 2014 11:01:38 AM
> Subject: [BioC] crlmm : copy number and genotyping of Illumina data
>
> Hi All
>
> I am trying to use the crlmm package to do the genotyping and CNV
> analysis on a set of ~200 samples genotyped on the Illumina Omni5 array.
>
> I tried following the vignette (seems a bit dated) and got some
> errors (see below):
>
> http://www.bioconductor.org/packages/release/bioc/vignettes/crlmm/inst/doc/IlluminaPreprocessCN.pdf
>
> Also, searching a bit more, I found multiple functions in the code of
> crlmm (genotype.Illumina, crlmmIlluminav2, etc.) which seem to be
> doing similar things.
>
> Just wondering if someone can point me to the latest recipe (if any) for
> reading in the idat files (dual channel) and doing the basic genotype
> calling + copy number analysis.
>
> Here is what I have done for a test case (5 arrays):
>
> > cnSet <- genotype.Illumina(sampleSheet = samplesheet[1:5,],
> +                            arrayNames = arrayNames[1:5],
> +                            arrayInfoColNames = list(barcode="SentrixBarcode_A", position="SentrixPosition_A"),
> +                            path = datadir,
> +                            copynumber = T,
> +                            batch = samplesheet$Sample_Group[1:5],
> +                            cdfName = "humanomni5quadv1b",
> +                            call.method = "krlmm",
> +                            verbose=T
> +                            )
> Instantiate CNSet container.
> Initializing container for genotyping and copy number estimation
> reading /work/DAT_118__AML/Analysis/dset1/CNV/data/5396_6298080101_R02C01_Grn.idat
> reading /work/DAT_118__AML/Analysis/dset1/CNV/data/5405_6298080103_R03C01_Grn.idat
> reading /work/DAT_118__AML/Analysis/dset1/CNV/data/5414_6298098003_R04C01_Grn.idat
> reading /work/DAT_118__AML/Analysis/dset1/CNV/data/6431_8116121004_R03C01_Grn.idat
> reading /work/DAT_118__AML/Analysis/dset1/CNV/data/5423_6762372017_R01C01_Grn.idat
> Processing sample stratum 1 of 1
>
> Loading chip annotation information.
> Loading reference normalization information.
> Quantile normalizing 5 arrays, one at a time.
>   |==========================================| 100%
> Loading snp annotation and mixture model parameters.
> Calibrating 5 arrays.
>   |=========================                 |  60%
> Error in quantile.default(M, c(1, 5)/6, names = FALSE) :
>   missing values and NaN's not allowed if 'na.rm' is FALSE
>
> In addition: Warning messages:
> 1: In getProtocolData.Illumina(grnidats, sep = sep, fileExt = fileExt$green, :
>   Chips are not of the same type. Skipping 6431_8116121004_R03C01_Grn.idat
> 2: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
>   Chips are not of the same type. Skipping 6431_8116121004_R03C01_Grn.idat and 6431_8116121004_R03C01_Red.idat
>
> Thanks!
> -Abhi
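For anyone wanting to act on the "try leaving that one out" suggestion, here is a minimal base-R sketch. It pulls the skipped array names out of the "Chips are not of the same type" warnings and builds a logical index for dropping them. The `arrayNames` values below are hypothetical stand-ins assembled from the file names in the logs, and the assumption that array names take the form `<sample>_<barcode>_<position>` is inferred from those file names rather than confirmed by the thread.

```r
# Warning lines copied from the logs in this thread; each names a skipped Grn/Red pair.
warn <- c(
  "Chips are not of the same type. Skipping 6431_8116121004_R03C01_Grn.idat and 6431_8116121004_R03C01_Red.idat",
  "Chips are not of the same type. Skipping 6440_8116121004_R04C01_Grn.idat and 6440_8116121004_R04C01_Red.idat",
  "Chips are not of the same type. Skipping 6449_8116121005_R01C01_Grn.idat and 6449_8116121005_R01C01_Red.idat"
)

# Extract every idat file name mentioned in the warnings.
pattern <- "[0-9]+_[0-9]+_R[0-9]+C[0-9]+_(Grn|Red)\\.idat"
files <- unlist(regmatches(warn, gregexpr(pattern, warn)))

# Strip the channel suffix so Grn and Red collapse to a single array name.
skipped <- unique(sub("_(Grn|Red)\\.idat$", "", files))

# Hypothetical array names (assumption: same naming scheme as the idat files).
arrayNames <- c("5396_6298080101_R02C01", "5405_6298080103_R03C01",
                "5414_6298098003_R04C01", "6431_8116121004_R03C01",
                "5423_6762372017_R01C01")

# Logical index of arrays that were NOT flagged as a different chip type.
keep <- !(arrayNames %in% skipped)
print(arrayNames[keep])

# The same index would then subset every per-sample argument, e.g.
# genotype.Illumina(sampleSheet = samplesheet[keep, ], arrayNames = arrayNames[keep],
#                   batch = samplesheet$Sample_Group[keep], ...)
```

Using one logical index for the sample sheet, the array names, and the batch vector keeps the three in step, which is easy to get wrong when each is subset by hand.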

SNP Annotation Normalization crlmm copynumber • 2.5k views

ADD COMMENT • link updated 9.9 years ago by Matthew Ritchie ▴ 1000 • written 9.9 years ago by Abhishek Pratap ▴ 160

score 0 · Answer 1 · 2014-06-04

Matthew Ritchie ▴ 1000

@matthew-ritchie-650

Last seen 20 months ago

Australia

Dear Abhi,

Can you provide your sessionInfo()?

Thanks,

Matt

----- Original Message -----
From: "Abhishek Pratap" <apratap@sagebase.org>
To: "Matt Ritchie" <mritchie@wehi.edu.au>
Cc: bioconductor@r-project.org
Sent: Tuesday, 3 June, 2014 1:36:20 PM
Subject: Re: [BioC] crlmm : copy number and genotyping of Illumina data

[...]

ADD COMMENT • link 9.9 years ago Matthew Ritchie ▴ 1000

Here it is. Hopefully the formatting is not that bad.

-A

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C            LC_COLLATE=C
 [5] LC_MONETARY=C        LC_MESSAGES=C        LC_PAPER=C           LC_NAME=C
 [9] LC_ADDRESS=C         LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C

attached base packages:
 [1] tools     grid      parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] synapseClient_1.2-1  affy_1.42.2             frma_1.16.0
 [4] GEOquery_2.30.0      Biobase_2.24.0          RColorBrewer_1.0-5
 [7] gplots_2.13.0        samr_2.0                matrixStats_0.8.14
[10] impute_1.38.1        ggplot2_1.0.0           VennDiagram_1.6.5
[13] DESeq2_1.4.5         RcppArmadillo_0.4.300.0 Rcpp_0.11.1
[16] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2      IRanges_1.22.7
[19] BiocGenerics_0.10.0  illuminaio_0.6.0        gdata_2.13.3
[22] ff_2.2-13            bit_1.1-12              humanomni5quadv1bCrlmm_1.0.0
[25] crlmm_1.22.0         preprocessCore_1.26.1   oligoClasses_1.26.0
[28] BiocInstaller_1.14.2

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.26.0 Biostrings_2.32.0 DBI_0.2-7           KernSmooth_2.23-12
 [5] MASS_7.3-33          Matrix_1.1-3      R.methodsS3_1.6.1   RCurl_1.95-4.1
 [9] RJSONIO_1.2-0.2      RSQLite_0.11.4    RcppEigen_0.3.2.1.2 VGAM_0.9-4
[13] XML_3.98-1.1         XVector_0.4.0     affxparser_1.36.0   affyio_1.32.0
[17] annotate_1.42.0      base64_1.1        bitops_1.0-6        caTools_1.17
[21] codetools_0.2-8      colorspace_1.2-4  digest_0.6.4        ellipse_0.3-8
[25] foreach_1.4.2        genefilter_1.46.1 geneplotter_1.42.0  gtable_0.1.2
[29] gtools_3.4.1         iterators_1.0.7   lattice_0.20-29     locfit_1.5-9.1
[33] munsell_0.4.2        mvtnorm_0.9-99992 oligo_1.28.2        plyr_1.8.1
[37] proto_0.3-10         reshape2_1.4      scales_0.2.4        splines_3.1.0
[41] stats4_3.1.0         stringr_0.6.2     survival_2.37-7     xtable_1.7-3
[45] zlibbioc_1.10.0

On Tue, Jun 3, 2014 at 4:59 PM, Matt Ritchie <mritchie@wehi.edu.au> wrote:

> Dear Abhi,
>
> Can you provide your sessionInfo()?
>
> Thanks,
>
> Matt

ADD REPLY • link 9.9 years ago Abhishek Pratap ▴ 160

Hey Matt,

Here is the most recent error I got:

    Instantiate CNSet container.
    Initializing container for genotyping and copy number estimation
    Processing sample stratum 1 of 1
    Quantile normalizing 3 arrays, one at a time.
      |==========================================| 100%
    Calibrating 3 arrays.
      |==========================================| 100%
    Finished preprocessing.
    Preprocessing complete. Begin genotyping...
    Start computing log ratio
     -- Processing segment 1 out of 9
     -- Processing segment 2 out of 9
     -- Processing segment 3 out of 9
     -- Processing segment 4 out of 9
     -- Processing segment 5 out of 9
     -- Processing segment 6 out of 9
     -- Processing segment 7 out of 9
     -- Processing segment 8 out of 9
     -- Processing segment 9 out of 9
    Done computing log ratio
    leaving out novariant SNPs
    Start calculating Prior Means
    Error in checkForRemoteErrors(val) :
      8 nodes produced errors; first error: number of cluster centres must lie between 1 and nrow(x)

On Thu, Jun 5, 2014 at 12:36 PM, Abhishek Pratap <apratap@sagebase.org> wrote:

> Here it is. hopefully the formatting is not that bad
>
> [...]

ADD REPLY • link 9.9 years ago Abhishek Pratap ▴ 160

Hi Abhi,

The issue is likely to do with your small sample size (n=3). With fewer than 8 samples, our new method, which uses k-means (http://www.biomedcentral.com/1471-2105/15/158/abstract), will not be happy. We'll add an error message to the code explaining this.

If adding a few more samples doesn't help, perhaps you can provide us with a subset of your idat files (offline) so that we can have a go at reproducing this error.

Cheers,

Matt

----- Original Message -----
From: "Abhishek Pratap" <apratap@sagebase.org>
To: "Matt Ritchie" <mritchie@wehi.edu.au>
Cc: bioconductor@r-project.org
Sent: Friday, 6 June, 2014 6:19:51 AM
Subject: Re: [BioC] crlmm : copy number and genotyping of Illumina data

[...]
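Matt's suggestion to leave out the mismatched sample can be scripted by pulling the flagged file names out of the warning text. The following is a minimal base-R sketch, not part of crlmm: `warning_text` and `array_names` are illustrative variables filled in from the 5-array test output quoted in this thread, and it assumes that each array name is the idat file name without its `_Grn.idat` or `_Red.idat` suffix.

```r
# Warning text copied from the 5-array test run quoted in this thread.
warning_text <- c(
  "Chips are not of the same type. Skipping 6431_8116121004_R03C01_Grn.idat",
  "Chips are not of the same type. Skipping 6431_8116121004_R03C01_Grn.idat and 6431_8116121004_R03C01_Red.idat"
)

# Array names implied by the idat files read in that run (assumed naming scheme).
array_names <- c("5396_6298080101_R02C01", "5405_6298080103_R03C01",
                 "5414_6298098003_R04C01", "6431_8116121004_R03C01",
                 "5423_6762372017_R01C01")

# Extract every "<name>_Grn.idat" or "<name>_Red.idat" token from the warnings,
# then strip the channel suffix to recover the array name.
idat_files <- unlist(regmatches(warning_text,
                                gregexpr("[^ ]+_(Grn|Red)\\.idat", warning_text)))
skipped <- unique(sub("_(Grn|Red)\\.idat$", "", idat_files))

# Logical index of the arrays that were not flagged.
keep <- !(array_names %in% skipped)
array_names[keep]
```

The same `keep` index could be applied to the sample sheet and the batch vector before calling genotype.Illumina again. Whether dropping the flagged arrays resolves the errors is not confirmed anywhere in this thread.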

9.9 years ago Matthew Ritchie ▴ 1000


Aah, I see. Let me do that and get back to you guys.

Cheers!
-Abhi

On Thu, Jun 5, 2014 at 10:59 PM, Matt Ritchie <mritchie@wehi.edu.au> wrote:

> Hi Abhi,
>
> The issue is likely to do with your small (n=3) sample size (with fewer than 8 samples our new method, which uses k-means - http://www.biomedcentral.com/1471-2105/15/158/abstract - will not be happy). We'll add an error message explaining this to the code.
>
> If adding a few more samples doesn't help, perhaps you can provide us with a subset of your idat files (offline) so that we can have a go at reproducing this error.
>
> Cheers,
>
> Matt
> ------------------------------
> From: "Abhishek Pratap" <apratap@sagebase.org>
> To: "Matt Ritchie" <mritchie@wehi.edu.au>
> Cc: bioconductor@r-project.org
> Sent: Friday, 6 June, 2014 6:19:51 AM
> Subject: Re: [BioC] crlmm : copy number and genotyping of Illumina data
>
> Hey Matt
>
> Here is the most recent error I got
>
> Instantiate CNSet container.
> Initializing container for genotyping and copy number estimation
> Processing sample stratum 1 of 1
> Quantile normalizing 3 arrays, one at a time.
> |===============================================================================================| 100%
> Calibrating 3 arrays.
> |===============================================================================================| 100%
> Finished preprocessing.
> Preprocessing complete. Begin genotyping...
> Start computing log ratio
> -- Processing segment 1 out of 9
> -- Processing segment 2 out of 9
> -- Processing segment 3 out of 9
> -- Processing segment 4 out of 9
> -- Processing segment 5 out of 9
> -- Processing segment 6 out of 9
> -- Processing segment 7 out of 9
> -- Processing segment 8 out of 9
> -- Processing segment 9 out of 9
> Done computing log ratio
> leaving out novariant SNPs
> Start calculating Prior Means
> Error in checkForRemoteErrors(val) :
>   8 nodes produced errors; first error: number of cluster centres must lie between 1 and nrow(x)
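The sample-size limit Matt describes can be checked before a long run starts. The sketch below is only an illustration in base R: `check_sample_count` and `MIN_KRLMM_SAMPLES` are hypothetical names, and the value 8 is taken from Matt's reply rather than from the crlmm source. The second half calls stats::kmeans() on three observations with three centres, which with the default algorithm is expected to be rejected in a similar way to the error quoted above. That krlmm hits exactly this case internally is an assumption.

```r
# Minimum number of arrays, as quoted in Matt's reply (not read from crlmm itself).
MIN_KRLMM_SAMPLES <- 8L

# Hypothetical helper: fail early with a readable message when too few arrays are supplied.
check_sample_count <- function(n_arrays, min_n = MIN_KRLMM_SAMPLES) {
  if (n_arrays < min_n) {
    stop(sprintf("Only %d arrays supplied; at least %d are suggested for call.method = \"krlmm\".",
                 n_arrays, min_n), call. = FALSE)
  }
  invisible(TRUE)
}

# With 3 arrays the guard fires before any genotyping would begin.
guard_msg <- tryCatch(check_sample_count(3L),
                      error = function(e) conditionMessage(e))

# Base-R illustration: k-means on 3 observations asking for 3 centres.
# Any error message is captured rather than allowed to stop the script.
set.seed(1)
x <- matrix(c(-1, 0, 1), ncol = 1)
kmeans_msg <- tryCatch({ kmeans(x, centers = 3); NA_character_ },
                       error = function(e) conditionMessage(e))

guard_msg
kmeans_msg
```

Calling `check_sample_count(nrow(samplesheet))` just before genotype.Illumina would surface the problem in seconds instead of after preprocessing has finished.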

9.9 years ago Abhishek Pratap ▴ 160


Hi Matt

Congrats on the new publication. Unfortunately I am not able to run using 10, 20 arrays. The errors this time are more consistent. Let me upload a subset of this data and send the information to you offline.

Cheers!
-Abhi

Instantiate CNSet container.
Initializing container for genotyping and copy number estimation
Processing sample stratum 1 of 1
Loading chip annotation information.
Loading reference normalization information.
Quantile normalizing 11 arrays, one at a time.
|===========================================================================| 100%
Loading snp annotation and mixture model parameters.
Calibrating 11 arrays.
|==================== | 27%
Error in chol.default(crossprod(sweep(matS, 1, z[, 1], FUN = "*"), matS)) :
  the leading minor of order 1 is not positive definite
In addition: Warning messages:
1: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
  Chips are not of the same type. Skipping 6755_8116121057_R03C01_Grn.idat and 6755_8116121057_R03C01_Red.idat
2: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
  Chips are not of the same type. Skipping 6467_8116121013_R03C01_Grn.idat and 6467_8116121013_R03C01_Red.idat
3: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
  Chips are not of the same type. Skipping 6476_8116121009_R02C01_Grn.idat and 6476_8116121009_R02C01_Red.idat
4: In readIdatFiles(sampleSheet = sampleSheet[sel, ], arrayNames = arrayNames[sel], :
  Chips are not of the same type. Skipping 6485_8116121009_R01C01_Grn.idat and 6485_8116121009_R01C01_Red.idat
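To see which chips the skipped arrays in the 11-array run come from, their names can be split into parts and tabulated. This is a rough base-R sketch under two assumptions that the thread does not confirm: that array names follow a `<sample>_<barcode>_<position>` pattern (the last two matching the SentrixBarcode_A and SentrixPosition_A columns), and that skipped arrays contribute no usable data. The minimum of 8 arrays is the figure from Matt's reply.

```r
# Array names of the four arrays skipped in the 11-array run (from the warnings above).
skipped <- c("6755_8116121057_R03C01", "6467_8116121013_R03C01",
             "6476_8116121009_R02C01", "6485_8116121009_R01C01")

# Split "<sample>_<barcode>_<position>" into its three parts (assumed naming scheme).
parts <- do.call(rbind, strsplit(skipped, "_", fixed = TRUE))
colnames(parts) <- c("sample", "barcode", "position")
skipped_info <- as.data.frame(parts, stringsAsFactors = FALSE)

# Number of skipped arrays per Sentrix barcode.
barcode_counts <- table(skipped_info$barcode)

# Arrays left if the skipped ones contribute no data (an assumption).
n_requested <- 11L
n_usable <- n_requested - nrow(skipped_info)
below_minimum <- n_usable < 8L  # 8 is the minimum quoted in Matt's reply

barcode_counts
n_usable
below_minimum
```

Under those assumptions only 7 of the 11 arrays remain, which is below the quoted minimum. The thread does not establish whether this is related to the chol.default error, so the count is a consistency check rather than a diagnosis.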

9.9 years ago Abhishek Pratap ▴ 160