crlmm and cluster centres error
@ricardo-vidal-4261
Last seen 9.6 years ago
Hi,

As I get acquainted with R and Bioconductor, I've been trying to look at some SNP6 microarrays using crlmm and have been running into many speed bumps.

I'm trying to look at two CEL files (samples) and, considering it is a small number of samples, I assumed the bare basic example from the manual would be sufficient, but I run into the following problem:

"Error: number of cluster centres must lie between 1 and nrow(x)"

Where would I define the cluster centres? Is this an issue with the CEL files?

Any help is much appreciated.

Best,
Ricardo

> library(crlmm)
> path <- "input/dna/"
> require(oligoClasses)
> library(hapmapsnp6)
> celFiles <- list.celfiles(path, full.names=TRUE)
> system.time(clrmmResult <- crlmm(celFiles, verbose = FALSE))
Welcome to genomewidesnp6Crlmm version 1.0.2
Error: number of cluster centres must lie between 1 and nrow(x)
Timing stopped at: 62.81 1.45 64.25

> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] genomewidesnp6Crlmm_1.0.2 hapmapsnp6_1.3.3
[3] ff_2.1-2                  bit_1.1-4
[5] crlmm_1.6.5               oligoClasses_1.10.0
[7] Biobase_2.8.0

loaded via a namespace (and not attached):
 [1] affyio_1.16.0         annotate_1.26.1       AnnotationDbi_1.10.2
 [4] Biostrings_2.16.9     DBI_0.2-5             ellipse_0.3-5
 [7] genefilter_1.30.0     IRanges_1.6.17        mvtnorm_0.9-92
[10] preprocessCore_1.10.0 RSQLite_0.9-2         splines_2.11.1
[13] survival_2.35-8       xtable_1.5-6
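Editorial note: the wording of this error appears to match a message raised by kmeans() in base R's stats package, which would mean crlmm() has no "cluster centres" argument for the user to set. A plausible reading, and it is only an assumption here, is that crlmm runs a two-centre k-means internally (for example to predict sex when none is supplied) and that two samples are too few for that step. The minimal sketch below uses only base R to show how kmeans() reacts to such requests; the helper name kmeans_error and the intensity values are made up for illustration.

```r
# Try k-means and return the error message, or NA if the call succeeds.
kmeans_error <- function(x, centers) {
  tryCatch({
    kmeans(x, centers = centers)
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# Two observations (think: one summary value per sample), two requested centres.
two_samples <- matrix(c(10.2, 11.7), ncol = 1)
print(kmeans_error(two_samples, centers = two_samples))

# More centres than rows can never be satisfied.
print(kmeans_error(two_samples, centers = 5))

# Six observations in two well-separated groups, two centres: a valid request.
six_samples <- matrix(c(1, 1.1, 0.9, 5, 5.1, 4.9), ncol = 1)
print(kmeans_error(six_samples, centers = matrix(c(1, 5), ncol = 1)))
```

The exact message text may differ between R versions, so the sketch prints whatever kmeans() reports rather than assuming a particular string.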
crlmm crlmm • 1.6k views
@benilton-carvalho-1375
Last seen 4.1 years ago
Brazil/Campinas/UNICAMP
Hi Ricardo,

using the sample dataset in 'hapmapsnp6' works just fine for me. This is a tiny set with 3 samples. Even if I try it with 1 sample, the software behaves as expected...

It is possible this is an issue with the CEL files...

b

> system.time(clrmmResult <- crlmm(celFiles[1], verbose = FALSE))
   user  system elapsed
 88.439   1.300  89.744
Warning message:
In crlmmGT(res[["A"]], res[["B"]], res[["SNR"]], res[["mixtureParams"]], :
  Recalibration not possible. Possible cause: small sample size.

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] genomewidesnp6Crlmm_1.0.2 hapmapsnp6_1.3.3
[3] crlmm_1.6.5               oligoClasses_1.10.0
[5] Biobase_2.8.0

loaded via a namespace (and not attached):
 [1] affyio_1.16.0         annotate_1.26.1       AnnotationDbi_1.10.2
 [4] Biostrings_2.16.9     bit_1.1-4             DBI_0.2-5
 [7] ellipse_0.3-5         ff_2.1-2              genefilter_1.30.0
[10] IRanges_1.6.17        mvtnorm_0.9-92        preprocessCore_1.10.0
[13] RSQLite_0.9-2         splines_2.11.1        survival_2.35-8
[16] tools_2.11.1          xtable_1.5-6

On 14 September 2010 19:55, Ricardo Vidal <rvidal at gmail.com> wrote:
> "Error: number of cluster centres must lie between 1 and nrow(x)"
>
> Where would I define the cluster centres? Is this an issue with the CEL files?
With the sample data it seems to work here too. However, with my 2 CEL files, not so much.

I've only managed to get anywhere with my CEL files if I play around with other parameters, like setting gender=FALSE and setting batch=c("resistant", "sensitive"), which doesn't make sense to me but works as a vector with the same length as the sample size (i.e. length(celFiles)).

Just following the example, it doesn't seem to work. I tried genotype() this time round and only with the changes I mentioned above did I manage to get anywhere...

I'll try with other CEL files and see if I get different results.

Thanks,
Ricardo

On 2010-09-14, at 3:41 PM, Benilton Carvalho wrote:
> using the sample dataset in 'hapmapsnp6' works just fine for me. This
> is a tiny set with 3 samples. Even if I try it with 1 sample, the
> software behaves as expected...
>
> It is possible this is an issue with the CEL files...
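Editorial note: the workaround described above suggests that the failure sits in the step that infers sex from the data, and that supplying this information up front avoids it. As far as I understand the crlmm documentation, crlmm() accepts a gender argument: an integer vector with one entry per CEL file, coded 1 for male and 2 for female. The sketch below is written under that assumption. The helper encode_gender is hypothetical, and the sex labels are placeholders that must be replaced with the recorded sex of each sample.

```r
# Hypothetical helper: map sex labels to the integer codes that the crlmm
# 'gender' argument is assumed to expect (1 = male, 2 = female).
encode_gender <- function(sex, n_samples) {
  stopifnot(length(sex) == n_samples)
  codes <- c(male = 1L, female = 2L)[tolower(sex)]
  if (any(is.na(codes))) stop("sex labels must be 'male' or 'female'")
  unname(codes)
}

# Same directory as in the thread; returns character(0) if it does not exist.
celFiles <- list.files("input/dna", pattern = "\\.CEL$",
                       full.names = TRUE, ignore.case = TRUE)

# Placeholder labels only: replace with the known sex of each sample.
sex <- rep("female", length(celFiles))
gender <- encode_gender(sex, length(celFiles))

# Run the genotyping only when the package and the files are available.
if (length(celFiles) > 0 && requireNamespace("crlmm", quietly = TRUE)) {
  crlmmResult <- crlmm::crlmm(celFiles, gender = gender, verbose = FALSE)
}
```

Passing known sex rather than a guess matters, since the codes presumably influence how chromosome X SNPs are called.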
I've just tried the same thing with two different CEL files and am getting the same problem. All files are Affymetrix SNP6. The only difference I can see is that the celFiles names are relative to the current working directory (getwd()) and not the full system path like the example in the documentation.

The details again below... Any help would be highly appreciated.

> library(crlmm)
> path <- "input/dna/"
> ??system.file
> celFiles <- list.celfiles(path, full.names=TRUE)
> celFiles
[1] "input/dna/GW6_022610H_JS1_1304-S.CEL"
[2] "input/dna/GW6_022610H_JS5_1776-R.CEL"
> system.time(crlmmResult <- crlmm(filenames=celFiles, verbose=FALSE))
Welcome to genomewidesnp6Crlmm version 1.0.2
Error: number of cluster centres must lie between 1 and nrow(x)
Timing stopped at: 60.79 1.57 62.37

> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] genomewidesnp6Crlmm_1.0.2 crlmm_1.6.5
[3] oligoClasses_1.10.0       Biobase_2.8.0

loaded via a namespace (and not attached):
 [1] affyio_1.16.0         annotate_1.26.1       AnnotationDbi_1.10.2
 [4] Biostrings_2.16.9     bit_1.1-4             DBI_0.2-5
 [7] ellipse_0.3-5         ff_2.1-2              genefilter_1.30.0
[10] IRanges_1.6.17        mvtnorm_0.9-92        preprocessCore_1.10.0
[13] RSQLite_0.9-2         splines_2.11.1        survival_2.35-8
[16] tools_2.11.1          xtable_1.5-6

Kindly,
Ricardo

On 2010-09-14, at 3:48 PM, Ricardo Vidal wrote:
> With the sample data it seems to work here too. However, with my 2 CEL files, not so much.
>
> I'll try with other CEL files and see if I get different results.
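Editorial note: relative paths are probably not the cause here, since the run proceeds for about a minute before failing, which suggests the files were found and read. Still, the suspicion is cheap to rule out. The sketch below uses only base R; check_cel_paths is a hypothetical helper, not part of crlmm or oligoClasses.

```r
# Hypothetical helper: resolve each path to absolute form and report whether
# the file exists and how large it is (NA size for a missing file).
check_cel_paths <- function(paths) {
  data.frame(
    path = normalizePath(paths, mustWork = FALSE),
    exists = file.exists(paths),
    size_bytes = file.info(paths)$size,
    stringsAsFactors = FALSE
  )
}

# Same directory as in the thread; returns character(0) if it does not exist.
celFiles <- list.files("input/dna", pattern = "\\.CEL$",
                       full.names = TRUE, ignore.case = TRUE)
print(check_cel_paths(celFiles))
```

If every row reports exists = TRUE with a plausible size, passing normalizePath(celFiles) to crlmm() instead of the relative names would be a quick way to confirm that the paths make no difference.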