Dear Benilton,
Thanks for your email.
First, I would like to thank you for designing and implementing the great
CRLMM algorithm and package.
It gives us better genotyping results, which benefits our downstream
analyses.
Yes, I did find the source of the problem after reading the source
code of CRLMM() and getCrlmmSummaries() and stepping through them in
R's debugging mode. Not an easy task, though... :)
The problem arises when a CEL file name is not a single word, i.e. the
file name contains white space.
getCrlmmSummaries() calls readSummaries() to gather the summary
statistics for alleleA and alleleB. In readSummaries(), the column names
("Colnames") are read with read.table() from the CRLMM output file
"crlmm-calls.txt". If any CEL file name contains white space,
read.table() by default (sep="") splits that name at the white space,
which generates extra, incorrect column names. As a result, the length
of tmp[[2]] is shorter than the length of the output of read.table(),
and R reports this error.
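The behaviour described above can be reproduced in isolation. This is only a minimal sketch: the header line, the file names, and the 3-column matrix are made up for illustration, and it does not use the actual readSummaries() code.

```r
# Hypothetical header line: tab-separated, one CEL file name contains a space.
header <- "SNP\tsample one.CEL\tsample_two.CEL"

# With the default sep = "", read.table() treats any run of spaces or tabs
# as a delimiter, so "sample one.CEL" is split into two fields.
split_ws <- read.table(text = header, header = FALSE, stringsAsFactors = FALSE)
n_names <- ncol(split_ws)  # one more than the three real columns

# A matrix with the true number of columns rejects the over-long name vector.
m <- matrix(0, nrow = 2, ncol = 3)
msg <- tryCatch({
  colnames(m) <- as.character(unlist(split_ws[1, ]))
  "no error"
}, error = function(e) conditionMessage(e))

print(n_names)
print(msg)
```

The message captured in msg is the same kind of dimnames error as in the original report.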
I believe one possible fix is simply to replace read.table() with
read.delim(), since the default separator for read.delim() is "\t",
which rarely appears in file names. Alternatively, adding a note to the
CRLMM vignette would be another easy way to address this issue.
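A minimal sketch of the suggested change, assuming the header of the output file is tab-delimited (the header line below is made up, and this is not the package's actual code):

```r
# Hypothetical header line: tab-separated, one CEL file name contains a space.
header <- "SNP\tsample one.CEL\tsample_two.CEL"

# read.delim() defaults to sep = "\t", so the embedded space is preserved.
by_tab <- read.delim(text = header, header = FALSE, stringsAsFactors = FALSE)
col_names <- as.character(unlist(by_tab[1, ]))

# Passing sep = "\t" to read.table() explicitly splits the line the same way.
same <- read.table(text = header, header = FALSE, sep = "\t",
                   stringsAsFactors = FALSE)

print(col_names)
print(ncol(same))
```

Either call keeps "sample one.CEL" as a single column name, so the number of names matches the number of data columns.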
Thanks,
Ping-Hsun Hsieh
-----Original Message-----
From: Benilton Carvalho [mailto:bcarvalh@jhsph.edu]
Sent: Wednesday, March 04, 2009 8:00 AM
To: Ping-Hsun Hsieh
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] (CRLMM) an issue needs to be clarified
Dear PingHsun,
Unfortunately, I cannot reproduce the problem you report.
I do need to upgrade to 2.8.1, but I doubt this is the source of the
problem.
I genotyped 9 SNP 6.0 samples and ran getCrlmmSummaries(), and this is
what I got (below).
Did you have any success in the meantime?
benilton
--
> y = getCrlmmSummaries("test")
> sessionInfo()
R version 2.8.0 (2008-10-20)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base
other attached packages:
[1] pd.genomewidesnp.6_0.4.2 oligo_1.6.0              oligoClasses_1.5.1
[4] affxparser_1.14.1        AnnotationDbi_1.4.2      preprocessCore_1.4.0
[7] RSQLite_0.7-1            DBI_0.2-4                Biobase_2.2.1
> y
SnpCnvCallSetPlus (storageMode: lockedEnvironment)
assayData: 906600 features, 9 samples
element names: calls, callsConfidence, thetaA, thetaB
phenoData
sampleNames: NA06985_GW6_C.CEL, NA06991_GW6_C.CEL, ...,
NA07034_GW6_C.CEL (9 total)
varLabels and varMetadata description: none
featureData
featureNames: SNP_A-4270094, SNP_A-8282305, ..., SNP_A-8433021
(906600 total)
fvarLabels and fvarMetadata description: none
experimentData: use 'experimentData(object)'
Annotation: pd.genomewidesnp.6
>
b
On Mar 2, 2009, at 3:33 PM, Ping-Hsun Hsieh wrote:
> Dear all,
>
> I got the following error message when trying to use the function
> "getCrlmmSummaries()" to retrieve the results generated by
> successfully running the CRLMM genotyping algorithm on 9 Affy SNP 6.0
> chips.
>
> ####################
>> outObj_crlmm<-getCrlmmSummaries(outDir)
> Error in dimnames(x) <- dn :
> length of 'dimnames' [2] not equal to array extent
>
> Enter a frame number, or 0 to exit
>
> 1: getCrlmmSummaries(outDir)
> 2: new("SnpCnvCallSetPlus", calls = readSummaries("calls", tmpdir),
> callsConfi
> 3: initialize(value, ...)
> 4: initialize(value, ...)
> 5: .local(.Object, ...)
> 6: assayDataNew("lockedEnvironment", calls = calls, callsConfidence
> = callsCon
> 7: readSummaries("alleleA", tmpdir)
> 8: `colnames<-`(`*tmp*`, value = c("MDSNP",
> "02_00004758_CN_080925.CEL", "MDSN
> #####################
>
> My system:
> Linux x86_64 with 16 GB memory.
>
>> sessionInfo()
> R version 2.8.1 (2008-12-22)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
> [1] oligo_1.6.0 oligoClasses_1.4.0 affxparser_1.14.2
> [4] AnnotationDbi_1.4.3 preprocessCore_1.4.0 RSQLite_0.7-1
> [7] DBI_0.2-4 Biobase_2.2.2
>
>
> Any comments are welcome.
> Thanks in advance!
>
> PingHsun Hsieh
>