Question: CRLMM: Error in quantile.default(M, c(1, 5)/6, names = FALSE)
3.1 years ago by
komal.rathi80
United States
komal.rathi80 wrote:

Hi everyone,

I am working with crlmm package for the chip BDCHP-1X10-HUMANHAP550_11218540_C. The following is my code:

library(Biobase)
library(crlmm)
library(VanillaICE)
library(lattice)
library(ff)
library(illuminaio)
library("human550v3bCrlmm")

cnSet <- genotype.Illumina(sampleSheet=samplesheet,
                           arrayNames=arrayNames,
                           arrayInfoColNames=arrayInfo,
                           cdfName="human550v3b",
                           batch=batch)

Instantiate CNSet container.

path arg not set.  Assuming files are in local directory, or that complete path is provided

Initializing container for genotyping and copy number estimation

Processing sample stratum 1 of 4

'path' arg not set.  Assuming files are in local directory, or that complete path is provided in 'arrayNames'

Quantile normalizing 100 arrays by 10 strips.

|======================================================================| 100%

Calibrating 100 arrays.

|                                                                      |   0%

Error in quantile.default(M, c(1, 5)/6, names = FALSE) :

missing values and NaN's not allowed if 'na.rm' is FALSE

I tried to find where genotype.Illumina calls quantile.default so that I could change the code, but I cannot find it. Has anyone else run into this problem?

Thanks!


Hi,

Has anybody found a solution for this? I'm struggling with exactly the same problem, and it seems these people have it too:

In the last two links, there are answers suggesting that this is due to small sample sizes, but here the OP has n=100, and in my case, n=36. This is the output I get in R:

thedata <- genotype.Illumina(cdfName="humanomni25quadv1b", call.method='krlmm')
Instantiate CNSet container.
path arg not set.  Assuming files are in local directory, or that complete path is provided
Initializing container for genotyping and copy number estimation
Processing sample stratum 1 of 1
'path' arg not set.  Assuming files are in local directory, or that complete path is provided in 'arrayNames'
Quantile normalizing 36 arrays, one at a time.
|======================================================================| 100%
Calibrating 36 arrays.
|                                                                      |   0%
Error in quantile.default(M, c(1, 5)/6, names = FALSE) :
missing values and NaN's not allowed if 'na.rm' is FALSE



Hi Aldo,

I ran into the same issue with a sample size of 175. I think it's a sample-specific issue, since my job keeps quitting at the 21% mark of "Calibrating ... arrays". Have you resolved this issue yet? If not, I'm looking into the source of the error, which seems to be thrown in the crlmm:::fitAffySnpMixture56() function.
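
One way to test the sample-specific hypothesis, sketched here under the assumption that the raw intensities are available as an ordinary numeric matrix with one column per array (crlmm may store them differently, for example as ff objects), is to count missing values per column. The helper name `arrays_with_na` and the toy values are made up for illustration.

```r
# Count missing intensities per array (column) and keep only affected arrays.
# `intensities` is assumed to be a plain numeric matrix, probes x arrays.
arrays_with_na <- function(intensities) {
  n_missing <- colSums(is.na(intensities))
  n_missing[n_missing > 0]
}

# Toy example with made-up values: only array "s2" has a missing probe.
toy <- cbind(s1 = c(100, 220, 310),
             s2 = c(110, NA, 300),
             s3 = c(90, 215, 305))
bad <- arrays_with_na(toy)
print(bad)
```

Any array listed by such a check would be a candidate for the one that stops the calibration step.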

I didn't find a solution. Just moved forward by running GenomeStudio instead. Things seem ok there.

Let me know if you figure this out.

Hi, did you find a way to solve this problem?
Answer: CRLMM: Error in quantile.default(M, c(1, 5)/6, names = FALSE)
2.3 years ago by
Australia
Matthew Ritchie750 wrote:

Apologies for the delay in replying. It's not the most helpful error message, but in my experience it is usually due to a mismatch between the annotation package and the chip type, which introduces NAs into the intensity matrix for probes that are not present.

If you send me a few idat files I can try and work out what version you have and the most appropriate package to specify.
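
A rough way to look for such a mismatch before genotyping is to tabulate the metadata stored in each IDAT file. The sketch below assumes that `illuminaio::readIDAT()` returns a list with `$ChipType` and `$nSNPsRead` for these files (the two fields reported later in this thread); other IDAT variants may not carry them. The helper names `summarise_idats` and `flag_odd_idats` and the `reader` argument are hypothetical and exist only for this example.

```r
# Summarise the chip metadata stored in each IDAT file.
# `reader` defaults to illuminaio::readIDAT; it is an argument only so the
# helper can be tried out without real files.
summarise_idats <- function(files, reader = illuminaio::readIDAT) {
  rows <- lapply(files, function(f) {
    idat <- reader(f)
    data.frame(file = basename(f),
               ChipType = as.character(idat$ChipType),
               nSNPsRead = as.integer(idat$nSNPsRead),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Return the files whose probe count differs from the most common count.
flag_odd_idats <- function(info) {
  counts <- table(info$nSNPsRead)
  modal <- as.integer(names(counts)[which.max(counts)])
  info[info$nSNPsRead != modal, , drop = FALSE]
}
```

With real data this might be called as `flag_odd_idats(summarise_idats(list.files(pattern = "_Grn\\.idat$")))`, where the file name pattern is also an assumption. Files that come back flagged, or that show an unexpected `ChipType`, are worth checking against the `cdfName` passed to `genotype.Illumina`.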

That's interesting.  I'll try looking into that then.  Unfortunately, I can't share any idat files since they were downloaded from a protected public dataset: https://www.ebi.ac.uk/ega/studies/EGAS00001000610

Although the authors of this paper (Klijn et al NBT 2015) state that the samples were assayed on Illumina HumanOmni2.5_4v1 arrays, it may be the case that some samples were assayed with a different array. I am noticing that while the majority of the samples are processed fine, I am getting this error for some specific samples.

**Update**

Thanks for the advice. The error was in fact thrown by NAs introduced into the intensity matrix. When I looked at each red/green file independently using the illuminaio package, some files were completely mislabelled and didn't match the sampleID at all. Other samples report $ChipType as BeadChip 4x10, but their $nSNPsRead values differ. Because of the difference in the number of SNPs read, files analyzed subsequently in crlmm get NA filled in for probes that are not included in their set. For example, the first file reads in quant data for 2,624,666 SNPs and the second file reads in quant data for 2,623,923 SNPs. Because the second file is appended to the data frame RG@assayData$R established from the first file, NA values are filled in for all of its missing SNPs.

Matthew Ritchie, would you happen to know why the number of SNPs would vary from file to file for the same ChipType? During the construction of the RG@assayData$R and RG@assayData$G data structures, the annotation file currently doesn't appear to matter at all. Would it be better to set the first file read in to be the one with the fewest SNPs, so that there will be no NAs?

Thanks
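
The mechanism described in the update can be reproduced with a toy example. This is only an illustration with made-up probe IDs and intensities, not the code crlmm uses to build RG@assayData: it shows how aligning every file to the first file's probe IDs leaves NA for absent probes, how `quantile()` then fails, and how restricting to the probes shared by all files avoids the NAs.

```r
# Toy intensities for two files; probe IDs and values are made up.
file1 <- c(p1 = 100, p2 = 220, p3 = 310, p4 = 95)
file2 <- c(p1 = 110, p2 = 205, p4 = 90)   # probe p3 is absent from this file

# Aligning every file to the first file's probe IDs leaves NA for p3.
ids <- names(file1)
R_first <- cbind(s1 = file1[ids], s2 = file2[ids])
rownames(R_first) <- ids

# quantile() refuses NA input unless na.rm = TRUE, as in the thread.
msg <- tryCatch(quantile(R_first[, "s2"], c(1, 5)/6, names = FALSE),
                error = function(e) conditionMessage(e))
print(msg)

# Restricting to probes present in every file avoids the NAs.
common <- Reduce(intersect, list(names(file1), names(file2)))
R_common <- cbind(s1 = file1[common], s2 = file2[common])
print(anyNA(R_common))
```

Whether dropping probes like this is acceptable for a real analysis depends on whether the annotation package expects a fixed feature set, so it is better treated as a diagnostic than as a fix.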