Question

CRLMM: Error in quantile.default(M, c(1, 5)/6, names = FALSE)

komal.rathi ▴ 120

@komalrathi-9163

Last seen 2.4 years ago

United States

Hi everyone,

I am working with the crlmm package for the chip BDCHP-1X10-HUMANHAP550_11218540_C. My code is below:

library(Biobase)
library(crlmm)
library(VanillaICE)
library(lattice)
library(ff)
library(illuminaio)
library("human550v3bCrlmm")
library("human610quadv1bCrlmm")

cnSet <- genotype.Illumina(sampleSheet=samplesheet,
                           arrayNames=arrayNames,
                           arrayInfoColNames=arrayInfo,
                           cdfName="human550v3b",
                           batch=batch)

Instantiate CNSet container.

path arg not set.  Assuming files are in local directory, or that complete path is provided

Initializing container for genotyping and copy number estimation

Processing sample stratum 1 of 4

'path' arg not set.  Assuming files are in local directory, or that complete path is provided in 'arrayNames'

Loading chip annotation information.

Quantile normalizing 100 arrays by 10 strips.

  |======================================================================| 100%

Loading snp annotation and mixture model parameters.

Calibrating 100 arrays.

  |                                                                      |   0%Error in quantile.default(M, c(1, 5)/6, names = FALSE) : 

  missing values and NaN's not allowed if 'na.rm' is FALSE

I tried to find where the genotype.Illumina function calls quantile.default so that I could change the code, but I cannot find it. Has anyone faced this problem?
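For reference, the error itself can be reproduced in base R without crlmm. This is only a sketch on a toy vector (the values in M are made up): quantile() rejects any input containing NA unless na.rm = TRUE, so the message points to missing values in the data reaching that call rather than to a bug in quantile.default. In an interactive session, calling traceback() immediately after the failure should list the chain of calls that led to it.

```r
# Toy stand-in for the vector M named in the error; values are illustrative.
M <- c(0.2, -1.1, NA, 0.7)

# Same call as in the error message; it fails because M contains an NA.
msg <- tryCatch(
  quantile(M, c(1, 5) / 6, names = FALSE),
  error = function(e) conditionMessage(e)
)
print(msg)

# With missing values dropped, the same call succeeds.
q <- quantile(M, c(1, 5) / 6, names = FALSE, na.rm = TRUE)
print(q)
```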

Thanks!

crlmm • 2.7k views

ADD COMMENT • link updated 8.7 years ago by Matthew Ritchie ▴ 1000 • written 9.5 years ago by komal.rathi ▴ 120

Hi,

Has anybody found a solution for this? I'm struggling with exactly the same problem, and it seems like these people have it too:

In the last two links, there are answers suggesting that this is due to small sample sizes, but here the OP has n=100, and in my case, n=36. This is the output I get in R:

thedata <- genotype.Illumina(cdfName="humanomni25quadv1b", call.method='krlmm')
Instantiate CNSet container.
path arg not set.  Assuming files are in local directory, or that complete path is provided
Initializing container for genotyping and copy number estimation
Loading required package: humanomni25quadv1bCrlmm
Welcome to humanomni25quadv1bCrlmm version 1.0.2
Processing sample stratum 1 of 1
'path' arg not set.  Assuming files are in local directory, or that complete path is provided in 'arrayNames'
Loading chip annotation information.
Loading reference normalization information.
Quantile normalizing 36 arrays, one at a time.
  |======================================================================| 100%
Loading snp annotation and mixture model parameters.
Calibrating 36 arrays.
  |                                                                      |   0%Error in quantile.default(M, c(1, 5)/6, names = FALSE) :
  missing values and NaN's not allowed if 'na.rm' is FALSE

ADD REPLY • link 8.8 years ago Aldo • 0

Hi Aldo,

I ran into the same issue with a sample size of 175. I think it's a sample-specific issue, since my job keeps quitting at the 21% mark of "Calibrating ... arrays". Have you resolved this yet? If not, I'm looking into the source of the error, which seems to be thrown in the crlmm:::fitAffySnpMixture56() function.

ADD REPLY • link 8.7 years ago rquevedo • 0

I didn't find a solution. Just moved forward by running GenomeStudio instead. Things seem ok there.

Let me know if you figure this out.

ADD REPLY • link 8.7 years ago Aldo • 0

Hi, did you find a way to solve this problem?

ADD REPLY • link 7.2 years ago junyi • 0

score 0 · Answer 1 · 2017-06-04

Matthew Ritchie ▴ 1000

@matthew-ritchie-650

Last seen 21 months ago

Australia

Apologies for the delay in replying. It's not the most helpful error message, but in my experience it is usually due to a mismatch between the annotation and the chip type, which introduces NAs into the intensity matrix for probes that are not present.

If you send me a few idat files I can try and work out what version you have and the most appropriate package to specify.
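If sharing files is not possible, the chip version can also be inspected locally. The following is a minimal sketch, assuming the illuminaio package is installed and the IDAT files are unencrypted. The helper name idat_summary and its reader argument are hypothetical; ChipType and nSNPsRead are the fields that illuminaio::readIDAT returns in that case.

```r
# Hypothetical helper: tabulate chip type and probe count for each IDAT file.
# 'reader' defaults to illuminaio::readIDAT and is only looked up when used.
idat_summary <- function(files, reader = illuminaio::readIDAT) {
  info <- lapply(files, reader)
  data.frame(
    file      = basename(files),
    chip_type = vapply(info, function(x) as.character(x$ChipType), character(1)),
    n_probes  = vapply(info, function(x) as.integer(x$nSNPsRead), integer(1)),
    stringsAsFactors = FALSE
  )
}
```

Calling idat_summary(list.files(pattern = "\\.idat$")) in the data directory and then table() on the chip_type and n_probes columns should show whether all arrays share one chip type and one probe count.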

ADD COMMENT • link 8.7 years ago Matthew Ritchie ▴ 1000

That's interesting. I'll try looking into that then. Unfortunately, I can't share any idat files since they were downloaded from a protected public dataset: https://www.ebi.ac.uk/ega/studies/EGAS00001000610

Although the authors of this paper (Klijn et al NBT 2015) state that they were assayed on Illumina HumanOmni2.5_4v1 arrays, it may be the case that some samples were assayed with a different array. I am noticing that while the majority of the samples are processed fine, I am getting this error for some specific samples.

**Update**

Thanks for the advice. The error was indeed thrown because of NAs introduced into the intensity matrix. When I looked at each red/green file independently using the illuminaio package, some files were completely mislabelled and didn't match the sampleID at all. Other samples report the $ChipType as BeadChip 4x10, but their $nSNPsRead values differ. Because of this difference in the number of SNPs read, files analyzed later in crlmm get NA for probes that are not in their set. For example, the first file reads in quant data for 2,624,666 SNPs and the second file for 2,623,923 SNPs. Because the second file is appended to the data frame "RG@assayData$R" established from the first file, NA values are filled in for all of its missing SNPs.
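The fill-in behaviour described above can be mimicked on toy data. This is a sketch only: the probe names and intensities are invented, and the matrix R here is a plain stand-in, not crlmm's actual container or code.

```r
# Toy probe intensities for two files; file2 lacks probe "p3".
file1 <- c(p1 = 1200, p2 = 850, p3 = 430)
file2 <- c(p1 = 1100, p2 = 900)

# The container is sized from the first file, then filled by probe name.
R <- matrix(NA_real_, nrow = length(file1), ncol = 2,
            dimnames = list(names(file1), c("file1", "file2")))
R[names(file1), "file1"] <- file1
R[names(file2), "file2"] <- file2

# Count missing intensities per sample to flag the offending array.
na_per_sample <- colSums(is.na(R))
print(na_per_sample)
```

A per-sample NA count like this, run on the real intensity matrix, is one way to spot which arrays differ from the first file.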

Matthew Ritchie, would you happen to know why the number of SNPs would vary from file to file for the same ChipType? During the construction of the RG@assayData$R and $G data structures, the annotation file does not currently appear to play any role. Would it be better to read in the file with the fewest SNPs first so that there are no NAs?
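One caveat with reading the smallest file first, sketched on made-up probe IDs: it only avoids NAs if that file's probes are a subset of every other file's probes. The check below uses base R only; the list probes_by_file is hypothetical and would in practice be built from the probe IDs read from each IDAT file.

```r
# Hypothetical probe IDs per file; in practice these come from each IDAT file.
probes_by_file <- list(
  file1 = c("p1", "p2", "p3"),
  file2 = c("p1", "p2"),
  file3 = c("p1", "p2", "p3")
)

# Probes present in every file.
common <- Reduce(intersect, probes_by_file)

# File with the fewest probes.
n_probes <- lengths(probes_by_file)
smallest <- names(n_probes)[which.min(n_probes)]

# Reading the smallest file first is only safe if all of its probes
# appear in every other file.
is_subset <- all(vapply(
  probes_by_file,
  function(p) all(probes_by_file[[smallest]] %in% p),
  logical(1)
))
```

If is_subset is FALSE, restricting every file to the probes in common would be the safer option.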

Thanks

ADD REPLY • link 8.7 years ago rquevedo • 0