Question

CRLMM on Illumina Human1Mv1_C chips

René ▴ 30

@rene-5748

Last seen 6.9 years ago

Netherlands

Dear all,

I am trying to analyze some Illumina beadarray SNP data (human1mv1c) using crlmm and I keep running into the error shown below. Since my previous attempts on a second, independent cohort (using the human1mduov3b chip) succeeded, my guess is that the problem resides somewhere in the analysis setup and not the function itself. Furthermore, I saw an earlier post with the same issue and error, which was due to the fact that unsupported Illumina GoldenGate chips were used:

Crlmm package

Another post with a similar error was inconclusive:

crlmm : copy number and genotyping of Illumina data

Thank you in advance, any help is greatly appreciated.

library(crlmm)
library(ff)
sampleSXS = read.csv("SXS_sample_map.csv",header=TRUE, as.is=TRUE)
sampleSXS[1:5,]
  Sample_ID SentrixBarcode SentrixPosition Gender
  E320-45N 4040350100 A M
  E320-45T 4040350189 A M
  E321-39N 4040350200 A M
  E321-39T 4040350169 A M
  E323-08N 4040350176 A M

arrayInfo = list(barcode=NULL, position="SentrixPosition")
cdfName = "human1mv1c"
batch = rep("1", nrow(sampleSXS))
arrayNames = paste(sampleSXS[,2],sampleSXS[,3],sep="_")
cnSetSXS = genotype.Illumina(sampleSheet=sampleSXS, arrayNames=arrayNames, arrayInfoColNames=arrayInfo,cdfName=cdfName,batch=batch)

Instantiate CNSet container.
path arg not set.  Assuming files are in local directory, or that complete path is provided
Initializing container for genotyping and copy number estimation
Processing sample stratum 1 of 1
path arg not set.  Assuming files are in local directory, or that complete path is provided
Quantile normalizing 24 arrays by 20 strips.
  |======================================================================| 100%
Calibrating 24 arrays.
  |                                                                      |   0%Error in chol.default(crossprod(sweep(matS, 1, z[, 1], FUN = "*"), matS)) :
  the leading minor of order 1 is not positive definite


sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] human1mv1cCrlmm_1.0.3    human1mduov3bCrlmm_1.0.4 ff_2.2-13               
[4] bit_1.1-12               crlmm_1.24.0             preprocessCore_1.28.0   
[7] oligoClasses_1.28.0     

loaded via a namespace (and not attached):
 [1] affyio_1.34.0        base64_1.2           Biobase_2.26.0      
 [4] BiocGenerics_0.12.1  BiocInstaller_1.16.1 Biostrings_2.34.1   
 [7] codetools_0.2-10     ellipse_0.3-8        foreach_1.4.2       
[10] GenomeInfoDb_1.2.4   GenomicRanges_1.18.4 grid_3.1.2          
[13] illuminaio_0.8.0     IRanges_2.0.1        iterators_1.0.7     
[16] lattice_0.20-30      Matrix_1.1-5         matrixStats_0.14.0  
[19] mvtnorm_1.0-2        parallel_3.1.2       Rcpp_0.11.4         
[22] RcppEigen_0.3.2.3.0  S4Vectors_0.4.0      splines_3.1.2       
[25] stats4_3.1.2         tools_3.1.2          VGAM_0.9-6          
[28] XVector_0.6.0        zlibbioc_1.12.0

crlmm illumina beadarraysnp • 2.1k views

ADD COMMENT • link updated 9.8 years ago by Matthew Ritchie ▴ 1000 • written 9.8 years ago by René ▴ 30

score 1 · Answer 1 · 2015-03-05

Hi Rene,

I just tried running Human 1Mv1c HapMap data through crlmm and it seemed to work OK. I don't have idats for this platform, so I used output from GenomeStudio (see the commands below). If you have GenomeStudio output, you could try running similar commands and see if that works. Note that you need to have 'X Raw' and 'Y Raw' available in the file: these are the values extracted from the idat files via the Red and Green channels and then matched up to SNP IDs to get X and Y intensities.
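Before running the import, it may help to confirm that the report actually contains the required columns. The following is a minimal sketch in base R: the column names are the ones passed to readGenCallOutput() in the commands below, while the helper name `missing_report_columns` and the example header line are hypothetical and only for illustration.

```r
# Column names taken from the readGenCallOutput() call in this thread.
required_cols <- c("Sample Name", "SNP Name", "X Raw", "Y Raw")

# Hypothetical helper: returns the required columns that are absent
# from a delimited header line of a GenomeStudio report.
missing_report_columns <- function(header_line,
                                   required = required_cols,
                                   sep = ",") {
  present <- trimws(strsplit(header_line, sep, fixed = TRUE)[[1]])
  setdiff(required, present)
}

# Made-up header line that lacks the 'Y Raw' column.
header_line <- "SNP Name,Sample Name,GC Score,X,Y,X Raw"
missing_report_columns(header_line)
```

An empty result means all four columns are present. In practice the header line would come from the report itself, for example via readLines() on the first lines of the file.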

If you'd like to take a look at this example data set, send me an email off list and I'll put it online somewhere for you to pick up.

Best wishes,

Matt

library(crlmm)
library(ff)
XY = readGenCallOutput("HumanHap1MDuo_FinalReport_Norm.csv", cdfName="human1mv1c",
                        colnames=list(SampleID = "Sample Name", SNPID="SNP Name", 
                        XRaw = "X Raw", YRaw = "Y Raw"))
calls = genotype.Illumina(XY=XY, cdfName="human1mv1c")
Instantiate CNSet container.
Initializing container for genotyping and copy number estimation
Loading required package: human1mv1cCrlmm
Welcome to human1mv1cCrlmm version 1.0.3
Processing sample stratum 1 of 1
Quantile normalizing 39 arrays by 20 strips.
  |======================================================================| 100%
Loading snp annotation and mixture model parameters.
Calibrating 39 arrays.
  |======================================================================| 100%
Finished preprocessing.
Preprocessing complete.  Begin genotyping...
Calling 1047391 SNPs for recalibration...
Loading annotations.
Using 20000 SNPs on chrom X and Y to assign gender.
Imputing gender
finished process1
Done.
Estimating recalibration parameters.
Filling out empty centers..........................................
Calculating and standardizing size of shift... OK
Calling 1047391 SNPs... Done with process2
Done.
Genotyping finished.

sessionInfo()

R version 3.1.1 Patched (2014-10-16 r66782)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] human1mv1cCrlmm_1.0.3 ff_2.2-13             bit_1.1-12
[4] crlmm_1.24.0          preprocessCore_1.28.0 oligoClasses_1.28.0

loaded via a namespace (and not attached):
 [1] affyio_1.34.0        base64_1.1           Biobase_2.26.0
 [4] BiocGenerics_0.12.1  BiocInstaller_1.16.1 Biostrings_2.34.1
 [7] codetools_0.2-9      ellipse_0.3-8        foreach_1.4.2
[10] GenomeInfoDb_1.2.4   GenomicRanges_1.18.4 grid_3.1.1
[13] illuminaio_0.8.0     IRanges_2.0.1        iterators_1.0.7
[16] lattice_0.20-29      Matrix_1.1-4         matrixStats_0.14.0
[19] mvtnorm_1.0-2        parallel_3.1.1       Rcpp_0.11.4
[22] RcppEigen_0.3.2.4.0  S4Vectors_0.4.0      splines_3.1.1
[25] stats4_3.1.1         tools_3.1.1          VGAM_0.9-6
[28] XVector_0.6.0        zlibbioc_1.12.0

score 0 · Answer 2 · 2015-03-03

Matthew Ritchie ▴ 1000

@matthew-ritchie-650

Last seen 7 months ago

Australia

Hi Rene,

From memory, this error can occur when there is a mismatch between the actual chip type used and the platform specified.

I think the human1mv1c is a pretty old platform - is it possible your data is humanomni1quad?
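For intuition about the message itself, here is a minimal base R sketch of one way the expression in the error can fail. It assumes (this is not taken from the crlmm source) that `matS` is a design matrix and `z` holds per-row weights; the values are made up. If every weight is zero, which might plausibly happen when no probes line up with the annotation, the cross-product is a zero matrix and chol() cannot factor it.

```r
# Hypothetical stand-ins for the objects named in the error message.
matS   <- cbind(1, seq_len(5))                         # small full-rank design matrix
z_ok   <- matrix(c(0.2, 0.5, 0.9, 0.4, 0.7), ncol = 1) # positive weights
z_zero <- matrix(0, nrow = 5, ncol = 1)                # all weights zero

weighted_chol <- function(matS, z) {
  # Same expression as in the reported error.
  chol(crossprod(sweep(matS, 1, z[, 1], FUN = "*"), matS))
}

# Positive weights give a positive definite matrix, so this succeeds.
R_ok <- weighted_chol(matS, z_ok)

# All-zero weights give a zero matrix; capture the error message.
res_zero <- tryCatch(weighted_chol(matS, z_zero),
                     error = function(e) conditionMessage(e))
res_zero
```

The exact wording of the captured message varies slightly between R versions, but it refers to the leading minor of order 1, as in the output above.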

Cheers,

Matt

ADD COMMENT • link 9.8 years ago Matthew Ritchie ▴ 1000

Hi Matt,

That is also what I had in mind at first. However, the files I have are indeed pretty old (early 2008), and the sample sheet specifies the platform as human1mv1c. Do you know of any other way to test which chip has been used, or of other potential causes for this error?

Best,

René

ADD REPLY • link 9.8 years ago René ▴ 30

Hi Rene,

If you have access to a .sdf file that accompanies the data, that might give you some hints as to what platform it is. In my experience they don't tend to have an explicit <ChipType> tag, but they do have lots of information, such as the number of distinct probes and the dimensions of the bead grid, that might help you match or exclude certain chip types.

Mike

ADD REPLY • link 9.8 years ago Mike Smith ★ 6.6k

Hi Mike,

I do have an .sdf file available. However, I doubt that I can make sense of the grid size, as I am not that experienced with SNP arrays and their characteristics. The only 'useful' information I saw in the file is shown below; if you have any idea which chip type this corresponds to, please let me know.

René

<PhysicalProperties>
    <PhysicalWidth>25000</PhysicalWidth>
    <PhysicalHeight>82500</PhysicalHeight>
    <SectionWidth>7080</SectionWidth>
    <SectionHeight>1650</SectionHeight>
    <SampleWidth>17375</SampleWidth>
    <SampleHeight>55570</SampleHeight>
...
<ManufacturingProperties>
    <FlowCell>ILD</FlowCell>
    <PartNumber>206666</PartNumber>
    <Rev>C</Rev>
    <Description>slide 1x40 3.41,5.2</Description>
    <Platform>ILD</Platform>
    <Type>Slide</Type>
...
<Name>BeadChip 1x40 66</Name>
<Comment>3.445 um gap</Comment>
<Class>Slide</Class>
<AssayType>Infinium II-40</AssayType>
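Fields like the ones quoted above can also be pulled out programmatically, which makes it easier to compare several .sdf files side by side. This is a minimal base R sketch: the helper name `extract_sdf_tags` is hypothetical, the sample lines are copied from the excerpt above, and it assumes each tag opens and closes on a single line.

```r
# Lines copied from the .sdf excerpt in this post.
sdf_lines <- c(
  "    <PartNumber>206666</PartNumber>",
  "    <Rev>C</Rev>",
  "    <Description>slide 1x40 3.41,5.2</Description>",
  "<Name>BeadChip 1x40 66</Name>",
  "<AssayType>Infinium II-40</AssayType>"
)

# Hypothetical helper: return the first value found for each tag,
# or NA when the tag does not occur in the supplied lines.
extract_sdf_tags <- function(lines, tags) {
  vapply(tags, function(tag) {
    pattern <- sprintf("<%s>([^<]*)</%s>", tag, tag)
    hits <- regmatches(lines, regexec(pattern, lines))
    values <- vapply(Filter(length, hits), `[`, character(1), 2)
    if (length(values) == 0) NA_character_ else values[1]
  }, character(1))
}

sdf_info <- extract_sdf_tags(sdf_lines, c("PartNumber", "AssayType", "ChipType"))
sdf_info
```

For a real file, `sdf_lines` would come from readLines() on the .sdf path. A tag that is absent, such as ChipType here, comes back as NA rather than raising an error.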

ADD REPLY • link 9.8 years ago René ▴ 30