Can't read CEL files using oligo package
scipio04 • 0
@a6b1f305
Last seen 5 weeks ago
France

Hello, I am getting errors when trying to read CEL files.

I'm sure that I'm in the right directory containing them: I have unzipped a zip file containing 386 .CEL files.

Here's the error message:

• Is 5513234437813052023349_Pmarg70K_A01_FIE9191A1537_S_B16.1.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats

I don't understand the issue here. Thanks for your help!

setwd()  # path to the folder with the CEL files was left out of the post
library(oligo)
celfiles <- list.files(pattern = ".CEL")
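
As a side note on the file listing: list.files() treats its pattern argument as a regular expression, so an unescaped dot matches any character and the match is case-sensitive. The sketch below uses a throwaway directory with made-up file names (they are assumptions for illustration, not files from this thread) to show the difference between a loose and an anchored pattern.

```r
# Build a throwaway directory with made-up file names for illustration.
demo_dir <- tempfile("celdemo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("a.CEL", "b.cel", "xCEL_report.txt")))

# Unescaped, unanchored pattern, as in the question: "." matches any character,
# so a name containing "xCEL" is picked up and lowercase ".cel" is missed.
loose <- list.files(demo_dir, pattern = ".CEL")

# Escaped and anchored pattern; ignore.case = TRUE also picks up ".cel".
strict <- list.files(demo_dir, pattern = "\\.CEL$", ignore.case = TRUE)

print(loose)
print(strict)
```

This would not explain the error in the question by itself, but it helps rule out stray non-CEL files in the list before reading anything.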


What do you want to do with this Affymetrix SNP 6.0 data? My PhD was based on this array back in 2010-12, and I know that it has probes that target both SNP and CN (copy number) variants. In which are you interested? The Affymetrix SNP 6.0 is not for gene expression analysis.


I want to get a matrix in order to do population genomic analysis afterwards.

The answer I received was about the right package to use, but even after I corrected that, I'm still getting the same errors.


No problem. Can you take a look at the second answer that I posted above? - the user was also using oligo but then moved to crlmm. It seems like there is a way to make genotype calls with this package, as per also this publication (see code toward end): Using the R Package crlmm for Genotyping and Copy Number Estimation

From my first answer that I posted, probably Birdsuite (https://www.broadinstitute.org/birdsuite) is the best option to use, for now, in order to derive genotypes.


OK, thank you for your response. I'm going to take a look at the crlmm package and the Birdsuite pipeline.

@james-w-macdonald-5106
Last seen 17 hours ago
United States

The error message indicates that one or more of the celfiles are problematic. These days most celfiles are binary, and if you have one or more that are corrupted (could be just from unzipping), the oligo package (well, actually the affxparser package) won't be able to read it. The problem here is that you don't know if it's just one of your files or all of them. So you could randomly do something like

library(affxparser)


To check them one by one. But that's super boring, and maybe it's just one or two, in which case doing that for 386 files is super duper boring. You can instead use Martin Maechler's tryCatch.W.E to catch errors without actually erroring out, to iterate through your celfiles and see which one(s) are problematic.

tryCatch.W.E <- function(expr)
{
    W <- NULL
    w.handler <- function(w){ # warning handler
        W <<- w
        invokeRestart("muffleWarning")
    }
    list(value = withCallingHandlers(tryCatch(expr, error = function(e) e),
                                     warning = w.handler),
         warning = W)
}
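
The post does not show how the per-file results are collected into z. A minimal sketch, assuming z is simply the result of applying tryCatch.W.E to a reader call for each file, might look like the following. The names check_files, is_problem and demo_reader are hypothetical and only serve the illustration; in practice the reader would be something like affyio::read.celfile.header.

```r
# Martin Maechler's helper, as quoted in this answer.
tryCatch.W.E <- function(expr) {
  W <- NULL
  w.handler <- function(w) { # warning handler
    W <<- w
    invokeRestart("muffleWarning")
  }
  list(value = withCallingHandlers(tryCatch(expr, error = function(e) e),
                                   warning = w.handler),
       warning = W)
}

# Hypothetical wrapper (not from the post): run `reader` on every file and
# keep the captured value/warning pair for each one.
check_files <- function(files, reader) {
  lapply(files, function(f) tryCatch.W.E(reader(f)))
}

# Hypothetical helper: TRUE where the read raised an error or a warning.
is_problem <- function(res) {
  inherits(res$value, "error") || !is.null(res$warning)
}

# Stand-in reader for illustration only: fails on "bad", warns on "odd".
demo_reader <- function(f) {
  if (grepl("bad", f)) stop("cannot read ", f)
  if (grepl("odd", f)) warning("odd header in ", f)
  f
}

demo_files <- c("good.CEL", "bad.CEL", "odd.CEL")
z <- check_files(demo_files, demo_reader)
problem_files <- demo_files[vapply(z, is_problem, logical(1))]
print(problem_files)
```

One design point worth noting: tryCatch.W.E stores a caught error in the value element and only warnings in the warning element, so a check that looks at the warning element alone can come back empty even when every read fails with an error.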

celfiles[sapply(z, function(x) !is.null(x$warning))]

Which will provide a list of the borked celfiles.

Thank you for your comment. Indeed, when I try read.celfile on random files it works, but when I try it on all of them it doesn't. I'm going to try the tryCatch.W.E solution.

tryCatch.W.E returns character(0), so it seems that nothing is wrong. But it's weird: sometimes I can read a single CEL file on its own and store it in a vector, but when I try to use other functions from the oligo or crlmm packages, they display the same error.

Huh. Weird. You wouldn't think something like this would be random - either you can or you cannot read in a file. Where did you get the files?

The files come from https://mydata.ramaciotti.unsw.edu.au/s/96s5HDbb2z83Zn8 (Ramaciotti Centre for Genomics in Sydney). The read.celfiles function can work for some files, but when I use functions from the oligo/crlmm R packages, both errors written above come back. It looks like something is wrong with the file headers, but as I don't know anything about CEL files or binary formats, I can't work it out.

I encountered one of the problematic ones, James, but affyio could read it. Seems to be Axiom.

obj <- affyio::read.celfile('5513234437813052023349_Pmarg70K_A01_FIE9191A1537_S_B16.1.CEL')
str(obj)
List of 6
 $ HEADER      :List of 9
  ..$ cdfName            : chr "Axiom_Pmarg70k"
  ..$ CEL dimensions     : int [1:2] 389 389
  ..$ GridCornerUL       : int [1:2] 0 0
  ..$ GridCornerUR       : int [1:2] 388 0
  ..$ GridCornerLR       : int [1:2] 388 388
  ..$ GridCornerLL       : int [1:2] 0 388
  ..$ DatHeader          : chr ""
  ..$ Algorithm          : chr "HT Image Calibration Cell Generation"
  ..$ AlgorithmParameters: chr "Percentile:75;CellMargin:4;OutlierHigh:1.500000;OutlierLow:1.004000;AlgVersion:;FixedCellSize:TRUE;FullFeatureW"| __truncated__

affy::ReadAffy('5513234437813052023349_Pmarg70K_A01_FIE9191A1537_S_B16.1.CEL')
Error in read.celfile.header(as.character(filenames[[1]])) :
  Is 5513234437813052023349_Pmarg70K_A01_FIE9191A1537_S_B16.1.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats

oligo::read.celfiles('5513234437813052023349_Pmarg70K_A01_FIE9191A1537_S_B16.1.CEL')
Error in read.celfile.header(x) :
  Is 5513234437813052023349_Pmarg70K_A01_FIE9191A1537_S_B16.1.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats

Some functions work and others don't. What a curious thing. I tried with other CEL data files and it works, so the problem obviously comes from my data files.

Meanwhile, I'm exploring the R packages with a CEL data file that works. I can't find the 'mapping250knspCrlmm' package, either on Bioconductor or on CRAN. Do you have any idea how to get it? Many functions don't work without it.

I downloaded all the files, and here are the results

> getwd()
[1] "E:/FIE9191_PMARG70K_2021_RESULTS/FIE9191_PMARG70K_2021_RESULTS"
> dirs <- dir()
> fls <- lapply(dirs, dir, full.names = TRUE)
> fls2 <- do.call(c, fls)
> huh <- lapply(fls2, function(x) tryCatch.W.E(read.celfile(x)$HEADER$cdfName))
## somehow this doesn't do all the files?
> huhhuh <- lapply(fls2[1159:1930], function(x) tryCatch.W.E(read.celfile(x)$HEADER$cdfName))
> huhall <- c(huh, huhhuh)
> badfls <- fls2[sapply(huhall, function(x) is(x$value, "simpleError"))]
> badfls
[1] "FIE9191_Pmarg70K_3348_P13-16_RESULTS/FIE9191_Pmarg70K_3348_Plates13-16_BP_Workflow_QC_rpt.pdf"
[2] "FIE9191_Pmarg70K_3348_P13-16_RESULTS/FIE9191_Pmarg70K_3348_Plates13-16_BP_Workflow_QC_table_rpt.txt"
[3] "FIE9191_Pmarg70k_3349_P17-20_RESULTS/FIE9191_Pmarg70K_3349_Plates17-16_0_Workflow_QC_table_rpt.txt"
[4] "FIE9191_Pmarg70k_3349_P17-20_RESULTS/FIE9191_Pmarg70K_3349_Plates17-20_BP_Workflow_QC_rpt.pdf"
[5] "FIE9191_Pmarg70k_3350_P1-4_RESULTS/FIE9191_Pmarg70K_3350_Plates1-4_BP_Workflow_QC_rpt.pdf"
[6] "FIE9191_Pmarg70k_3350_P1-4_RESULTS/FIE9191_Pmarg70K_3350_Plates1-4_BP_Workflow_QC_table_rpt.txt"
[7] "FIE9191_Pmarg70k_3351_P5-8_RESULTS/FIE9191_Pmarg70K_3351_Plates5-8_BP_Workflow_QC_rpt.pdf"
[8] "FIE9191_Pmarg70k_3351_P5-8_RESULTS/FIE9191_Pmarg70K_3351_Plates5-8_BP_Workflow_QC_table_rpt.txt"
[9] "FIE9191_Pmarg70k_3352_P9-12_RESULTS/FIE9191_Pmarg70K_3352_Plates9-12_BP_Workflow_QC_rpt.pdf"
[10] "FIE9191_Pmarg70k_3352_P9-12_RESULTS/FIE9191_Pmarg70K_3352_Plates9-12_BP_Workflow_QC_table_rpt.txt"
> table(sapply(huhall, function(x) if(!is(x$value, "simpleError")) return(x$value) else return(NA)))

Axiom_Pmarg70k
1920


Apparently you have 1920 Axiom Pyrus 70K SNP arrays, and 10 pdf or txt files. And for some reason read.celfile.header won't read any of them:

> what <- lapply(fls2, function(x) tryCatch.W.E(read.celfile.header(x)))
> sum(sapply(what, function(x) is(x$value, "simpleError")))
[1] 1930


Without fixing that problem, you won't be able to use oligo or crlmm to analyze these data. In addition, the affxparser package can't read these things at all, and that package is based on Affy's Calvin software, so if anything should be able to read them it's affxparser.

Long story short, there is like a 0% likelihood that this will be fixed in Bioconductor. The person who wrote affyio hasn't been involved for maybe 15 years now, and while the two main authors of affxparser are still around, I sort of doubt that getting it to read some Axiom files is near the top of their TODO list. I would recommend using the Affy software to get the genotype calls, and then you can use R for further analysis if you want.


My purpose is to get a matrix from these files. Do you know how to do so?

By Affy software, you mean the affy package, right?


No, by Affy software I mean software provided by Affymetrix. I would imagine the Axiom Analysis Suite is what you want, but it's been years since I've used their software, and it's completely off-topic for this site, so I am afraid you are on your own for that. But perhaps you can get help from Fisher.


I'm trying Axiom Analysis Suite and it looks easy to use for genotype calling,

but the thing is, I'm not sure whether there is a library available for my data type (70K arrays).