How to identify corrupt Affy CEL file?
@henrik-bengtsson-4333
United States
Hi,

you're not the first one. A few months ago I transferred a large data set via an external HDD and, like you, it took me a long time to notice that some CEL files were corrupt. Somehow the CEL files were still valid and read just fine; it was just that some probe intensities had ridiculously large values. I used MD5 checksums to identify which files were corrupted. As Seth suggested, the digest() function in the 'digest' package can be used for this.

FYI: In August I will release aroma.affymetrix for analyzing small to very large Affymetrix data sets. Since I was bitten by the above bug, I added methods for generating and validating MD5 checksums for sets of CEL files.

Cheers

Henrik

On 5/31/07, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
> Hi List,
>
> Does anyone know of a package/tool/script that allows checking the integrity of (Affymetrix CEL) files (e.g. by comparing MD5 checksums)?
>
> I am asking because a CEL file unexpectedly became corrupt when we transferred a data set via FTP. Uploaded files are automatically analyzed in our pipeline, and it took us quite some time to find out that the problem was caused by one faulty file out of 16 (and not something else).
>
> > data <- ReadAffy()
> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>   Is D:/Guido/A42_7_Int_ko_wy.CEL really a CEL file? tried reading as text, gzipped text and binary
>
> This is the first time it has happened to us, but I now realize that it would be very useful to check the integrity of the CEL files after a transfer, so that corrupt files can be identified immediately.
>
> Thanks,
> Guido
>
> ------------------------------------------------
> Guido Hooiveld, PhD
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> Wageningen University
> Biotechnion, Bomenweg 2
> NL-6703 HD Wageningen
> the Netherlands
>
> internet: http://nutrigene.4t.com
> email: guido.hooiveld at wur.nl
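
For readers who want to try the MD5 approach described in the reply, here is a minimal sketch. It is not the aroma.affymetrix implementation: the two helper functions and the manifest file name `checksums.md5` are hypothetical names chosen for illustration, and the sketch uses base R's `tools::md5sum()` instead of `digest()` so that it runs without extra packages. With the 'digest' package installed, `digest::digest(path, algo = "md5", file = TRUE)` should give the same checksum for a file.

```r
# Hypothetical helper: record an MD5 checksum for every CEL file in 'dir'.
# Run this on the source machine, before the transfer.
write_md5_manifest <- function(dir, manifest = file.path(dir, "checksums.md5")) {
  files <- list.files(dir, pattern = "\\.cel$", ignore.case = TRUE,
                      full.names = TRUE)
  sums <- tools::md5sum(files)
  df <- data.frame(file = basename(files), md5 = unname(sums),
                   stringsAsFactors = FALSE)
  write.table(df, manifest, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(df)
}

# Hypothetical helper: recompute the checksums after the transfer and
# return the names of files that are missing or whose checksum changed.
verify_md5_manifest <- function(dir, manifest = file.path(dir, "checksums.md5")) {
  expected <- read.table(manifest, header = TRUE, sep = "\t", quote = "",
                         comment.char = "", colClasses = "character")
  actual <- unname(tools::md5sum(file.path(dir, expected$file)))
  # md5sum() returns NA for files that are missing or unreadable
  ok <- !is.na(actual) & actual == expected$md5
  expected$file[!ok]
}
```

A typical use would be to call `write_md5_manifest("celfiles")` before copying the directory (manifest included), then `verify_md5_manifest("celfiles")` at the destination before calling `ReadAffy()`. An empty result means every file matches its recorded checksum. Note that this catches changes during transfer only; a file that was already damaged when the manifest was written will still pass.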