Affymetrix ReadAffy() can't read my CEL file
kanacska ▴ 10
@kanacska-7375

Hi!

I'm using R 3.1.2 in RStudio.

I want to use ReadAffy(), but it doesn't work:

> # Read in the CEL files in the directory
> raw <- ReadAffy(compress=gzipped, sampleNames = sNames)
Error in value[[3L]](cond) : row names contain missing values
  AnnotatedDataFrame 'initialize' could not update varMetadata:
  perhaps pData and varMetadata are inconsistent?

I'm working with datasets from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE32474 (the NCI_ADR_RES and OV:OVCAR_8 samples, 3 samples each).

 

> sessionInfo()

R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Hungarian_Hungary.1250  LC_CTYPE=Hungarian_Hungary.1250   
[3] LC_MONETARY=Hungarian_Hungary.1250 LC_NUMERIC=C                      
[5] LC_TIME=Hungarian_Hungary.1250    

attached base packages:
 [1] splines   grid      parallel  stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] pheatmap_0.7.7             affyQCReport_1.44.0       
 [3] simpleaffy_2.42.0          IDPmisc_1.1.17            
 [5] arrayQualityMetrics_3.22.1 Formula_1.2-0             
 [7] survival_2.37-7            lattice_0.20-29           
 [9] gridSVG_1.4-3              genefilter_1.48.1         
[11] Cairo_1.5-6                affyPLM_1.42.0            
[13] preprocessCore_1.28.0      gcrma_2.38.0              
[15] affy_1.44.0                beadarray_2.16.0          
[17] ggplot2_1.0.0              BBmisc_1.9                
[19] Biobase_2.26.0             BiocGenerics_0.12.1       
[21] BiocInstaller_1.16.1      

loaded via a namespace (and not attached):
 [1] acepack_1.3-3.3      affyio_1.34.0        annotate_1.44.0     
 [4] AnnotationDbi_1.28.1 base64_1.1           BeadDataPackR_1.18.0
 [7] Biostrings_2.34.1    checkmate_1.5.1      cluster_2.0.1       
[10] colorspace_1.2-4     DBI_0.3.1            digest_0.6.8        
[13] foreign_0.8-62       GenomeInfoDb_1.2.4   GenomicRanges_1.18.4
[16] gtable_0.1.2         Hmisc_3.15-0         hwriter_1.3.2       
[19] illuminaio_0.8.0     IRanges_2.0.1        latticeExtra_0.6-26 
[22] limma_3.22.4         MASS_7.3-37          munsell_0.4.2       
[25] nnet_7.3-9           plyr_1.8.1           proto_0.3-10        
[28] RColorBrewer_1.1-2   Rcpp_0.11.4          reshape2_1.4.1      
[31] RJSONIO_1.3-0        rpart_4.1-9          RSQLite_1.0.0       
[34] S4Vectors_0.4.0      scales_0.2.4         setRNG_2013.9-1     
[37] stats4_3.1.2         stringr_0.6.2        SVGAnnotation_0.93-1
[40] tools_3.1.2          vsn_3.34.0           XML_3.98-1.1        
[43] xtable_1.7-4         XVector_0.6.0        zlibbioc_1.12.0     

Thanks,

Anna

microarray affy cancer hgu133plus2 package
@james-w-macdonald-5106

You may have some files that got corrupted somehow, as I can process all the files for that GSE:

> getGEOSuppFiles("GSE32474")
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE32nnn/GSE32474/suppl/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  789M  100  789M    0     0  11.6M      0  0:01:08  0:01:08 --:--:-- 7539k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12099  100 12099    0     0   7556      0  0:00:01  0:00:01 --:--:--  7557

> setwd("GSE32474/")
> untar("GSE32474_RAW.tar")

> library(affy)
> dat <- ReadAffy()
>

You might try downloading again to see if that helps. You could also try reading in each file separately, to see if it is just a particular file that is corrupted, and then just download that one.
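
The file-by-file check suggested above could be sketched roughly as follows. This is only a sketch: `find_unreadable()` is a hypothetical helper, not part of affy, and the `reader` argument is an assumption that lets the loop be tried without CEL files at hand. For real data you would pass something like `function(f) affy::ReadAffy(filenames = f)`.

```r
# Hypothetical helper (not part of affy): try to read each file on its own
# and collect the error message for every file that fails.
find_unreadable <- function(files, reader) {
  failures <- character(0)
  for (f in files) {
    msg <- tryCatch({
      reader(f)
      NA_character_            # read succeeded, nothing to report
    }, error = function(e) conditionMessage(e))
    if (!is.na(msg)) failures[f] <- msg
  }
  failures                      # named character vector: file -> error message
}

# For real CEL files (assumes the affy package is installed and the files
# are in the working directory):
# bad <- find_unreadable(affy::list.celfiles(),
#                        function(f) affy::ReadAffy(filenames = f))
```

Any file named in the result is a candidate for downloading again. An empty result means every file could be read individually, which would point away from corrupted files.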


Hi James,

Thank you! 

I will try that!

I didn't mention in my first question that I downloaded only the GSM files containing the data for my two cell lines, not the whole GSE32474 RAW file. Is that a problem?


Ah, I think the error is in your sNames vector:

> dat <- ReadAffy(filenames = dir(".","cel.gz$")[1:6], sampleNames = c(paste0("Sample",1:5), NA))
Error in value[[3L]](cond) : row names contain missing values
  AnnotatedDataFrame 'initialize' could not update varMetadata:
  perhaps pData and varMetadata are inconsistent?

 

So it looks like there is either an NA to begin with, or one or more of the sampleNames you are passing in are getting munged to NA by this code in AllButCelsForReadAffy():

if (length(sampleNames) != length(filenames)) {
    warning("sampleNames not same length as filenames. Using filenames as sampleNames instead\n")
    sampleNames <- sub("^/?([^/]*/)*", "", filenames)
}

So you might want to read the files in without specifying sampleNames at all; if you want different names, you can change them afterwards.
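
Before calling ReadAffy() again, it may help to inspect the names vector directly. The sketch below uses a hypothetical helper, `check_sample_names()`, which is not part of affy; the file names and the replacement sample names are placeholders, not taken from the GEO series.

```r
# Hypothetical helper (not part of affy): report problems in a sampleNames
# vector before it is passed to ReadAffy().
check_sample_names <- function(sample_names, filenames) {
  problems <- character(0)
  if (length(sample_names) != length(filenames)) {
    problems <- c(problems, "length differs from number of files")
  }
  if (anyNA(sample_names)) {
    problems <- c(problems,
                  paste("NA at position(s):",
                        paste(which(is.na(sample_names)), collapse = ", ")))
  }
  if (anyDuplicated(sample_names) > 0) {
    problems <- c(problems, "duplicated names")
  }
  problems                      # empty vector means no problem was found
}

files  <- paste0("GSM", 1:6, ".cel.gz")    # placeholder file names
sNames <- c(paste0("Sample", 1:5), NA)     # same NA as in the example above
check_sample_names(sNames, files)

# Reading first and renaming afterwards (assumes affy is loaded and the CEL
# files are in the working directory; the names are placeholders):
# raw <- ReadAffy()
# sampleNames(raw) <- c("ADR_1", "ADR_2", "ADR_3", "OVCAR8_1", "OVCAR8_2", "OVCAR8_3")
```

Note that the length check quoted above only replaces sampleNames when the lengths differ, so an NA inside a vector of the right length is passed through unchanged.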
