Cannot load CEL files
josejotero • 0
@josejotero-14263
Last seen 7.2 years ago

I cannot open .CEL files without getting an error. I am using a new computer and can't find out why I have these settings. 

 

Here is the code:

data_eg <- ReadAffy("~/Downloads/GSE17204_RAW/GSM430339.CEL")
Error in read.celfile.header(as.character(filenames[[1]])) : 
  Could not open file ~/Downloads/GSE17204_RAW/GSM430339.CEL

 

Please help. I know that it is something stupid. I am running R v3.4.2.

 


Does file.exists("~/Downloads/GSE17204_RAW/GSM430339.CEL") yield TRUE?
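If it yields FALSE, a few base R calls can help narrow down why. This is only a rough sketch: `check_path` is a hypothetical helper (not part of affy), and a temporary file stands in for a real CEL file.

```r
# Hypothetical helper: report what R itself can see for a given file path.
check_path <- function(path) {
  full <- path.expand(path)                      # expand a leading "~" to the home directory
  list(
    expanded   = full,
    exists     = file.exists(full),              # is there anything at this path?
    readable   = unname(file.access(full, mode = 4) == 0),  # 0 means read access is allowed
    dir_exists = dir.exists(dirname(full))       # does the containing folder exist?
  )
}

# Example with a temporary file standing in for a CEL file
tmp <- tempfile(fileext = ".CEL")
file.create(tmp)
str(check_path(tmp))
```

If `exists` is FALSE but `dir_exists` is TRUE, the file name itself is probably wrong; if `exists` is TRUE but `readable` is FALSE, the problem is more likely permissions.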


It yielded FALSE. So I copied and pasted the file to another directory, and then it yielded TRUE.

Then, I tried this:

data <- ReadAffy("~/Documents/Projects/Catherine Astrocyte project/Transcriptional data/162-2129.CEL")
Error in read.celfile.header(as.character(filenames[[1]])) : 
  Could not open file ~/Documents/Projects/Catherine Astrocyte project/Transcriptional data/162-2129.CEL

 

And then on another file:

ta <- ReadAffy("~/Documents/Projects/Catherine Astrocyte project/Playing with the affy data/GSM430341.CEL")
Error in read.celfile.header(as.character(filenames[[1]])) : 
  Could not open file ~/Documents/Projects/Catherine Astrocyte project/Playing with the affy data/GSM430341.CEL

 

 

@james-w-macdonald-5106
Last seen 18 hours ago
United States

In almost every case, this error means that the file isn't where you are pointing. So two things. First, there is usually no benefit to running R in one directory and operating on files in another. As a general rule, I would recommend keeping your code and the files it operates on together in the working directory. Scattering things across directories is a recipe for unintended errors.
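As a rough illustration of the working-directory approach, something like the following could be used. The folder and file names are placeholders (a temporary directory with empty stand-in files), the listing uses only base R, and the commented-out `ReadAffy()` call assumes the affy package is installed and the files are real CEL files.

```r
# Placeholder folder; a temporary directory stands in for the project folder here.
cel_dir <- file.path(tempdir(), "cel_demo")
dir.create(cel_dir, showWarnings = FALSE)
file.create(file.path(cel_dir, c("sample1.CEL", "sample2.CEL")))  # empty stand-in files

old_wd <- setwd(cel_dir)          # work inside the folder that holds the data
cel_files <- list.files(pattern = "\\.CEL(\\.gz)?$", ignore.case = TRUE)
print(cel_files)                  # bare file names, relative to the working directory

# With real CEL files and affy installed, the files could then be read with:
#   library(affy)
#   data_eg <- ReadAffy(filenames = cel_files)

setwd(old_wd)                     # restore the previous working directory
```

Because the file names are relative to the working directory, there is no long path to mistype.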

Second, there is no need to be downloading things by hand. You can use the GEOquery package to get what you need.

> library(GEOquery)

> getGEOSuppFiles("GSE17204")
https://ftp.ncbi.nlm.nih.gov/geo/series/GSE17nnn/GSE17204/suppl/
OK
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE17nnn/GSE17204/suppl//GSE17204_RAW.tar'
Content type 'application/x-tar' length 18616320 bytes (17.8 MB)
downloaded 17.8 MB

trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE17nnn/GSE17204/suppl//filelist.txt'
Content type 'text/plain' length 524 bytes
downloaded 524 bytes

> setwd("GSE17204/")
> untar("GSE17204_RAW.tar")
> library(oligo)

> dat <- read.celfiles(dir(".", "CEL.gz"))
Loading required package: pd.hg.u133a.2
Loading required package: RSQLite
Loading required package: DBI
Platform design info loaded.
Reading in : GSM430339.CEL.gz
Reading in : GSM430340.CEL.gz
Reading in : GSM430341.CEL.gz
Reading in : GSM430342.CEL.gz
Reading in : GSM430343.CEL.gz
Reading in : GSM430344.CEL.gz
Reading in : GSM430345.CEL.gz
Reading in : GSM430346.CEL.gz
> eset <- rma(dat)
Background correcting
Normalizing
Calculating Expression
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22277 features, 8 samples
  element names: exprs
protocolData
  rowNames: GSM430339.CEL.gz GSM430340.CEL.gz ... GSM430346.CEL.gz (8
    total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM430339.CEL.gz GSM430340.CEL.gz ... GSM430346.CEL.gz (8
    total)
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hg.u133a.2
>

I took your advice, and I get the following:

> getwd()
[1] "/Users/oter04/Rcodes"
> dir()
[1] "162-2129.CEL" "2133.CEL"    
> ReadAffy("~/Rcodes/162-2129.CEL")
Error in read.celfile.header(as.character(filenames[[1]])) : 
  Could not open file ~/Rcodes/162-2129.CEL

Also, when I try to load GEOquery, I get this:


> library(GEOquery)
Error in library(GEOquery) : there is no package called ‘GEOquery’
> source("https://bioconductor.org/biocLite.R")
Bioconductor version 3.5 (BiocInstaller 1.26.1), ?biocLite for help
> biocLite(GEOquery)
Error in "BiocUpgrade" %in% pkgs : object 'GEOquery' not found


I take my earlier statement back. The error you get is more likely due to lack of permissions.

> system("chmod 300 4105.CEL")
> system("ls -la 4105.CEL")
--wx------ 1 jmacdon adamusr-data5 14270951 Sep 22  2014 4105.CEL
> ReadAffy("4105.CEL")
Error in read.celfile.header(as.character(filenames[[1]])) :
  Could not open file 4105.CEL
> system("chmod 700 4105.CEL")
> ReadAffy("4105.CEL")
AffyBatch object
size of arrays=1190x1190 features (17 kb)
cdf=CynGene-1_0-st (??? affyids)
number of samples=1

So you probably need to change permissions on your files.
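The same check and fix can be done from within R, without calling out to the shell. This is a minimal sketch using base R only, with a temporary file standing in for the CEL file; it assumes a Unix-like system (such as macOS or Linux), since `Sys.chmod()` has limited effect on Windows.

```r
# Stand-in file; replace with the path to a real CEL file.
cel <- tempfile(fileext = ".CEL")
file.create(cel)

file.access(cel, mode = 4)   # 0 if the current user may read the file, -1 otherwise
file.mode(cel)               # current permission bits, shown in octal

# Grant the owner read and write access (comparable to `chmod 600` in a shell).
Sys.chmod(cel, mode = "0600", use_umask = FALSE)
file.access(cel, mode = 4)   # re-check read access after the change
```

If `file.access()` still returns -1 after this, the file may belong to another user, in which case the permissions would have to be changed by that user or an administrator.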

Also, from ?biocLite:

Arguments:

    pkgs: 'character()' of package names to install or update.  A
          missing value and 'suppressUpdates=FALSE' updates installed
          packages, perhaps also installing 'Biobase', 'IRanges', and
          'AnnotationDbi' if they are not already installed. Package
          names containing a '/' are treated as github repositories and
          installed using the 'install_github()' function of the
          'devtools' package.

You need to pass a character vector of the package names you want to install to biocLite, i.e. biocLite("GEOquery"). Otherwise R treats the unquoted GEOquery as the name of an R object and tries to evaluate it, which fails because no such object exists.
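To show the difference between the quoted and unquoted forms without installing anything, here is a small sketch. `install_stub` is a hypothetical stand-in for an installer function, not part of Bioconductor; the real calls are given in comments because they need network access.

```r
# Hypothetical stand-in for an installer function that expects package names as strings.
install_stub <- function(pkgs) {
  stopifnot(is.character(pkgs))
  pkgs
}

install_stub("GEOquery")   # quoted: a character vector, as required

# Unquoted: R looks for an object named GEOquery and fails when the argument is evaluated.
res <- tryCatch(install_stub(GEOquery), error = function(e) conditionMessage(e))
print(res)

# The real calls would be (not run here):
#   biocLite("GEOquery")                # Bioconductor 3.7 and earlier, as in this thread
#   BiocManager::install("GEOquery")    # Bioconductor 3.8 and later
```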
