Hi,
I'm attempting to download the datasets that go with the paper "Clinical utility of microarray-based gene expression profiling in the diagnosis and subclassification of leukemia: report from the International Microarray Innovations in Leukemia Study Group", but I am having some difficulty with the R package GEOquery.
There are two datasets (stage 1 and 2) associated with the research paper, and I've been using these commands to download them both:
> library(GEOquery)
> library(foreign)
> u <- getGEO('GSE13204')
Unfortunately I get this error:
Welcome to Bioconductor
    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.
Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)
https://ftp.ncbi.nlm.nih.gov/geo/series/GSE13nnn/GSE13204/matrix/
OK
Found 2 file(s)
GSE13204-GPL570_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE13nnn/GSE13204/matrix/GSE13204-GPL570_series_matrix.txt.gz'
Content type 'application/x-gzip' length 886530886 bytes (845.5 MB)
==================================================
downloaded 845.5 MB
Error in read.table(con, sep = "\t", header = FALSE, nrows = nseries) :
  invalid 'nlines' argument
In addition: There were 50 or more warnings (use warnings() to see the first 50)
The warnings all have the same form, differing only in the line number:
> warnings()
Warning messages:
1: In readLines(fname) : line 1 appears to contain an embedded nul
2: In readLines(fname) : line 216 appears to contain an embedded nul
etc
I currently have no idea whether this is a problem with the data or with the R package. Can anyone help?