ReadAffy() can't read CEL files from TCGA...
Paul Geeleher ★ 1.3k
@paul-geeleher-2679
Last seen 9.7 years ago
Hi,

I downloaded some glioblastoma hg-u133a CEL files from The Cancer Genome Atlas (http://cancergenome.nih.gov/) and ReadAffy() can't load them. The CEL files are only 5.3 MB each, so maybe they are compressed? When I open them they look like binary files.

Does anybody know what kind of files these are and how I might read them? ReadAffy() gives the following error:

> a <- ReadAffy()
Error in value[[3L]](cond) :
  row.names should specify one of the variables
  AnnotatedDataFrame 'initialize' could not update varMetadata:
  perhaps pData and varMetadata are inconsistent?

--
Paul Geeleher
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland, Galway, Ireland
www.bioinformaticstutorials.com
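One quick way to settle the "compressed or binary?" question is to look at the first few bytes of the file. The following is a minimal sketch in base R, not part of affy or affyio: the function name `sniff_cel_format` is hypothetical, and the signatures it checks are assumptions taken from the commonly documented formats (gzip files start with bytes 1f 8b, text CEL files start with "[CEL]", version 4 binary CEL files start with the little-endian 32-bit integer 64, and Command Console files start with byte 59).

```r
# Hypothetical helper: guess the on-disk format of a CEL file from its
# leading bytes. The signatures are assumptions, see the text above.
sniff_cel_format <- function(path) {
  # readBin() on a file name reads the raw bytes without decompressing
  bytes <- readBin(path, what = "raw", n = 5L)
  if (length(bytes) < 4L) {
    return("unknown")
  }
  if (bytes[1] == as.raw(0x1f) && bytes[2] == as.raw(0x8b)) {
    return("gzip-compressed")
  }
  if (identical(bytes, charToRaw("[CEL]"))) {
    return("text CEL (version 3)")
  }
  # Interpret the first four bytes as a little-endian 32-bit integer
  magic <- readBin(bytes[1:4], what = "integer", size = 4L, endian = "little")
  if (magic == 64L) {
    return("binary CEL (version 4)")
  }
  if (bytes[1] == as.raw(59)) {
    return("Command Console (Calvin) CEL")
  }
  "unknown"
}

# Example with a small stand-in file rather than a real array
demo <- tempfile(fileext = ".CEL")
writeBin(c(64L, 4L), demo, size = 4L, endian = "little")
print(sniff_cel_format(demo))
```

A binary file is not necessarily a problem: ReadAffy() is designed to read binary CEL files as well as text ones, so a file that looks like binary in a text editor may be perfectly valid.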
Tim Triche ★ 4.2k
@tim-triche-3561
Last seen 3.7 years ago
United States
Post the URL from which you are downloading them. Depending on the format, level, etc., the data can be totally different from what Bioconductor would expect.

Apologies for the lack of info; I thought there'd probably be something very obvious I was doing wrong.

These files are from the BI__HT_HG-U133A archive of the glioblastoma dataset.

I've uploaded one of the files here:
frink.nuigalway.ie/~pat/5500024030700072107989.G03.CEL

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_IE.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_IE.utf8        LC_COLLATE=en_IE.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_IE.utf8
 [7] LC_PAPER=en_IE.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] affy_1.24.0   Biobase_2.6.0

loaded via a namespace (and not attached):
[1] affyio_1.14.0        preprocessCore_1.8.0

--
Paul Geeleher

On 10/09/2010 10:01 AM, Paul Geeleher wrote:
> other attached packages:
> [1] affy_1.24.0 Biobase_2.6.0

These are not the correct versions of affy and Biobase for your R. See

http://bioconductor.org/install/index.html#update-bioconductor-packages

for how to update, and

http://bioconductor.org/help/bioc-views/release/bioc/

for released software (the versions are listed on the individual package pages).

ReadAffy() works for me with this sessionInfo():

> sessionInfo()
R version 2.11.1 Patched (2010-08-30 r52862)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] affy_1.26.1   Biobase_2.8.0

loaded via a namespace (and not attached):
[1] affyio_1.16.0         preprocessCore_1.10.0

> ReadAffy("5500024030700072107989.G03.CEL")
AffyBatch object
size of arrays=744x744 features (16 kb)
cdf=HT_HG-U133A (22277 affyids)
number of samples=1
number of genes=22277
annotation=hthgu133a
notes=

Martin
--
Computational Biology
Fred Hutchinson Cancer Research Center
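The version mismatch described above can be checked mechanically by comparing two sessionInfo() listings. The sketch below uses only base R; the helper name `check_versions` is hypothetical, and treating the versions from the working session as a minimum is an assumption made for illustration, not an official Bioconductor rule.

```r
# Hypothetical helper: compare installed package versions against the
# versions seen in a session where ReadAffy() worked.
# Both arguments are named character vectors of version strings.
check_versions <- function(installed, required) {
  pkgs <- names(required)
  have <- installed[pkgs]            # NA for packages that are missing
  have_cmp <- ifelse(is.na(have), "0", have)
  ok <- !is.na(have) &
    numeric_version(have_cmp) >= numeric_version(required)
  data.frame(
    package   = pkgs,
    installed = unname(have),
    required  = unname(required),
    ok        = unname(ok),
    stringsAsFactors = FALSE
  )
}

# Versions copied from the two sessionInfo() listings in this thread
failing <- c(affy = "1.24.0", Biobase = "2.6.0",
             affyio = "1.14.0", preprocessCore = "1.8.0")
working <- c(affy = "1.26.1", Biobase = "2.8.0",
             affyio = "1.16.0", preprocessCore = "1.10.0")
print(check_versions(failing, working))

# In a live session, the first argument could be built with something like
#   sapply(names(working), function(p) as.character(packageVersion(p)))
# which stops with an error if any of the packages is not installed.
```

On current installations, BiocManager::valid() performs a more complete check of installed packages against the Bioconductor release that matches the running R.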

OK, thanks Martin, that worked. That was rather stupid of me; apologies for wasting everyone's time!

--
Paul Geeleher
@vincent-j-carey-jr-4
Last seen 1 day ago
United States
You certainly don't give many details, like session info or the exact download point for your data, but I poked around a bit. Perhaps you found a folder named

broad.mit.edu_GBM.HT_HG-U133A.1.2.0

Note that these are HT HG U133A arrays. We do have some metadata packages for HT arrays, including the CDF. Thus you can parse the file, and upon evaluating the result, affy will go out and get you the CDF:

> x = ReadAffy("5500024030700072107989.H10.CEL")
> x
trying URL 'http://bioconductor.org/packages/2.7/data/annotation/src/contrib/hthgu133acdf_2.7.0.tar.gz'
Content type 'application/x-gzip' length 1738822 bytes (1.7 Mb)
opened URL
==================================================
downloaded 1.7 Mb

Loading required package: utils
BioC_mirror = http://www.bioconductor.org
Change using chooseBioCmirror().
* installing *source* package 'hthgu133acdf' ...
** R
** data
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
* DONE (hthgu133acdf)

The downloaded packages are in
        '/tmp/RtmplV0V1X/downloaded_packages'
Updating HTML index of packages in '.Library'
AffyBatch object
size of arrays=744x744 features (17 kb)
cdf=HT_HG-U133A (22277 affyids)
number of samples=1
number of genes=22277
annotation=hthgu133a
notes=

> sessionInfo()
R version 2.13.0 Under development (unstable) (2010-09-17 r52943)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] hthgu133acdf_2.7.0 affy_1.27.2        Biobase_2.9.1

loaded via a namespace (and not attached):
[1] affyio_1.17.4         preprocessCore_1.11.0 tools_2.13.0

Your specific problem needs attention, as the error messages don't seem germane to reading CEL files and can probably be made more precise. But you have to tell us more about the environment you are getting this error from.
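The CDF package that affy fetched above follows a simple naming pattern: the CDF name reported by the array (HT_HG-U133A) becomes the package name hthgu133acdf. The sketch below re-creates that pattern in base R so it can be checked before reading any data. The helper `cdf_package_name` is hypothetical and covers only the plain case; affy's own cleancdfname() is the authoritative mapping and handles special cases that this sketch ignores.

```r
# Hypothetical helper: derive the CDF annotation package name from a
# CDF name by dropping non-alphanumeric characters, lower-casing,
# and appending "cdf".
cdf_package_name <- function(cdf_name) {
  paste0(tolower(gsub("[^[:alnum:]]", "", cdf_name)), "cdf")
}

pkg <- cdf_package_name("HT_HG-U133A")

# Report whether the package is already installed; if it is not,
# affy will try to download it when the AffyBatch is first evaluated.
if (requireNamespace(pkg, quietly = TRUE)) {
  message(pkg, " is installed")
} else {
  message(pkg, " is not installed; affy will attempt to fetch it on demand")
}
```

Checking this up front can help on machines without internet access, where the on-demand download would fail.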