Problem importing ArrayExpress data for custom array into Bioconductor
Aatish ▴ 10
@aatish-4308
Last seen 9.6 years ago
Hi,

I've been spending a lot of time trying to do something quite elementary and have been getting thoroughly stuck, so apologies in advance for the basic question.

I've been trying to build an R object from CEL data from the ArrayExpress database. I used the ArrayExpress package for Bioconductor to import the data.

Following the vignette, I ran:

    library("ArrayExpress")
    rawset = ArrayExpress("E-TABM-14")

which downloaded all the data for this experiment. After downloading I get the message:

    Read 48 items
    The object containing experiment E-TABM-14 has been built.

However, if I try to access this data structure (which should be an AffyBatch object), I get the following error message:

    rawset

    Warning: unable to access index for repository http://brainarray.mbni.med.umich.edu/bioc/bin/macosx/leopard/contrib/2.11
    AffyBatch object
    size of arrays=2560x2560 features (32 kb)
    cdf=Scervisiae_tiling (??? affyids)
    number of samples=8
    Warning: unable to access index for repository http://brainarray.mbni.med.umich.edu/bioc/bin/macosx/leopard/contrib/2.11
    Error in getCdfInfo(object) :
      Could not obtain CDF environment, problems encountered:
    Specified environment does not contain Scervisiae_tiling
    Library - package scervisiaetilingcdf not installed
    Bioconductor - scervisiaetilingcdf not available
    In addition: Warning message:
    missing cdf environment! in show(AffyBatch)

As far as I know, there is no standard package called scervisiaetilingcdf. I assume that I am missing the CDF file for the customized Affymetrix S. cerevisiae tiling microarray that was used to generate the data. However, I don't know where I can obtain this CDF file or how I can create it. I would be very appreciative if someone could point me in the right direction.

Thanks for your help,
Aatish Bhatia
PhD student, Rutgers University

P.S. Here are the contents of:

    traceback()
    7: stop(paste("Could not obtain CDF environment, problems encountered:",
           paste(unlist(badOut), collapse = "\n"), sep = "\n"))
    6: getCdfInfo(object)
    5: featureNames(object)
    4: featureNames(object)
    3: cat("number of genes=", length(featureNames(object)), "\n", sep = "")
    2: function (object)
       standardGeneric("show")(<S4 object of class "AffyBatch">)
    1: function (object)
       standardGeneric("show")(<S4 object of class "AffyBatch">)

and

    sessionInfo()
    R version 2.11.1 (2010-05-31)
    x86_64-apple-darwin9.8.0

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] limma_3.4.5 affy_1.26.1 ArrayExpress_1.8.0 Biobase_2.8.0

    loaded via a namespace (and not attached):
    [1] affyio_1.16.0 preprocessCore_1.10.0 tools_2.11.1
    [4] XML_3.2-0
Microarray cdf ArrayExpress • 1.3k views
@valerie-obenchain-4275
Last seen 2.3 years ago
United States
Hi Aatish,

It looks like the brainarray site does not have Mac binaries available. From your browser, visit

    http://brainarray.mbni.med.umich.edu/bioc/bin/

Here you will see that brainarray currently has no macosx directory, which is why R reports that it is unable to access the repository index.

One workaround is to use the source tarball. Look in

    http://brainarray.mbni.med.umich.edu/bioc/src/contrib/

for the appropriate cdf package (i.e., maybe one of tilingscerevisiae10scense*). Then, from your R session, try

    biocLite("tilingscerevisiae10rscensecdf", contriburl="http://brainarray.mbni.med.umich.edu/bioc/src/contrib/", type="source")

Alternatively, you could download the tarball and build the package manually. Having your own copy of the tarball might be a good idea if you want guaranteed access to this cdf package for some time.

Valerie

On 10/19/2010 11:15 PM, Aatish wrote:
> [...]
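The manual route Valerie mentions (download the tarball, then build it yourself) could look roughly like the sketch below. It is only an illustration: `tarball_url` and `install_from_tarball` are hypothetical helper names, not part of any package, and the sketch assumes the repository follows the usual CRAN-style layout with a PACKAGES index and tarballs named `<pkg>_<version>.tar.gz`. The thread does not confirm which package name or version is the right one for this array.

```r
# Hypothetical helpers (not part of any package): build the tarball URL for a
# package in a CRAN-style source repository, then download and install it.

tarball_url <- function(contriburl, pkg, version) {
  # CRAN-style repositories name source tarballs <pkg>_<version>.tar.gz
  paste0(sub("/+$", "", contriburl), "/", pkg, "_", version, ".tar.gz")
}

install_from_tarball <- function(contriburl, pkg, destdir = getwd()) {
  # Look the version up in the repository index rather than guessing it
  index <- utils::available.packages(contriburl = contriburl)
  if (!pkg %in% rownames(index)) {
    stop("package '", pkg, "' is not listed at ", contriburl)
  }
  url  <- tarball_url(contriburl, pkg, index[pkg, "Version"])
  dest <- file.path(destdir, basename(url))
  # Keep a local copy of the tarball so the package can be reinstalled later
  utils::download.file(url, destfile = dest, mode = "wb")
  utils::install.packages(dest, repos = NULL, type = "source")
  invisible(dest)
}

# Example call (not run here; needs network access and a live repository):
# install_from_tarball(
#   "http://brainarray.mbni.med.umich.edu/bioc/src/contrib/",
#   "tilingscerevisiae10rscensecdf"
# )
```

Because the tarball is saved to `destdir` before installation, the same file can be reinstalled later with `install.packages(path, repos = NULL, type = "source")` even if the remote site changes.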
@wolfgang-huber-3550
Last seen 11 days ago
EMBL European Molecular Biology Laborat…
Hi Aatish,

Short answer: use the 'davidTiling' package.

Long answer: the AffyBatch class is designed for Affymetrix probe-set-oriented arrays (i.e. arrays with a certain number of probes per gene). E-TABM-14 uses a custom-designed tiling array, for which I am not aware of a CDF (one could be made, but I am not sure it would be useful).

One could argue that the 'ArrayExpress' package/function should not even allow you to load the E-TABM-14 dataset from this specialised platform if it cannot produce a fully valid Bioconductor data object out of it. Instead, it seems to opt for giving you a half-complete, somewhat useful data object, which you can use for some things, but which will break as soon as it tries to find its CDF annotation package (as happened to you).

See also:

    http://www.ebi.ac.uk/huber-srv/David2006
    http://steinmetzlab.embl.de/NFRsharing --> E-TABM-590

Hope this helps,
Wolfgang

On Oct/20/10 8:15 AM, Aatish wrote:
> [...]
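As a rough illustration of Wolfgang's two points, the sketch below shows (a) how the "half-complete" AffyBatch can still be inspected through accessors that read its slots directly and so should not trigger the CDF lookup, and (b) how the 'davidTiling' data package might be loaded instead. Both function names are hypothetical helpers. The sketch assumes `affy` and `Biobase` are installed, that `rawset` is the object returned by `ArrayExpress("E-TABM-14")`, and that 'davidTiling' ships a dataset of the same name; check the package documentation before relying on that.

```r
# Hypothetical helper: summarise an AffyBatch using only accessors that read
# slots directly, avoiding show() and featureNames(), which need the CDF.
summarise_without_cdf <- function(x) {
  cdf <- affy::cdfName(x)                    # chip type recorded in the CEL files
  list(
    cdf_name    = cdf,
    cdf_package = affy::cleancdfname(cdf),   # package name affy will look for
    samples     = Biobase::sampleNames(x),
    dims        = dim(Biobase::exprs(x))     # raw intensity matrix: probes x samples
  )
}

# Hypothetical helper: load the curated tiling-array data suggested in the
# short answer, failing with a clear message if the package is missing.
load_david_tiling <- function() {
  if (!requireNamespace("davidTiling", quietly = TRUE)) {
    stop("install the 'davidTiling' data package from Bioconductor first")
  }
  env <- new.env()
  # Assumption: the package provides a dataset called "davidTiling"
  utils::data("davidTiling", package = "davidTiling", envir = env)
  env$davidTiling
}

# Example calls (not run here):
# str(summarise_without_cdf(rawset))
# david <- load_david_tiling()
```

Comparing `cdf_name` with `cdf_package` makes it clear where the package name in the error message comes from: affy derives it from the chip type stored in the CEL files, so a missing package by that name means no CDF was ever published for this custom array.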