Reading MAGE-ML cdf into bioconductor for limma in R v. 3.0.2
Guest User ★ 12k
@guest-user-4897
Last seen 6.6 years ago
Hi there,

I am trying to load some microarray data from ArrayExpress into R for analysis with Limma:

pro.fe.set<-ArrayExpress("E-GEOD-26533")

However, the probe set needs to be installed first for this to work, and the probe set is in MAGEML format. Previously, I've only ever dealt with the makecdfenv package that uses .cdf files. I found a package called RMAGEML in bioconductor that looked like it would do the job, but it is not available with R v. 3.0.2.

I was hoping you might have some insight into how best to approach this problem.

Many thanks,
Ben

-- output of sessionInfo():

R version 3.0.2 (2013-09-25)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] ArrayExpress_1.22.0 Biobase_2.22.0 BiocGenerics_0.8.0 R.utils_1.29.8 R.oo_1.18.0 R.methodsS3_1.6.1

loaded via a namespace (and not attached):
[1] affy_1.40.0 affyio_1.30.0 BiocInstaller_1.12.0 limma_3.18.13 preprocessCore_1.24.0
[6] tools_3.0.2 XML_3.98-1.1 zlibbioc_1.8.0

--
Sent via the guest posting facility at bioconductor.org.
Microarray cdf probe makecdfenv RMAGEML ArrayExpress • 958 views
@james-w-macdonald-5106
Last seen 19 hours ago
United States
Hi Ben,

> pro.fe.set<-ArrayExpress("E-GEOD-26533")
<snip>
> ls()
[1] "mapCdfName" "pro.fe.set"
> pro.fe.set
AffyBatch object
size of arrays=448x448 features (51 kb)
cdf=MD4-9313a520062 (??? affyids)
number of samples=39
Error in getCdfInfo(object) :
  Could not obtain CDF environment, problems encountered:
Specified environment does not contain MD4-9313a520062
Library - package md49313a520062cdf not installed
Bioconductor - md49313a520062cdf not available
In addition: Warning message:
missing cdf environment! in show(AffyBatch)

<starts browser>

Googles MD4-9313a520062

Fourth hit is
http://lifesciencedb.jp/geo-e/?division=Unassigned&technology=GeneChip&order=manufacturer&action=ListPlatform

First line in table has link
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL5471

At bottom of said link is

Supplementary file: GPL5471.cdf.gz
Size: 1.7 Mb
Download: (ftp) ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL5nnn/GPL5471/suppl/GPL5471%2Ecdf%2Egz
          (http) http://www.ncbi.nlm.nih.gov/geo/download/?acc=GPL5471&format=file&file=GPL5471%2Ecdf%2Egz
File type/resource: CDF

Copies http link

</closes browser>

> download.file("http://www.ncbi.nlm.nih.gov/geo/download/?acc=GPL5471&format=file&file=GPL5471%2Ecdf%2Egz", "GPL5471.cdf.gz")
trying URL 'http://www.ncbi.nlm.nih.gov/geo/download/?acc=GPL5471&format=file&file=GPL5471%2Ecdf%2Egz'
Content type 'application/octet-stream' length 1764017 bytes (1.7 Mb)
opened URL
downloaded 1.7 Mb

> library(makecdfenv)
> make.cdf.package("GPL5471.cdf.gz", "md49313a520062cdf", compress = TRUE, species = "Some_bacterium")

## this may fail. In which case gzip -d GPL5471.cdf.gz and then

> make.cdf.package("GPL5471.cdf", "md49313a520062cdf", species = "Some_bacterium")

Either way, the package name has to be md49313a520062cdf, which is the name the error message above says is missing. Then install the package from source:

> install.packages("md49313a520062cdf/", repos = NULL, type = "source")
> pro.fe.set
AffyBatch object
size of arrays=448x448 features (51 kb)
cdf=MD4-9313a520062 (9947 affyids)
number of samples=39
number of genes=9947
annotation=md49313a520062
notes=E-GEOD-26533
E-GEOD-26533
c("Organism", "treatment", "strain", "time", "", "", "", "", "", "", "")
c("", "", "", "", "", "", "", "", "", "", "")

Best,

Jim

On 3/31/2014 3:43 AM, Ben Temperton [guest] wrote:
> Hi there,
>
> I am trying to load some microarray data from ArrayExpress into R for analysis with Limma:
>
> pro.fe.set<-ArrayExpress("E-GEOD-26533")
>
> <snip>

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
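Two details in the answer above may be worth spelling out in code: why the package has to be called md49313a520062cdf, and how to do the gzip -d step without leaving R. The following is only a sketch in base R. The helper names cdf_package_name and gunzip_file are hypothetical (they are not part of affy or makecdfenv), and the naming rule is an approximation of what affy::cleancdfname does, ignoring its table of special-case chip names.

```r
# Approximate the CDF package name that affy looks for, given the CDF name
# stored in the AffyBatch (lower-case, strip "-", "_" and spaces, append "cdf").
# Assumption: this mirrors affy::cleancdfname() without its special-case table.
cdf_package_name <- function(cdf_name) {
  cleaned <- tolower(gsub("[-_ ]", "", cdf_name))
  paste0(cleaned, "cdf")
}

# Decompress a .gz file from within R, as an alternative to `gzip -d`.
# The file is copied in fixed-size binary chunks; the original .gz is kept.
gunzip_file <- function(src, dest = sub("\\.gz$", "", src), chunk = 1e6) {
  stopifnot(file.exists(src), dest != src)
  con_in <- gzfile(src, open = "rb")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(dest, open = "wb")
  on.exit(close(con_out), add = TRUE)
  repeat {
    bytes <- readBin(con_in, what = "raw", n = chunk)
    if (length(bytes) == 0L) break  # end of compressed stream
    writeBin(bytes, con_out)
  }
  invisible(dest)
}

# The name reported in the error message for this array
cdf_package_name("MD4-9313a520062")
```

With these, something like make.cdf.package(gunzip_file("GPL5471.cdf.gz"), cdf_package_name("MD4-9313a520062"), species = "Some_bacterium") should correspond to the fallback route in the answer, assuming the downloaded file is in the working directory. The R.utils package, which is already attached in the session above, also offers a gunzip() function for the same purpose.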