Question: downloading different kinds of microarray data
8.8 years ago by
Alex Levitchi wrote:
Dear Bioconductors,

I am developing a tool that downloads microarray data and then connects it to the Bioconductor annotation packages. My specific question is how to manage downloading different kinds of microarray data, which can be:

- a GSE
- several GSMs
- user data (an Excel or tab-delimited file)

I use the GEOquery package. My tool works fine when I use just a GSE, which has a good structure, and I know how to extract the expression values, the platform (GPL) and the sample names:

> gse=getGEO(idata,GSEMatrix=TRUE)
> columns=c('title','type','source_name_ch1','platform_id')
> pdata=pData(gse[[1]])[,columns]
> expression=exprs(gse[[1]])
> colnames(expression)=as.vector(pdata[colnames(expression),3])

But I am confused about how to handle several GSMs or user data. After applying getGEO to a GSM, I have to use Table(gse)$VALUE to extract the expression values and Meta(gse)$platform_id to find the GPL. I understand how to do this when I have just one GSM. How should I manage several GSMs? At first I thought of using something like this:

> gse=do.call("cbind",lapply(list_of_GSMs,function(x) {
> getGEO(as.character(x),GSEMatrix=TRUE)
> }))

but this way I get only the matrix of expression values, and I still don't know the GPL or the sample names.

Another idea (I have not tried it yet, as I am not sure it is correct) is to create an ExpressionSet (also for user data, after loading them with 'read.table'), but I don't know how to create the phenoData: must it be written manually, or can it be built in code? With an ExpressionSet I suppose I will be able to use the "pData" function as in the GSE case.

In the end I would like to download and arrange the data so that I can use the rest of the functions that come after 'gse=....' in the example above. Please give me some hints on at least one of these points. Thanks for your nice work.
Cheers,
Alexei Levitchi
PhD in Genetics, Bioinformatician at Laboratory of Bioinformatics
CBM, Area Science Park, Trieste, Italy
http://www.cbm.fvg.it/laboratories/bioinformatics_research
scientific researcher, Center of Molecular Biology, University of Academy of Sciences of Moldova
www.edu.asm.md
genetics geoquery
Answer: downloading different kinds of microarray data
8.8 years ago by
Sean Davis wrote:
Hi, Alex.

You are definitely thinking correctly that you want to be using ExpressionSets. I would focus your attention on learning to construct an ExpressionSet for each case you outline.

Sean

On Jul 23, 2010 10:12 AM, "Alex Levitchi" <alex.levitchi@cbm.fvg.it> wrote:
> Dear Bioconductors, I am working on the development of a tool which use to download microarray data and then make the connection to Bioconductor annotation packages. [...]
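As a rough illustration of the advice to construct an ExpressionSet for each case, here is a minimal sketch, not a definitive implementation. The helper names `make_eset` and `gsms_to_eset`, the toy file contents and the sample titles are all hypothetical. It assumes that the user file is tab-delimited with probe IDs in the first column, and that every GSM comes from the same platform and carries `ID_REF` and `VALUE` columns in its data table. The GSM helper needs GEOquery and network access, so it is only defined here and not run.

```r
library(Biobase)

# Hypothetical helper: build an ExpressionSet from a numeric matrix
# (probes x samples) and a data.frame of sample annotation whose row
# names match the matrix column names.
make_eset <- function(expr, pheno, platform = character(0)) {
  expr <- as.matrix(expr)
  stopifnot(is.numeric(expr),
            identical(colnames(expr), rownames(pheno)))
  ExpressionSet(assayData = expr,
                phenoData = AnnotatedDataFrame(pheno),
                annotation = as.character(platform))
}

# Case "user data": a tab-delimited file, first column = probe IDs.
# The file is written here only so that the example is self-contained.
tmp <- tempfile(fileext = ".txt")
writeLines(c("ID\tS1\tS2\tS3",
             "probe_a\t1.5\t2.5\t3.5",
             "probe_b\t4.0\t5.0\t6.0"), tmp)
user_expr <- as.matrix(read.table(tmp, header = TRUE, sep = "\t",
                                  row.names = 1))

# The phenoData is just a data.frame built in code: one row per sample.
user_pheno <- data.frame(title = c("ctrl 1", "ctrl 2", "treated 1"),
                         row.names = colnames(user_expr))
user_eset <- make_eset(user_expr, user_pheno)

# Case "several GSMs": hypothetical helper taking a character vector of
# GSM accessions. Defined but not called (needs GEOquery and network).
gsms_to_eset <- function(gsm_ids) {
  gsms <- lapply(gsm_ids, function(id) GEOquery::getGEO(id))
  names(gsms) <- gsm_ids

  # Assumption: all samples share one platform.
  platform <- unique(vapply(gsms,
                            function(g) GEOquery::Meta(g)$platform_id,
                            character(1)))
  stopifnot(length(platform) == 1)

  # Align every sample to the probe order of the first one.
  probes <- as.character(GEOquery::Table(gsms[[1]])$ID_REF)
  expr <- sapply(gsms, function(g) {
    tab <- GEOquery::Table(g)
    as.numeric(as.character(tab$VALUE[match(probes, tab$ID_REF)]))
  })
  rownames(expr) <- probes

  pheno <- data.frame(
    title = vapply(gsms, function(g) GEOquery::Meta(g)$title,
                   character(1)),
    platform_id = platform,
    row.names = gsm_ids)
  make_eset(expr, pheno, platform)
}
```

With either object, `exprs()` and `pData()` can then be used in the same way as on `gse[[1]]` in the GSE case, and `annotation()` returns the GPL when one was supplied.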