downloading different kinds of microarray data
1
0
Entering edit mode
@alex-levitchi-4179
Last seen 9.6 years ago
Dear Bioconductors,

I am working on the development of a tool that downloads microarray data and then connects it to Bioconductor annotation packages. My specific question is about how to manage downloading different kinds of microarray data, which can be:

- a GSE
- several GSMs
- user data (an Excel or tab-delimited file)

I use the GEOquery package. My tool works fine if I use just a GSE, which has a good structure, and I know how to extract the expression values, the platform (GPL) and the sample names:

> gse=getGEO(idata,GSEMatrix=TRUE)
> columns=c('title','type','source_name_ch1','platform_id')
> pdata=pData(gse[[1]])[,columns]
> expression=exprs(gse[[1]])
> colnames(expression)=as.vector(pdata[colnames(expression),3])

But I am confused about how to handle several GSMs or user data. After applying the getGEO function to a GSM, I have to use Table(gse)$VALUE to extract the expression values and Meta(gse)$platform_id to find the GPL. I understand how to do this when I have just one GSM. How should I manage several GSMs? At first I thought of using something like this:

> gse=do.call("cbind",lapply(list_of_GSMs,function(x) {
> getGEO(as.character(x),GSEMatrix=TRUE)
> }))

but this way I get just a matrix of expression values, and I still don't know the GPL or the sample names.

Another idea (I have not tried it yet, as I am not sure it is correct) is to create an ExpressionSet (also for user data, after reading them in with 'read.table'), but I don't know how to create the phenoData: does it have to be written manually, or can it be built in code? With an ExpressionSet I suppose I will be able to use the "pData" function as in the GSE case.

With all this I would like to download and arrange the data so that I can use the rest of the functions that come after 'gse=....' in the example above. Please give me some hints on at least one of these points. Thanks for your nice work.
Cheers,
Alexei Levitchi
PhD in Genetics, Bioinformatician at Laboratory of Bioinformatics CBM, Area Science Park, Trieste, Italy
http://www.cbm.fvg.it/laboratories/bioinformatics_research
scientific researcher, Center of Molecular Biology, University of Academy of Sciences of Moldova
www.edu.asm.md
Genetics GEOquery
@sean-davis-490
Last seen 3 months ago
United States
Hi, Alex.

You are definitely thinking correctly that you want to be using ExpressionSets. I would focus your attention on learning to construct an ExpressionSet for each case you outline.

Sean

On Jul 23, 2010 10:12 AM, "Alex Levitchi" <alex.levitchi@cbm.fvg.it> wrote:
> Dear Bioconductors, I am working on the development of a tool [...]
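To make the suggestion above concrete, here is a minimal sketch of how an ExpressionSet might be built for the user-data case and for the several-GSMs case. It is not a definitive implementation. It assumes the Biobase package is installed (and GEOquery for the GSM case). The helper names make_eset and gsms_to_eset, the sample names, the "user-defined" platform label and the numbers in the toy table are all hypothetical and only for illustration. The GSM helper is defined but not run here, because it needs GSM objects downloaded with getGEO.

```r
library(Biobase)  # ExpressionSet, AnnotatedDataFrame, exprs, pData

# Hypothetical helper: wrap an expression matrix (probes x samples), a
# per-sample data.frame and a platform label into an ExpressionSet.
make_eset <- function(expr, pheno, platform) {
  expr <- as.matrix(expr)
  # Every sample column needs a matching row in the phenotype table
  stopifnot(all(colnames(expr) %in% rownames(pheno)))
  pheno <- pheno[colnames(expr), , drop = FALSE]  # same order as the columns
  ExpressionSet(assayData  = expr,
                phenoData  = AnnotatedDataFrame(pheno),
                annotation = platform)
}

# Case 1: user data from a tab-delimited table (toy values, made up here;
# in practice this would be read.delim("your_file.txt", row.names = 1)).
user_txt <- paste(
  "probe\tsampleA\tsampleB\tsampleC",
  "p1\t7.1\t6.9\t8.3",
  "p2\t5.0\t5.2\t4.8",
  "p3\t9.4\t9.1\t9.9",
  sep = "\n")
user_expr <- read.delim(text = user_txt, row.names = 1)

# The phenoData is just a data.frame with one row per sample, built in code
user_pheno <- data.frame(
  title           = c("control 1", "control 2", "treated 1"),
  source_name_ch1 = c("liver", "liver", "liver"),
  row.names       = c("sampleA", "sampleB", "sampleC"))

eset_user <- make_eset(user_expr, user_pheno, platform = "user-defined")

# Case 2: several GSMs. 'gsms' is assumed to be a list of GSM objects, e.g.
#   gsms <- lapply(list_of_GSMs, function(x) GEOquery::getGEO(as.character(x)))
# Assumes all samples come from the same platform.
gsms_to_eset <- function(gsms) {
  meta_field <- function(g, f) as.character(GEOquery::Meta(g)[[f]])[1]
  ids      <- vapply(gsms, meta_field, character(1), f = "geo_accession")
  platform <- unique(vapply(gsms, meta_field, character(1), f = "platform_id"))
  stopifnot(length(platform) == 1)
  # Use the probe order of the first sample and align the others to it
  probes <- as.character(GEOquery::Table(gsms[[1]])$ID_REF)
  expr <- do.call(cbind, lapply(gsms, function(g) {
    tab <- GEOquery::Table(g)
    as.numeric(tab$VALUE[match(probes, as.character(tab$ID_REF))])
  }))
  dimnames(expr) <- list(probes, ids)
  pheno <- data.frame(
    title           = vapply(gsms, meta_field, character(1), f = "title"),
    source_name_ch1 = vapply(gsms, meta_field, character(1), f = "source_name_ch1"),
    platform_id     = platform,
    row.names       = ids)
  make_eset(expr, pheno, platform)
}

# Downstream code can then look the same as in the GSE case
columns    <- c("title", "source_name_ch1")
pdata      <- pData(eset_user)[, columns]
expression <- exprs(eset_user)
colnames(expression) <- as.vector(pdata[colnames(expression), "title"])
```

Since gse[[1]] from getGEO(..., GSEMatrix=TRUE) is already an ExpressionSet, all three cases end up as the same class, so pData() and exprs() can be used in the same way afterwards.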