Problem with biomaRt and retrieving annotation for the chip "affy_huex_1_0_st_v2"
@james-w-macdonald-5106
United States
Hi Nenad,

Please don't take things off list. The list archives are intended to be a resource of questions and answers for others to use.

I am not sure why you are doing the query that you show. Do you _really_ want to get everything on that chip? That is a huge amount of data, which is not ideal for an interactive web-based query system. In general the idea is to use biomaRt to get annotations for some set of probesets that are interesting for some reason. So for instance, this works:

> tst <- getBM("affy_huex_1_0_st_v2", "ensembl_gene_id", "ENSG00000146556", mart)
> tst
   affy_huex_1_0_st_v2
1              3612188
2              2390573
3              2501592
4              3674797
5              3195860
6              2501576
7              3612191
8              3612192
9              3195888
10             2501594
11             3612187
12             2501583
<snip>

Best,

Jim

James W. MacDonald, M.S.
Biostatistician
Douglas Lab
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

>>> Nenad Bartonicek <nbartonicek at gmail.com> 03/11/09 9:09 AM >>>
Hi Jim,

Thank you for a quick reply. I did try with

>data=getBM(attributes="affy_huex_1_0_st_v2",value=T, mart=ensembl)
Error in postForm(paste(martHost(mart), "?", sep = ""), query = xmlQuery) :
  transfer closed with outstanding read data remaining

and it is the same thing.

When I use other arrays it is ok. For example:

>data=getBM(attributes="affy_hugene_1_0_st_v1", mart=ensembl)
> head(data)
  affy_hugene_1_0_st_v1
1               8165646
2               8165644
3               8174970
4               7946565
5               7946563
6               8089038

Thanks,

Nenad

On 11 Mar 2009, at 13:03, James W. MacDonald wrote:

> Hi Nenad,
>
> Nenad Bartonicek wrote:
>> Hello,
>> There seems to be a problem in retrieving annotation from the chip
>> "affy_huex_1_0_st_v2".
>> >library(biomaRt)
>> >ensembl=useMart("ensembl",dataset="hsapiens_gene_ensembl")
>> Checking attributes and filters ... ok
>> #check if "affy_huex_1_0_st_v2" is a valid attribute
>> >attributes=listAttributes(ensembl)
>> >grep("affy_huex_1_0_st_v2",attributes[,1])
>> [1] 13
>> #collect data
>> >data=getBM(attributes="affy_huex_1_0_st_v2", mart=ensembl)
>
> You are not asking for any data here. Do you get the same result if
> you include some values for the filter and values arguments?
>
> Best,
>
> Jim
>
>> Error in postForm(paste(martHost(mart), "?", sep = ""), query =
>> xmlQuery) :
>> transfer closed with outstanding read data remaining
>> > sessionInfo()
>> R version 2.8.1 (2008-12-22)
>> i386-apple-darwin8.11.1
>> locale:
>> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>> attached base packages:
>> [1] stats graphics grDevices datasets tools utils
>> methods
>> [8] base
>> other attached packages:
>> [1] biomaRt_1.16.0 R.utils_1.1.1 R.oo_1.4.6
>> R.methodsS3_1.0.3
>> [5] Biobase_2.2.2
>> loaded via a namespace (and not attached):
>> [1] RCurl_0.94-1 XML_2.3-0
>> Thank you for your help,
>> Nenad
>> Nenad Bartonicek
>> European Bioinformatics Institute
>> Wellcome Trust Genome Campus
>> Hinxton
>> Cambridge
>> CB10 1SD
>> United Kingdom
>> tel: +44-755-435-9057
>> [[alternative HTML version deleted]]
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
Annotation biomaRt
@nenad-bartonicek-3293
Hi Jim,

Thank you for your answer. Apologies for not writing in detail what the purpose of my query was. Here is a brief explanation of the query and the "bug" report:

1. Aim of the project

I am building a local database for all human/mouse/rat microarray probe sets (affy, agilent, illumina) with a connection to their ensembl transcript id and their 3'utr sequences (the query was shortened for the purpose of clarity). The intention is to rebuild the whole database from biomaRt a couple of times per year, extracting a large amount of data at once each time. The purpose of the database is to build an online tool to investigate miRNA signatures in expression profiles.

2. Why biomaRt?

It is the easiest way, and I can use other bioconductor packages to make building a diverse database a one-script process. Also, even though you stated that biomaRt is not ideal for huge amounts of data, the package description says that "the package enables retrieval of large amounts of data in a uniform way" (http://www.bioconductor.org/packages/2.2/bioc/html/biomaRt.html). I guess the definition of "large" was the main issue.

3. Query size limit

This query worked two weeks ago without any problems. I was not aware of any query size limits (to my knowledge they are not officially stated in any of the biomaRt documents), so I thought I was reporting an error, especially since the error message does not clearly state that I have crossed a query threshold or what the threshold is, and the script worked for all other arrays. Maybe it would be a good idea to have these limits clearly stated somewhere in the documentation. It would take just one extra sentence, but it could make life easier for both developers and users.
Once again, thank you for your time,

Best regards,

Nenad

Nenad Bartonicek
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge
CB10 1SD
United Kingdom
tel: +44-755-435-9057

On 11 Mar 2009, at 17:15, James MacDonald wrote:

> <snip>
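For anyone landing on this thread with the same bulk-download goal: one way to combine Jim's advice (query a set of IDs through the filter and values arguments) with Nenad's need for the whole chip is to fetch the mapping in batches. The sketch below is only an illustration under stated assumptions. The helper names `chunk_ids` and `fetch_in_batches`, the batch size of 500, and the `query_fun` parameter are all hypothetical and not part of biomaRt, and nothing in the thread confirms that batching stays under whatever limit the server applies.

```r
# Split a vector of identifiers into consecutive chunks of at most `size`.
# (Hypothetical helper, not part of biomaRt.)
chunk_ids <- function(ids, size = 500) {
  stopifnot(size >= 1)
  if (length(ids) == 0) return(list())
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

# Query the mart one chunk at a time and stack the results row-wise.
# `query_fun` defaults to biomaRt::getBM; it is a parameter only so the
# function can be tried out without a network connection.
# The batch size of 500 is an assumption, not a documented limit.
fetch_in_batches <- function(ids, mart,
                             attributes = c("ensembl_gene_id",
                                            "affy_huex_1_0_st_v2"),
                             filter = "ensembl_gene_id",
                             size = 500,
                             query_fun = biomaRt::getBM) {
  parts <- lapply(chunk_ids(ids, size), function(chunk) {
    query_fun(attributes = attributes, filters = filter,
              values = chunk, mart = mart)
  })
  do.call(rbind, parts)
}

# Usage (needs biomaRt and a network connection, so left commented out):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# genes <- getBM(attributes = "ensembl_gene_id", mart = ensembl)[, 1]
# huex <- fetch_in_batches(genes, ensembl)
```

Asking for `ensembl_gene_id` alongside the probeset attribute keeps each returned row tied to the gene it was queried for, which is what a local probe-to-gene table would need. A smaller `size` means more round trips but smaller responses per request.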