Limit for chromosomal regions in biomaRt?
@christian-ruckert-3294
Last seen 5.4 years ago
Germany
I have a character vector of approximately 25000 genomic regions for which I want to retrieve additional genomic information using the biomaRt R package:

    library("biomaRt")
    mart = useMart("ensembl", dataset = "hsapiens_gene_ensembl")

    regions = c(
      "1:661517:668171",
      "1:787463:794591"
      ...)

    attribs = c("chromosome_name",
                "start_position",
                "end_position",
                "strand",
                "ensembl_gene_id",
                "hgnc_symbol")

    result = getBM(filters="chromosomal_region", values=regions,
                   attributes=attribs, mart=mart)

The above query seems to run forever (3 days +), but when I split the regions vector into two halves, each half finishes after approximately 15 minutes.

Am I doing something wrong, or is there an upper limit for the number of regions in a biomart query?

Thanks in advance,
Christian

    > sessionInfo()
    R version 2.11.0 (2010-04-22)
    x86_64-pc-linux-gnu

    locale:
    [1] C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] biomaRt_2.4.0 IRanges_1.6.8 affy_1.26.1 Biobase_2.8.0

    loaded via a namespace (and not attached):
    [1] RCurl_1.4-2 XML_3.1-0 affyio_1.16.0
    [4] preprocessCore_1.10.0
Steffen
@steffen-2351
Last seen 10.2 years ago
Hi Christian,

Technically there is no limit. But for queries where I expect a lot of data to be returned, I usually split the query into a few smaller ones. However, for your query I would not expect that much data to be returned.

This is not a biomaRt issue but an Ensembl issue, and you might want to ask your question directly at helpdesk@ensembl.org so they can check why this takes so long on their servers. Whenever you contact the Ensembl helpdesk, mention that you're using biomaRt and give them the XML version of your query, which you can get by setting verbose = TRUE in the getBM function, as in:

    > result = getBM(filters="chromosomal_region", values=regions,
    +                attributes=attribs, mart=mart, verbose=TRUE)
    <query virtualschemaname="default" uniquerows="1" count="0" datasetconfigversion="0.6" requestid="biomaRt">
    <dataset name="hsapiens_gene_ensembl">
    <attribute name="chromosome_name"/>
    <attribute name="start_position"/>
    <attribute name="end_position"/>
    <attribute name="strand"/>
    <attribute name="ensembl_gene_id"/>
    <attribute name="hgnc_symbol"/>
    <filter name="chromosomal_region" value="1:661517:668171,1:787463:794591"/>
    </dataset>
    </query>

Cheers,
Steffen

On Fri, Jul 30, 2010 at 3:06 AM, Christian Ruckert <cruckert@uni-muenster.de> wrote:
> I have a character vector of approximately 25000 genomic regions I want to
> retrieve additional genomic information for using biomaRt R-package:
> [...]
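The splitting workaround discussed in this thread can be sketched as a small helper that sends the regions in batches and stacks the results. This is only an illustration under assumptions: the helper names `split_into_batches` and `query_in_batches` and the batch size of 500 are hypothetical and not part of biomaRt, and a suitable batch size for the Ensembl servers is not stated anywhere in the thread.

```r
# Hypothetical helper: split a vector into consecutive batches of at most
# `batch_size` elements.
split_into_batches <- function(values, batch_size) {
  stopifnot(batch_size >= 1)
  if (length(values) == 0) return(list())
  batch_id <- ceiling(seq_along(values) / batch_size)
  unname(split(values, batch_id))
}

# Hypothetical helper: call `fetch` once per batch and stack the returned
# data frames. unique() drops rows returned by more than one batch, for
# example a gene that overlaps regions in two different batches.
query_in_batches <- function(values, batch_size, fetch) {
  batches <- split_into_batches(values, batch_size)
  results <- lapply(batches, fetch)
  unique(do.call(rbind, results))
}

# Usage with biomaRt (needs network access, so it is left commented out;
# the batch size of 500 is an arbitrary assumption):
# library("biomaRt")
# mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# result <- query_in_batches(regions, 500, function(batch) {
#   getBM(filters = "chromosomal_region", values = batch,
#         attributes = attribs, mart = mart)
# })
```

Passing the fetch step in as a function keeps the batching logic independent of the server, so it can be checked offline with a stand-in function before being pointed at `getBM`.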