too many biomaRt connections
@elizabeth-purdom-2486
USA/ Berkeley/UC Berkeley
Hi,

I am using biomaRt to get information regarding genes. I use it pretty frequently and have recently gotten the error:

Too many connections at /ebi/www/biomart/www/biomart-perl-06/lib/BioMart/Configuration/DBLocation.pm line 98

I assume that I've hit some sort of wall in terms of how often I have queried the database?

I don't really use biomart except through R; at what point do you get booted off, and what can I do to regain access? I often run queries on a few hundred genes and don't think twice about rerunning such a query or running several such queries in a day. I also use functions that call on biomaRt repeatedly, and I apply those to around 100 genes. So I could easily send a thousand queries in a day. I can be more careful, but it would be useful to know what the limits are. And does it matter how many times you call a 'mart<-useMart(...)' command? (Lately, I've been calling it frequently rather than using the one I've already opened, largely through programming laziness.)

By the way, it took me quite some time to track down the error, because I was using getGene, which just gave me the confusing error:

"Error: ncol(result) == length(attributes) is not TRUE"

I think this must be because something like try(...) is used within getBM(), so the output is the error message, which is then passed down the line and at some point causes a problem when the function tries to bundle it into a data.frame, etc.

Thanks,
Elizabeth
biomaRt biomaRt • 1.9k views
@kasper-daniel-hansen-2979
United States
Steffen will know more about this, but it is well known that when you access the Mart servers you should collect all your queries into one big query. So do not do something like

    for (g in genes) getInfo(g)

but instead do something like

    getInfo(genes)

So try to collect everything into a few big queries, instead of doing "thousands of queries" in a day.

You also have the option of downloading the entire database and accessing it locally. That way there is no limit, but it requires some work.

It is not uncommon for these large databases to have some usage limits.

Kasper

On Apr 9, 2008, at 12:09 PM, Elizabeth Purdom wrote:
> [...]
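To make the batching advice concrete, here is a minimal sketch in R. It is only an illustration: `batch_query`, `query_fn` and `fake_lookup` are hypothetical names, not part of biomaRt, and the commented-out `getBM()` call assumes the human Ensembl dataset and its `ensembl_gene_id` / `hgnc_symbol` attributes. The demonstration at the bottom uses a fake lookup so that it runs without touching the server.

```r
# Send a few large queries instead of one query per gene.
# `query_fn` is a hypothetical stand-in for whatever does the real lookup;
# it must accept a vector of IDs and return a data.frame.
batch_query <- function(ids, query_fn, batch_size = 500) {
  ids <- unique(ids)
  if (length(ids) == 0) return(NULL)
  # Group the IDs into consecutive chunks of at most `batch_size`
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- lapply(batches, query_fn)
  do.call(rbind, unname(results))
}

# With biomaRt this might look as follows (assumed dataset and attribute
# names; the mart is created once and reused for every batch):
# library(biomaRt)
# mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# query_fn <- function(ids) {
#   getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
#         filters = "ensembl_gene_id", values = ids, mart = mart)
# }

# Offline demonstration: a fake lookup that counts how often it is called
calls <- 0
fake_lookup <- function(ids) {
  calls <<- calls + 1
  data.frame(id = ids, symbol = toupper(ids), stringsAsFactors = FALSE)
}
res <- batch_query(paste0("gene", 1:1200), fake_lookup, batch_size = 500)
```

In this sketch 1200 IDs result in three calls to the lookup rather than 1200. The batch size of 500 is an arbitrary choice, not a documented server limit.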
Hi Kasper,

That's correct. biomaRt is made to do batch queries, and users should try to avoid using it in loops.

Cheers,
Steffen
I do generally collect everything into one big query (for speed if nothing else). However, I'll have one set of genes, then do another analysis and have another set, and so on, so I wind up querying several times. If it were just that, the overall number of such queries would be minimal.

My problem is probably that I'm using a package (also by Steffen) that calls biomart individually (actually a couple of times per gene), and I sometimes run this on several hundred genes a day. So clearly I need to start being more careful about this.

So my question is really this: in addition to confirming that this is the problem, can anyone tell me what the limits are, or where to find them, so I can watch how many times I run these graph commands?

Thanks,
Elizabeth
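One way to be more careful when the same genes come up again in later analyses is to cache results locally, so that only IDs not seen before are sent to the server. The following is a rough sketch in R under stated assumptions: `make_cached_lookup`, `query_fn` and `fake_lookup` are hypothetical names, and the first column of the returned data.frame is assumed to hold the queried ID. It does not help when a package issues its own biomaRt calls internally.

```r
# Wrap a lookup function so that each ID is only ever queried once per session.
# `query_fn` is a hypothetical stand-in for the real lookup; it must accept a
# vector of IDs and return a data.frame whose first column is the ID.
make_cached_lookup <- function(query_fn) {
  cache <- new.env(parent = emptyenv())
  function(ids) {
    ids <- unique(as.character(ids))
    is_cached <- vapply(ids, exists, logical(1), envir = cache, inherits = FALSE)
    new_ids <- ids[!is_cached]
    if (length(new_ids) > 0) {
      # One batched query for everything not yet in the cache
      res <- query_fn(new_ids)
      for (id in new_ids) {
        assign(id, res[res[[1]] == id, , drop = FALSE], envir = cache)
      }
    }
    do.call(rbind, unname(mget(ids, envir = cache)))
  }
}

# Offline demonstration: a fake lookup that records which IDs it was asked for
queried <- character(0)
fake_lookup <- function(ids) {
  queried <<- c(queried, ids)
  data.frame(id = ids, symbol = toupper(ids), stringsAsFactors = FALSE)
}
lookup <- make_cached_lookup(fake_lookup)
first  <- lookup(c("a", "b"))
second <- lookup(c("b", "c"))  # only "c" is sent to fake_lookup here
```

The cache lives only for the current R session; saving the results to disk would be the next step if the same gene sets are reused across days.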
Steffen ▴ 500
@steffen-2351
Hi Elizabeth,

This is an error I've recently encountered as well. As far as I know, it has nothing to do with the number of queries you performed; it is a more general problem with the BioMart version of Ensembl.

biomaRt is not the only tool querying this web service. Other tools, like Taverna and custom tools using e.g. the BioMart perl API, also query it. It looks like the BioMart system has gained popularity, and recently we frequently hit the maximum number of connections to the MySQL database. Note that biomaRt doesn't query MySQL directly; it is the web service's own MySQL connection that is unable to connect.

I've contacted the Ensembl and BioMart teams and they are investigating this problem. It would help to report this to helpdesk at ensembl.org to put pressure on them, showing that it's not only me but many people who have been unable to use their web service recently. Report this part of the error message:

Too many connections at /ebi/www/biomart/www/biomart-perl-06/lib/BioMart/Configuration/DBLocation.pm line 98

Cheers,
Steffen

--
Steffen Durinck, PhD
Division of Biostatistics, University of California, Berkeley &
Life Sciences Department, Lawrence Berkeley National Laboratory
1 cyclotron Rd, Berkeley CA, 94720, USA
Tel: +1-510-486-5202
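Since the failure described here is on the server side and likely transient, one client-side workaround is to retry after a pause and to check that the result really is a data.frame before passing it on. The sketch below is only an illustration in R: `with_retry`, `query_fn` and `flaky` are hypothetical names, and the waiting times are arbitrary. The real query would be wrapped in a zero-argument function, for example `function() getBM(...)`.

```r
# Retry a query a few times with exponentially growing pauses, and fail with a
# clear message instead of handing a non-data.frame result to later code.
# `query_fn` is a hypothetical zero-argument function that runs the query.
with_retry <- function(query_fn, max_tries = 4, base_wait = 1) {
  msg <- "no attempt made"
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(query_fn(), error = function(e) e)
    if (is.data.frame(result)) return(result)
    msg <- if (inherits(result, "condition")) {
      conditionMessage(result)
    } else {
      "query did not return a data.frame"
    }
    # Wait base_wait, 2 * base_wait, 4 * base_wait, ... seconds between tries
    if (attempt < max_tries) Sys.sleep(base_wait * 2^(attempt - 1))
  }
  stop("query failed after ", max_tries, " attempts: ", msg)
}

# Offline demonstration: a fake query that fails twice, then succeeds
attempts <- 0
flaky <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("Too many connections")
  data.frame(id = "gene1", stringsAsFactors = FALSE)
}
out <- with_retry(flaky, max_tries = 5, base_wait = 0)
```

Checking `is.data.frame()` up front also surfaces the underlying server message directly, rather than a later assertion failure such as the `ncol(result) == length(attributes)` error mentioned in the original post.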
On Wed, Apr 9, 2008 at 3:32 PM, Steffen <sdurinck at lbl.gov> wrote:
> [...]

Just to finish the thought here, it is possible to install the MySQL database locally and use biomaRt locally. I think this will solve the problem for those who need regular, uninterrupted access. Steffen can correct me if I'm wrong on this. Also, installing is not for the faint-of-heart, as the database is pretty big.

Sean