Thanks, Steffen.
That was exactly what I did. I was doing 10000 at a time, just to be
safe.
...Tao
----- Original Message ----
From: "steffen@stat.Berkeley.EDU" <steffen@stat.berkeley.edu>
To: "Shi, Tao" <shidaxia at yahoo.com>
Cc: bioconductor at stat.math.ethz.ch
Sent: Friday, August 1, 2008 3:09:40 PM
Subject: Re: [BioC] biomaRt:getBM error when query is large
Hi Tao,
I haven't hit a limit yet, but you might have; 430,000 IDs is quite
large.
Try splitting your query into a few batches of, e.g., 100,000 or 50,000
IDs (you should not need to go below this size).
I would also put
Sys.sleep(1)
between queries so you don't get into trouble for sending a query to
the server too soon after an earlier one.
I bet:
# Batches must not overlap, so each one starts one past the previous end.
tmp1 <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
                "chr_name", "chrom_start", "chrom_strand"),
              filters = "refsnp", values = rs[1:100000], mart = mart)
Sys.sleep(1)
tmp2 <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
                "chr_name", "chrom_start", "chrom_strand"),
              filters = "refsnp", values = rs[100001:200000], mart = mart)
Sys.sleep(1)
tmp3 <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
                "chr_name", "chrom_start", "chrom_strand"),
              filters = "refsnp", values = rs[200001:300000], mart = mart)
Sys.sleep(1)
# The last batch runs to the end of rs, so no NA values are sent.
tmp4 <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
                "chr_name", "chrom_start", "chrom_strand"),
              filters = "refsnp", values = rs[300001:length(rs)], mart = mart)
all <- rbind(tmp1, tmp2, tmp3, tmp4)
Should do it.
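
The same batching can also be written as a loop, which avoids computing the chunk boundaries by hand. This is only a minimal sketch: split_into_batches and query_in_batches are hypothetical helper names (they are not part of biomaRt), and the query function is passed in as an argument so that the getBM call above can be plugged in unchanged.

```r
# Hypothetical helper: split a vector of IDs into consecutive,
# non-overlapping batches of at most `batch_size` elements.
split_into_batches <- function(ids, batch_size) {
  split(ids, ceiling(seq_along(ids) / batch_size))
}

# Hypothetical helper: run `query_fun` on each batch, pause `pause`
# seconds between requests, and row-bind the per-batch results.
query_in_batches <- function(ids, batch_size, query_fun, pause = 1) {
  batches <- split_into_batches(ids, batch_size)
  results <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    results[[i]] <- query_fun(batches[[i]])
    # No need to wait after the final batch.
    if (i < length(batches)) Sys.sleep(pause)
  }
  do.call(rbind, results)
}

# Usage with biomaRt (commented out: it needs network access and assumes
# the `rs` vector and `mart` object from this thread already exist):
# all_snps <- query_in_batches(rs, 50000, function(ids) {
#   getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
#           "chr_name", "chrom_start", "chrom_strand"),
#         filters = "refsnp", values = ids, mart = mart)
# })
```

Passing the query as a function keeps the batching logic independent of the server, so it can be tried first with a small stub before sending real requests.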
Cheers,
Steffen
> Hi list,
>
> See the sample code below, where "rs" is a character vector containing
> ~430000 rs IDs. The full query fails, but when I ran the query 10000
> at a time, it worked. Is there a query limit for biomaRt?
>
> Thanks,
>
> ...Tao
>
>
>
> > tmp <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
> + "chr_name", "chrom_start", "chrom_strand"),
> + filters = "refsnp", values = rs, mart = mart)
> Error in postForm(paste(martHost(mart), "?", sep = ""), query = xmlQuery) :
>   Empty reply from server
>
> > sessionInfo()
> R version 2.7.0 (2008-04-22)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] biomaRt_1.14.0 RCurl_0.9-3 GO.db_2.2.0 AnnotationDbi_1.2.2 RSQLite_0.6-9 DBI_0.2-4 Biobase_2.0.1
>
> loaded via a namespace (and not attached):
> [1] XML_1.95-2