I have a character vector of approximately 25,000 genomic regions for
which I want to retrieve additional genomic information using the
biomaRt R package:
library("biomaRt")
mart = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
regions = c(
  "1:661517:668171",
  "1:787463:794591"
  ...)
attribs = c("chromosome_name",
            "start_position",
            "end_position",
            "strand",
            "ensembl_gene_id",
            "hgnc_symbol")
result = getBM(filters="chromosomal_region", values=regions,
               attributes=attribs, mart=mart)
The above query seems to run forever (more than 3 days), but when I
split the regions vector into two halves, each half finishes after
approximately 15 minutes.
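The halving workaround can be generalized to any number of smaller batches. Below is a minimal sketch of that idea, not a confirmed fix for the slow query. The helper names split_into_chunks and query_in_chunks are hypothetical (they are not part of biomaRt), the chunk size of 5000 in the usage comment is an arbitrary guess, and the final de-duplication assumes that the same gene row may come back from more than one batch.

```r
# Split a vector into consecutive chunks of at most `chunk_size` elements.
# (Hypothetical helper, not part of biomaRt.)
split_into_chunks <- function(x, chunk_size) {
  stopifnot(chunk_size >= 1)
  if (length(x) == 0) return(list())
  split(x, ceiling(seq_along(x) / chunk_size))
}

# Run `query_fun` on each chunk and row-bind the resulting data frames.
# `query_fun` must accept a character vector of regions and return a
# data frame with the same columns for every chunk.
query_in_chunks <- function(regions, chunk_size, query_fun) {
  chunks <- split_into_chunks(regions, chunk_size)
  results <- lapply(unname(chunks), query_fun)
  combined <- do.call(rbind, results)
  # Assumption: a gene overlapping regions in different chunks may be
  # returned more than once, so drop exact duplicate rows.
  combined <- unique(combined)
  rownames(combined) <- NULL
  combined
}

# Usage with the objects defined above (needs network access;
# the chunk size of 5000 is an arbitrary choice):
# result <- query_in_chunks(regions, 5000, function(chunk) {
#   getBM(filters = "chromosomal_region", values = chunk,
#         attributes = attribs, mart = mart)
# })
```

Passing the query as a function keeps the batching logic separate from the biomaRt call, so the batching can be checked with a dummy function before any real query is sent.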
Am I doing something wrong, or is there an upper limit on the number
of regions in a biomaRt query?
Thanks in advance,
Christian
> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-pc-linux-gnu
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.4.0 IRanges_1.6.8 affy_1.26.1 Biobase_2.8.0
loaded via a namespace (and not attached):
[1] RCurl_1.4-2 XML_3.1-0 affyio_1.16.0
[4] preprocessCore_1.10.0