
Question: Error in getBM using biomaRt
Oskar wrote, 19 days ago:

Hi bioconductor support team,

I have been running the same script to get SNP IDs for a genomic range using biomaRt. It returned the information I needed for about 26 ranges, but two of them fail.

Here is the script I've been using.

library(biomaRt)

snp_mart = useMart(biomart = "ENSEMBL_MART_SNP", 
                   dataset = "hsapiens_snp",
                   host = "grch37.ensembl.org")

snp_id9 = getBM(attributes = c("refsnp_id", "allele","chr_name", "chrom_start", "chrom_end", "chrom_strand"), 
                            filters = c("chr_name", "start", "end"), 
                            values = list(9, 106356922, 107356689), 
                            mart = snp_mart)

snp_id8 = getBM(attributes = c("refsnp_id", "allele","chr_name", "chrom_start", "chrom_end", "chrom_strand"), 
                            filters = c("chr_name", "start", "end"), 
                            values = list(8, 143262170, 144261715), 
                            mart = snp_mart)

When I re-run these two queries I get one of two error messages, sometimes the first and sometimes the second. As I mentioned, I used exactly the same code for 26 other ranges and everything went smoothly.

Here is the first one: Error in getBM(attributes = c("refsnp_id", "allele", "chr_name", "chrom_start", : The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1. Please report this on the support site at http://support.bioconductor.org

Here is the second one: Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached: Operation timed out after 600000 milliseconds with 77645 bytes received

I would appreciate any guidance on how to solve this issue. Thank you very much in advance for your help.

Answer: Error in getBM using biomaRt
Mike Smith (EMBL Heidelberg / de.NBI) wrote, 19 days ago:

I'm not sure why you get different error messages, but I think the root cause is that these queries take too long to run and time out. The Ensembl web interface gives you 5 minutes before a query dies, and biomaRt currently allows 10 minutes (the 600000 milliseconds reported in your second error).

Query time is broadly related to the number of attributes you're returning and the number of values provided to the filters. In this case I think it's the size of the regions that is causing the issue. One approach would be to break the query down into smaller regions, submit each one, and then piece the results back together. Here's an example for your snp_id8:

## create a matrix where each row is a 100kb region on chromosome 8
chr <- 8
s1 <- seq(143262170, 144261715, by = 100000)
s2 <- c(s1[-1]-1, 144261715)
regions <- matrix(c(rep(chr, length(s1)), s1, s2), ncol = 3)

## wrapper function to be applied to each row of our matrix  
getBM_values <- function(values) {
  getBM(attributes = c("refsnp_id", "allele","chr_name", "chrom_start", "chrom_end", "chrom_strand"), 
        filters = c("chr_name", "start", "end"), 
        values = list(values[1], values[2], values[3]), 
        mart = snp_mart)
}

## query Ensembl for each region & combine results
res_list <- apply(regions, 1, getBM_values)
snp_id8 <- do.call(rbind, res_list)
> dim(snp_id8 )
[1] 218725      6
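
If you need to do this for many ranges, the same idea can be wrapped into two small helpers. This is only a sketch, not part of biomaRt: `make_regions()`, `query_regions()` and `snp_query()` are hypothetical names made up for illustration, and the 100kb default width is simply the value used above. A data.frame is used instead of a matrix so that a chromosome name such as "X" is not forced into the same type as the coordinates, and a region that errors is skipped with a warning so it can be retried with a smaller width.

```r
## Split [start, end] into consecutive windows of at most `width` bases.
## (Hypothetical helper, not a biomaRt function.)
make_regions <- function(chr, start, end, width = 100000) {
  starts <- seq(start, end, by = width)
  ends   <- pmin(starts + width - 1, end)
  data.frame(chr = as.character(chr), start = starts, end = ends,
             stringsAsFactors = FALSE)
}

## Call query_fun(chr, start, end) for every row of `regions` and
## row-bind the results. A region whose query throws an error is
## reported as a warning and skipped rather than aborting the run.
query_regions <- function(regions, query_fun) {
  res_list <- lapply(seq_len(nrow(regions)), function(i) {
    tryCatch(
      query_fun(regions$chr[i], regions$start[i], regions$end[i]),
      error = function(e) {
        warning("region ", i, " failed: ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })
  do.call(rbind, res_list)  # NULL entries from failed regions are dropped
}

## Query function for the SNP mart. It assumes biomaRt is loaded and that
## `snp_mart` exists as created in the question; nothing is contacted
## until the function is actually called.
snp_query <- function(chr, start, end) {
  getBM(attributes = c("refsnp_id", "allele", "chr_name",
                       "chrom_start", "chrom_end", "chrom_strand"),
        filters = c("chr_name", "start", "end"),
        values = list(chr, start, end),
        mart = snp_mart)
}

## Needs network access, so it is left commented out here:
# snp_id8 <- query_regions(make_regions(8, 143262170, 144261715), snp_query)
```

Because the query function is passed in as an argument, the chunking logic can be checked offline with a dummy function before sending anything to Ensembl.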

Oskar replied, 18 days ago:

Hi Mike, many thanks for your help. It works! I had tried creating blocks, but not as a matrix. Your strategy works perfectly. Thank you very much indeed.