Error in curl; timeout when using biomaRt
@analeonpalacio-18199
Last seen 9 weeks ago
Spain

I am trying to get the variants affecting a specific gene, but I get the following error when executing getBM:

Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached: [www.ensembl.org:443] Operation timed out after 300004 milliseconds with 0 bytes received

This is the code I ran. I tried different positions and chromosomes, but I always get the same error. Filtering by gene identifier also fails.

ensembl <- biomaRt::useEnsembl("snp", dataset = "hsapiens_snp")
data = biomaRt::getBM(mart = ensembl, attributes = c('refsnp_id'), filters = c('chr_name','start','end'), values = list("X", "101345661", "101348742"))
Mike Smith ★ 5.2k
@mike-smith
Last seen 12 hours ago
EMBL Heidelberg / de.NBI

I see the same problem. Normally, the error Operation timed out after 300004 milliseconds with 0 bytes received indicates that you're asking for too much information. Essentially, the Ensembl server is still computing the result when the 5 minute limit for your query runs out. You could try reducing the size of the region and submitting several smaller queries that together span everything you want.
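
As a rough illustration of the "several smaller queries" idea, here is a minimal sketch. The helper names splitRegion and queryChunks and the 1000 bp window width are my own assumptions for illustration, not part of biomaRt, and since the server is timing out for me too I can't promise the smaller queries will complete. Positions are formatted with sprintf("%d", ...) so that round values are never sent in scientific notation.

```r
# Hypothetical helper: split [start, end] into consecutive windows
# of at most `width` bases (the width of 1000 is an arbitrary choice).
splitRegion <- function(start, end, width = 1000L) {
    start <- as.integer(start)
    end   <- as.integer(end)
    width <- as.integer(width)
    starts <- as.integer(seq(from = start, to = end, by = width))
    ends   <- pmin(starts + width - 1L, end)
    data.frame(start = starts, end = ends)
}

# Hypothetical helper: run one getBM() query per window and combine the results.
# `mart` is the object returned by biomaRt::useEnsembl().
queryChunks <- function(mart, chrom, chunks) {
    results <- lapply(seq_len(nrow(chunks)), function(i) {
        biomaRt::getBM(mart = mart,
                       attributes = "refsnp_id",
                       filters = c("chr_name", "start", "end"),
                       values = list(chrom,
                                     sprintf("%d", chunks$start[i]),
                                     sprintf("%d", chunks$end[i])))
    })
    # A variant overlapping a window boundary may be returned twice, so deduplicate
    unique(do.call(rbind, results))
}

chunks <- splitRegion(101345661, 101348742, width = 1000)
```

You would then call queryChunks(ensembl, "X", chunks) with the mart object from the question. Shrinking the width gives smaller individual queries at the cost of more requests.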

However, I'd suggest using the Ensembl REST API, which seems to work much better for me. There's no specific package for this, but here's an example function that might be a good place to start.

library(httr)
library(jsonlite)

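getVariantsInRegion <- function(chrom, start, end) {

    # Build the query URL, e.g. .../overlap/region/human/X:101345661-101348742?feature=variation
    region <- sprintf("%s:%s-%s", chrom, start, end)
    server <- "https://rest.ensembl.org/overlap/region/human/"
    extension <- "?feature=variation"
    full_url <- paste0(server, region, extension)

    # Request JSON from the server and stop with an error on any HTTP failure
    result <- GET(full_url, content_type("application/json"))
    stop_for_status(result)

    # Round-trip the parsed response through JSON to get a data.frame
    fromJSON(toJSON(content(result)))
}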

vars <- getVariantsInRegion(chrom = "X", start = "101345661", end = "101348742")

head(vars)
#>         end assembly_name source seq_region_name           id   alleles
#> 1 101345667        GRCh38  dbSNP               X rs1242761872      C, A
#> 2 101345673        GRCh38  dbSNP               X  rs782301849 TTATAT, T
#> 3 101345669        GRCh38  dbSNP               X rs1602994716      T, A
#> 4 101345671        GRCh38  dbSNP               X  rs375049891      T, C
#> 5 101345672        GRCh38  dbSNP               X rs1602994729      A, C
#> 6 101345686        GRCh38  dbSNP               X rs1396336725      A, T
#>   feature_type    consequence_type clinical_significance     start strand
#> 1    variation 3_prime_UTR_variant                  NULL 101345667      1
#> 2    variation 3_prime_UTR_variant                  NULL 101345668      1
#> 3    variation 3_prime_UTR_variant                  NULL 101345669      1
#> 4    variation 3_prime_UTR_variant                  NULL 101345671      1
#> 5    variation 3_prime_UTR_variant                  NULL 101345672      1
#> 6    variation 3_prime_UTR_variant                  NULL 101345686      1