I was using the biomaRt package just before I updated R and it was working fine. After the update I get the error shown below, so I changed my code to what I found here https://support.bioconductor.org/p/50102/ and then got a different error.
Here is my original code, which worked on the old version of R:
> mart <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
> results <- getBM(attributes = c("chromosome_name","external_gene_name","ensembl_gene_id"),
> filters = "ensembl_gene_id", values = signif_sb_genes_muscle$ensembl_gene_id,
> mart = mart)
This is the error I receive with this old code:
> Error: failed to load external entity "http://www.ensembl.org/info/website/archives/index.html?redirect=no"
Here is the new edited code:
> mart <- useMart("ENSEMBL_MART_ENSEMBL","hsapiens_gene_ensembl",
> host="www.ensembl.org")
> mart@host="http://may2012.archive.ensembl.org:80/biomart/martservice"
> mart = useDataset("hsapiens_gene_ensembl",mart=mart)
>
> results <- getBM(attributes =c("chromosome_name","external_gene_id","ensembl_gene_id"),
> filters = "ensembl_gene_id", values = signif_sb_genes_muscle$ensembl_gene_id,
> mart = mart)
But it is now giving this error:
> Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery
> = fullXmlQuery, : The query to the BioMart webservice returned an invalid result. The number of columns in the result table does not
> equal the number of attributes in the query. Please report this on the
> support site at http://support.bioconductor.org
I am reporting the error as instructed, and I'd also like to retrieve the gene names as my original code did. Thank you :)
Here is my session info:
> sessionInfo()
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 18.04.5 LTS
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
>
> locale:
>  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8
>  [4] LC_COLLATE=en_AU.UTF-8     LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
>  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] biomaRt_2.44.1
>
> loaded via a namespace (and not attached):
>  [1] Rcpp_1.0.5           compiler_4.0.2       pillar_1.4.6         dbplyr_1.4.4
>  [5] viridis_0.5.1        prettyunits_1.1.1    tools_4.0.2          progress_1.2.2
>  [9] digest_0.6.25        bit_4.0.4            viridisLite_0.3.0    gtable_0.3.0
> [13] RSQLite_2.2.1        memoise_1.1.0        BiocFileCache_1.12.1 tibble_3.0.3
> [17] lifecycle_0.2.0      pkgconfig_2.0.3      rlang_0.4.8          DBI_1.1.0
> [21] rstudioapi_0.11      curl_4.3             parallel_4.0.2       gridExtra_2.3
> [25] stringr_1.4.0        httr_1.4.2           dplyr_1.0.2          rappdirs_0.3.1
> [29] generics_0.0.2       S4Vectors_0.26.1     vctrs_0.3.4          askpass_1.1
> [33] IRanges_2.22.2       hms_0.5.3            grid_4.0.2           tidyselect_1.1.0
> [37] stats4_4.0.2         bit64_4.0.5          glue_1.4.2           Biobase_2.48.0
> [41] R6_2.4.1             AnnotationDbi_1.50.3 XML_3.99-0.5         ggplot2_3.3.2
> [45] purrr_0.3.4          blob_1.2.1           magrittr_1.5         scales_1.1.1
> [49] ellipsis_0.3.1       BiocGenerics_0.34.0  assertthat_0.2.1     colorspace_1.4-1
> [53] stringi_1.5.3        munsell_0.5.0        openssl_1.4.3        crayon_1.3.4
I am getting the same errors on R 3.6.2
Using the GRCh38 build of the human genome:
getBM returns:
while trying to pull the GRCh37 (i.e. hg19) build information (my standard pipeline):
I'm getting:
Regarding the
This looks like Ensembl may have stopped allowing access to their site via http. Can you try installing the development version of biomaRt from GitHub? You can do that via:
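A minimal sketch of that installation step. The repository name grimbough/biomaRt is an assumption (it is not stated in this thread), as is the reliance on BiocManager passing names that contain a slash on to the remotes package:

```r
# Assumption: "grimbough/biomaRt" is the GitHub repository holding the
# development version; adjust it if the package lives elsewhere.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")  # used by BiocManager for GitHub installs
}
BiocManager::install("grimbough/biomaRt")
```

Restart R afterwards so that the newly installed version is the one that gets loaded.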
I'll take a look at why you get a different error for the GRCh37 archive. Normally the 'mismatch columns' error indicates that Ensembl is returning something completely different like an error page, rather than a table of results.
I'm getting the same error on R 3.6.2. After installing the development version, I get
I would recommend creating the GRCh37 mart with:
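A minimal sketch of such a call, assuming the GRCh argument of useEnsembl() that comes up further down the thread and the human gene dataset from the question. The wrapper name grch37_mart is hypothetical and only there so that nothing contacts Ensembl until it is called:

```r
# Sketch only: grch37_mart is a hypothetical helper name.
grch37_mart <- function(dataset = "hsapiens_gene_ensembl") {
  # GRCh = 37 asks useEnsembl() for the GRCh37 (hg19) archive rather than
  # the current GRCh38 site.
  biomaRt::useEnsembl(biomart = "ensembl", dataset = dataset, GRCh = 37)
}

# Usage (needs biomaRt installed and network access):
# mart <- grch37_mart()
```

The resulting mart object can then be passed to getBM() through its mart argument, as in the question.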
You can probably modify the host argument to use https if you want to keep the existing code (see my answer below for more details), but this should deal with a few of the issues for you.
Thanks for your help Mike!
Unfortunately it seems the useMart function isn't recognizing the GRCh="37" argument. When I set the mart with the code as you suggest, I get:
However, specifying the https host seems to do the trick:
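A minimal sketch of the https variant with useMart(). The host grch37.ensembl.org is an assumption, since the thread does not name the host that was used, and https_host() and grch37_mart_https() are hypothetical helper names:

```r
# Force an https:// scheme on a BioMart host string.
https_host <- function(host) {
  host <- sub("^http://", "https://", host)   # upgrade plain http
  if (!grepl("^https://", host)) {
    host <- paste0("https://", host)          # bare host name: add the scheme
  }
  host
}

# Assumption: grch37.ensembl.org serves the GRCh37 (hg19) BioMart.
# Nothing is sent to Ensembl until this function is called.
grch37_mart_https <- function(host = "grch37.ensembl.org") {
  biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                   dataset = "hsapiens_gene_ensembl",
                   host = https_host(host))
}

# Usage (needs biomaRt installed and network access):
# mart <- grch37_mart_https()
```

Keeping the scheme handling in its own small function means existing useMart() calls only need their host argument changed.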
update: now on R 4.0.2 and biomaRt 2.45.2:
The function is useEnsembl() rather than useMart(). Internally they are very similar, and as you've seen useMart() will do the job, but useMart() has to be very generic as it was developed when there were many different BioMart servers, with lots of different configurations.
On the other hand useEnsembl() has some arguments and defaults that are specific to accessing Ensembl that might make your code a little neater - but if it's working then no need to change things.
Awesome! Thanks again Mike!! (-1 for my attention to detail...)
Hello, I am able to use useEnsembl to get the data.
When using getBM, I was getting an error
I am so confused and am grateful for any help.