biomaRt not returning correct lists
@nkuperwasser-22871
Last seen 3.7 years ago

Hello, I am new to using biomaRt for converting Ensembl gene IDs into gene symbols. I have looked at many questions and answers, but I am getting a strange result. When I query my gene IDs individually, getBM() works perfectly. However, when I send a large vector of values, it appears to return the wrong IDs (I've put some results below):

ensmart <- useMart("ENSEMBL_MART_ENSEMBL")
ensmart <- useDataset("mmusculus_gene_ensembl", ensmart)

This is the head of the list I am sending:

> head(sigres$Genid)
[1] "ENSMUSG00000051951" "ENSMUSG00000103377" "ENSMUSG00000104017" "ENSMUSG00000103161" "ENSMUSG00000102331"
[6] "ENSMUSG00000025902"

and the getBM command:

> head(getBM(attributes=c("external_gene_name","ensembl_gene_id"),
+            filters="ensembl_gene_id",values=sigres$Genid,mart=ensmart))

returns this:

  external_gene_name    ensembl_gene_id                                                                             
1             Gpr107 ENSMUSG00000000194
2               Rem1 ENSMUSG00000000359
3              Pxmp4 ENSMUSG00000000876
4              Rrp15 ENSMUSG00000001305
5              Ube2c ENSMUSG00000001403
6              Aif1l ENSMUSG00000001864

However, when using only one entry:

> head(getBM(attributes=c("external_gene_name","ensembl_gene_id"),
+            filters="ensembl_gene_id",values="ENSMUSG00000051951",mart=ensmart))
  external_gene_name    ensembl_gene_id
1               Xkr4 ENSMUSG00000051951

it returns the correct entry....

Is there something that is happening when I send a vector versus individually? I have over 6000 entries, and doing these individually is not feasible....

Thank you in advance!! sorry...forgot to add:

R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.2

 [1] biomaRt_2.42.0              DESeq2_1.26.0               BiocManager_1.30.10         SummarizedExperiment_1.16.1
 [5] DelayedArray_0.12.2         BiocParallel_1.20.1         matrixStats_0.55.0          Biobase_2.46.0             
 [9] GenomicRanges_1.38.0        GenomeInfoDb_1.22.0         IRanges_2.20.2              S4Vectors_0.24.3           
[13] BiocGenerics_0.32.0 
biomart annotation • 628 views

So you are sure that ENSMUSG00000000194 isn't in your full list?

> testnames <- sigres2$Genid[1:10]
> testnames
 [1] "ENSMUSG00000051951" "ENSMUSG00000103377" "ENSMUSG00000104017" "ENSMUSG00000103161" "ENSMUSG00000102331"
 [6] "ENSMUSG00000025902" "ENSMUSG00000002459" "ENSMUSG00000033793" "ENSMUSG00000090031" "ENSMUSG00000051285"
> getBM(attributes=c("external_gene_name","ensembl_gene_id"),
+       filters="ensembl_gene_id",values=testnames,mart=ensmart)
   external_gene_name    ensembl_gene_id
1               Rgs20 ENSMUSG00000002459
2               Sox17 ENSMUSG00000025902
3             Atp6v1h ENSMUSG00000033793
4              Pcmtd1 ENSMUSG00000051285
5                Xkr4 ENSMUSG00000051951
6       4732440D04Rik ENSMUSG00000090031
7             Gm19938 ENSMUSG00000102331
8             Gm38148 ENSMUSG00000103161
9             Gm37180 ENSMUSG00000103377
10            Gm37363 ENSMUSG00000104017

Ah, they are indeed there, just not in the same order that I sent them. That's why they appeared to be different, since I was only showing the first few rows. When I do a small test, they are indeed all there. Thanks!! Is there a way to return them in the same order sent? Not crucial, as I can always merge at the end.

NK

Mike Smith ★ 6.5k
@mike-smith
Last seen 2 hours ago
EMBL Heidelberg

Unfortunately this is a 'feature' of the BioMart service, so there's nothing the biomaRt package can do to return results in a specific order. The best option is to do as you have done in your example and make sure you include the filter fields in the attributes you return, and then use match() or something similar to order the results in the same way as your original query set.
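As a rough sketch of the match() approach, here is one way the reordering might look. To keep it runnable without a network connection, the data frame `res` is a hand-built stand-in for a getBM() result using six of the rows shown earlier in this thread; the names `query_ids`, `res`, `idx` and `ordered` are just illustrative.

```r
# Query IDs in the order they were submitted (the first six from the thread).
query_ids <- c("ENSMUSG00000051951", "ENSMUSG00000103377", "ENSMUSG00000104017",
               "ENSMUSG00000103161", "ENSMUSG00000102331", "ENSMUSG00000025902")

# Assumption: stand-in for what getBM() returned, in BioMart's own order.
res <- data.frame(
  external_gene_name = c("Sox17", "Xkr4", "Gm19938", "Gm38148", "Gm37180", "Gm37363"),
  ensembl_gene_id    = c("ENSMUSG00000025902", "ENSMUSG00000051951", "ENSMUSG00000102331",
                         "ENSMUSG00000103161", "ENSMUSG00000103377", "ENSMUSG00000104017"),
  stringsAsFactors = FALSE
)

# For each query ID, find the row of res that holds it (NA if it is absent).
idx <- match(query_ids, res$ensembl_gene_id)

# Reorder the results to follow the query order and reset the row names.
ordered <- res[idx, ]
rownames(ordered) <- NULL

# Keep the submitted ID even for queries that came back with no result.
ordered$ensembl_gene_id <- query_ids
```

Note that match() only picks up the first matching row for each ID, so if a query can return several rows per ID (for example when requesting transcript-level attributes), merge() on the ID column may be a better fit.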
