Why does biomaRt work for human but not mouse genes for HGNC symbol conversion?
ap3637 • 0

I am trying to use biomaRt for ID mapping between human and mouse. I have two sets of genes I'm interested in, and I want to bring the IDs into a common space by mapping both sets of gene IDs to their corresponding HGNC symbols.

> head(human_de_geneIDs)
[1] "ENSG00000175518" "ENSG00000237075" "ENSG00000143545" "ENSG00000182118" "ENSG00000164035" "ENSG00000125355"
> head(mouse_de_geneIDs)
[1] "ENSMUSG00000017830" "ENSMUSG00000069792" "ENSMUSG00000111967" "ENSMUSG00000029371" "ENSMUSG00000018927" "ENSMUSG00000107653"

 

# Set up my marts for human and mouse.
library(biomaRt)

mart1 = useMart("ensembl", dataset="hsapiens_gene_ensembl")

mart2 = useMart("ensembl", dataset="mmusculus_gene_ensembl")

 

# human / mouse id ortholog map.

orthoMap_human <- getLDS(attributes=c("ensembl_gene_id", "hgnc_symbol"), filters="ensembl_gene_id", values=human_de_geneIDs, mart=mart1, attributesL=c("ensembl_gene_id"), martL=mart2, valuesL=mouse_de_geneIDs, uniqueRows=FALSE)

orthoMap_mouse <- getLDS(attributes=c("ensembl_gene_id", "hgnc_symbol"), filters="ensembl_gene_id", values=mouse_de_geneIDs, mart=mart2, attributesL=c("ensembl_gene_id"), martL=mart1, valuesL=human_de_geneIDs, uniqueRows=FALSE)

 

The human orthoMap looks like what I would expect: a human gene ID, an HGNC symbol, and the Ensembl ID of the mouse ortholog.

> head(orthoMap_human)
   Gene.stable.ID HGNC.symbol   Gene.stable.ID.1
1 ENSG00000105976         MET ENSMUSG00000009376
2 ENSG00000142611      PRDM16 ENSMUSG00000039410
3 ENSG00000164283        ESM1 ENSMUSG00000042379
4 ENSG00000200795      RNU4-1 ENSMUSG00000096243
5 ENSG00000197506     SLC28A3 ENSMUSG00000021553
6 ENSG00000088899       LZTS3 ENSMUSG00000037703

 

The mouse orthoMap, however, gives only NA for HGNC symbol.

> head(orthoMap_mouse)

      Gene.stable.ID HGNC.symbol Gene.stable.ID.1
1 ENSMUSG00000091650          NA  ENSG00000128335
2 ENSMUSG00000050982          NA  ENSG00000128335
3 ENSMUSG00000063779          NA  ENSG00000134216
4 ENSMUSG00000056529          NA  ENSG00000169403
5 ENSMUSG00000040809          NA  ENSG00000134216
6 ENSMUSG00000040253          NA  ENSG00000162654

 

I used exactly the same code for both; I only swapped mart1 and mart2 (and the two ID vectors) in the getLDS call. Why does it work in one direction and not the other? Also, I'm pretty sure it worked fine the first time I tried this. Has anyone encountered this issue before, or is there something I've overlooked in the code that would cause this result?

swbarnes2 ★ 1.4k

I'm going to guess your problem is this:

https://www.genenames.org/

"HGNC is responsible for approving unique symbols and names for human loci, including protein coding genes, ncRNA genes and pseudogenes, to allow unambiguous scientific communication."

 

When I search human Ensembl gene IDs in the search bar on that site, I get gene names. When I search mouse Ensembl IDs, I don't.

So "hgnc_symbol" might not return gene names for mouse Ensembl gene IDs.
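
One quick way to check that guess against the whole result, and not only the six rows printed by head(), is to measure how much of the symbol column is missing. The snippet below is a minimal base-R sketch: frac_missing is a hypothetical helper (not part of biomaRt), and the small data frame just re-types three of the rows shown in the question so the example runs without a network connection. Missing symbols may come back as NA or as empty strings, so both are counted.

```r
# Hypothetical helper: fraction of entries that are NA or empty strings.
frac_missing <- function(x) {
  mean(is.na(x) | x == "")
}

# Three rows copied from the orthoMap_mouse output in the question.
orthoMap_mouse_head <- data.frame(
  Gene.stable.ID   = c("ENSMUSG00000091650", "ENSMUSG00000050982", "ENSMUSG00000063779"),
  HGNC.symbol      = NA,
  Gene.stable.ID.1 = c("ENSG00000128335", "ENSG00000128335", "ENSG00000134216"),
  stringsAsFactors = FALSE
)

# A value of 1 means every symbol in the column is missing.
frac_missing(orthoMap_mouse_head$HGNC.symbol)
```

On the real result you would call frac_missing(orthoMap_mouse$HGNC.symbol). If that also returns 1, the attribute is not populated for mouse genes and no change to the filters will help.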

Exactly. MGI is in charge of mouse symbols, so request "mgi_symbol" instead:

> orthoMap_mouse <- getLDS(attributes=c("ensembl_gene_id", "mgi_symbol"), filters="ensembl_gene_id", values=mouse_de_geneIDs, mart=mart2, attributesL=c("ensembl_gene_id"), martL=mart1, valuesL=human_de_geneIDs, uniqueRows=FALSE)
> orthoMap_mouse
      Gene.stable.ID MGI.symbol Gene.stable.ID.1
1 ENSMUSG00000029371      Cxcl5  ENSG00000163735
2 ENSMUSG00000017830      Dhx58  ENSG00000108771
3 ENSMUSG00000018927       Ccl6  ENSG00000275718
4 ENSMUSG00000018927       Ccl6  ENSG00000275688
5 ENSMUSG00000029371      Cxcl5  ENSG00000124875
6 ENSMUSG00000018927       Ccl6  ENSG00000274736
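
Since the original goal was to put both gene sets into HGNC symbol space, one possible follow-up is to ask the linked human dataset for "hgnc_symbol" through attributesL, so that each mouse gene comes back with the symbol of its human ortholog. This is only a sketch under a few assumptions: mouse_to_hgnc and keep_with_symbol are hypothetical helper names, the query needs the biomaRt package and a connection to Ensembl, and the column name "HGNC.symbol" is assumed from the output shown earlier in the thread and may differ between Ensembl releases.

```r
# Hypothetical wrapper: query mouse genes and pull the HGNC symbol
# from the linked human dataset instead of from the mouse dataset.
mouse_to_hgnc <- function(mouse_ids, mouse_mart, human_mart) {
  biomaRt::getLDS(
    attributes  = c("ensembl_gene_id", "mgi_symbol"),
    filters     = "ensembl_gene_id",
    values      = mouse_ids,
    mart        = mouse_mart,
    attributesL = c("ensembl_gene_id", "hgnc_symbol"),
    martL       = human_mart,
    uniqueRows  = TRUE
  )
}

# Hypothetical helper: drop rows whose symbol column is NA or empty.
keep_with_symbol <- function(df, col = "HGNC.symbol") {
  has_symbol <- !is.na(df[[col]]) & df[[col]] != ""
  df[has_symbol, , drop = FALSE]
}

# Usage with the objects from the question (requires network access):
# orthoMap_mouse <- mouse_to_hgnc(mouse_de_geneIDs, mart2, mart1)
# orthoMap_mouse <- keep_with_symbol(orthoMap_mouse)
```

A mouse gene can have several human orthologs (Ccl6 above maps to three human IDs), so expect one-to-many rows and decide how to resolve them before comparing the two gene sets.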

Thanks very much, you're right on the money!
