Hello, forum,
I want to map mouse transcript IDs such as ENSMUST00000159265.1 to human transcript IDs (which start with "ENST") using biomaRt, but I failed. Is something wrong with my code, or is the idea unrealistic?
My code is as follows:
library("biomaRt")
human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl", host = "https://dec2021.archive.ensembl.org/")
mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl", host = "https://dec2021.archive.ensembl.org/")
# View(listAttributes(mouse))
# View(listAttributes(human))
getLDS(attributes = c("mgi_trans_name"), filters = "mgi_trans_name",
       values = c("ENSMUST00000159265.1"), mart = mouse,
       attributesL = c("hgnc_trans_name"), martL = human,
       uniqueRows = TRUE)
getLDS(attributes = c("mgi_symbol"), filters = "mgi_symbol",
       values = c("Xkr4"), mart = mouse,
       attributesL = c("hgnc_symbol"), martL = human,
       uniqueRows = TRUE)
getLDS(attributes = c("mgi_id"), filters = "mgi_id",
       values = c("ENSMUST00000159265.1"), mart = mouse,
       attributesL = c("hgnc_id"), martL = human,
       uniqueRows = TRUE)
It works very well, and thanks for such a great answer! Because I use an old GTF from GENCODE (vM12), the ID versions do not match the versions stored in Ensembl. So in the end I used the transcript IDs without the version suffix for the conversion.
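For anyone landing here later, the version-stripping step can be sketched roughly as below. This is only an illustration, not necessarily what the answer above did: `strip_version` and `map_mouse_tx_to_human` are hypothetical helper names, and the query assumes the `ensembl_transcript_id` and `ensembl_gene_id` attributes are available in both marts. Note that Ensembl defines orthologs at the gene level, so the human side lists the transcripts of the orthologous gene rather than a one-to-one transcript match.

```r
# Remove a trailing ".<version>" from Ensembl IDs,
# e.g. "ENSMUST00000159265.1" becomes "ENSMUST00000159265".
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

mouse_ids <- strip_version(c("ENSMUST00000159265.1"))

# Hypothetical helper: look up the human transcripts linked to the given
# unversioned mouse transcript IDs. `mouse` and `human` are Mart objects
# created with useMart() as in the question; calling this needs network access.
map_mouse_tx_to_human <- function(ids, mouse, human) {
  biomaRt::getLDS(
    attributes  = c("ensembl_transcript_id", "ensembl_gene_id"),
    filters     = "ensembl_transcript_id",
    values      = ids,
    mart        = mouse,
    attributesL = c("ensembl_gene_id", "ensembl_transcript_id"),
    martL       = human,
    uniqueRows  = TRUE
  )
}

# Usage (not run here):
# map_mouse_tx_to_human(mouse_ids, mouse, human)
```

IDs that carry no version suffix pass through `strip_version` unchanged, so it is safe to apply to a mixed list.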
Mapping with the version numbers might be difficult. GENCODE M12 is based on Ensembl release 87, which is from 2016. There are archived versions of BioMart that you can query, but they are not that fine-grained, so your choices would be Ensembl 80 or 91. That said, the mapping between species shouldn't depend that much on the version numbers anyway.
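If you do want to try an archive, something along these lines should show what is on offer and connect to one of them. It is a sketch only: `connect_mouse_archive` is a hypothetical wrapper, both calls need network access, and the set of archived releases changes over time, so check the list before picking a version.

```r
# Hypothetical wrapper: connect to the mouse gene dataset of a given
# archived Ensembl release (the release must appear in the archive list).
connect_mouse_archive <- function(version) {
  biomaRt::useEnsembl(
    biomart = "ensembl",
    dataset = "mmusculus_gene_ensembl",
    version = version
  )
}

# Usage (not run here):
# biomaRt::listEnsemblArchives()         # table of archived releases and their URLs
# mouse91 <- connect_mouse_archive(91)   # only works if release 91 is still listed
```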