Question: BioMart does not find mouse homologs of human genes
atakanekiz wrote (9 weeks ago):

Hello BC community,

I am trying to convert a list of human genes to their mouse homologs using R. BioMart finds homologs for some genes but not for others, and I can't figure out why. Example code is below.

library(biomaRt)

# Connect to the human and mouse Ensembl gene datasets
human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")

# Map human gene symbols to mouse gene symbols through the linked datasets
getLDS(attributes = c("external_gene_name"),
       filters = "external_gene_name", values = c("TP53", "MIR155HG", "STAT1", "PDCD1"), mart = human,
       attributesL = c("external_gene_name"), martL = mouse)

#  Gene.name Gene.name.1
#1      TP53       Trp53
#2     PDCD1       Pdcd1
#3     STAT1       Stat1
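
To see exactly which symbols dropped out of a larger list, the input vector can be compared against the first column of the result. This is only a minimal sketch: the result table is re-typed by hand from the output above so that it runs without querying BioMart, and the object names `query`, `res` and `unmapped` are illustrative.

```r
# Input symbols, as passed to getLDS()
query <- c("TP53", "MIR155HG", "STAT1", "PDCD1")

# Result re-typed from the printed output; in practice this would be
# the data frame returned by getLDS()
res <- data.frame(
  Gene.name   = c("TP53", "PDCD1", "STAT1"),
  Gene.name.1 = c("Trp53", "Pdcd1", "Stat1"),
  stringsAsFactors = FALSE
)

# Symbols that were queried but have no row in the result
unmapped <- setdiff(query, res$Gene.name)
print(unmapped)
```

With the real `getLDS()` return value in place of `res`, the same `setdiff()` call lists every gene that needs a manual look.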

I tried the query with other filters such as ensembl_gene_id and hgnc_symbol, and the results were the same. I know that MIR155HG should be conserved between human and mouse. This is confirmed by the Mir155hg Ensembl page. I have several other genes like this that don't get mapped to the mouse genome for some reason.

What am I missing here?

modified 9 weeks ago by James W. MacDonald • written 9 weeks ago by atakanekiz

Where on the Ensembl page does it show that MIR155HG is conserved between human and mouse? When I look at the orthologues page it suggests that there are none in any of the 27 primate species - perhaps I'm reading that wrong.

written 9 weeks ago by Mike Smith

We work on this gene in both human and mouse models. Below is the mouse entry in ensembl:

http://uswest.ensembl.org/Mus_musculus/Gene/Summary?g=ENSMUSG00000097418;r=16:84703167-84715245

written 9 weeks ago by atakanekiz
Answer: BioMart does not find mouse homologs of human genes
James W. MacDonald (United States) wrote (9 weeks ago):

The biomaRt package is simply a way to programmatically access the Biomart server and get the results back into R. As such, anything like this is really a Biomart issue, not biomaRt (or Bioconductor, really). Anyway, there is a FAQ

written 9 weeks ago by James W. MacDonald

Thanks for the answer. I thought something might be going wrong in biomaRt rather than in the actual database. I'll keep digging to see what's up.

written 9 weeks ago by atakanekiz

If you look at the link you show above and click on orthologs, it says there aren't any human orthologs for that miRNA! The only orthologs are in three other mouse species. This is what Mike Smith pointed out. And if you look at the page for MIR155HG, the orthologs link isn't available, which I imagine means there are none.

This is a pretty strong indication (to me, anyway) that Ensembl doesn't think MIR155HG and Mir155hg are orthologs. NCBI has other views on that subject, however.

written 9 weeks ago by James W. MacDonald

Wow, interesting! You are right, it looks like Ensembl and NCBI don't agree on this. I will side with NCBI in this case. Do you have a recommendation on how to find orthologs in an all-inclusive manner? I thought Ensembl was the most comprehensive one, but I may be wrong based on this experience.

written 9 weeks ago by atakanekiz

There's no such thing. NCBI probably has a pretty good rationale for why they think MIR155HG and Mir155hg are orthologs, and I would bet EBI/EMBL has a good rationale for why they think they aren't. I would also bet that their rationales hinge on pretty subtle, sophisticated points where reasonable people could see both sides, so that in the end you just have to make a decision as to what you, as a group, are going to do.

This scenario most assuredly propagates through to hundreds if not thousands of genes where the two groups have landed on different sides of the argument, giving rise to many differences in what NCBI and EBI/EMBL consider to be orthologs.

The fact that the two groups don't agree on everything doesn't mean one is right and the other is wrong! They just disagree, based on the given evidence and whatever rules they have instituted to help make decisions when the answer isn't obvious.

Because of that, there isn't a way to find orthologs in an all-inclusive manner. If a gene is 80% identical between two species, are the two copies orthologs? What about 75%? If one group says > 80% means yes, and the other says > 75% means yes, then you have disagreements, but it's because they are using different cutoffs, and nobody can say for sure which one is right.
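
As a toy illustration of the cutoff point, here is a small sketch in which two calling rules that differ only in their threshold disagree on one gene pair. Everything in it is hypothetical: the pair names and percent identities are made up, and real orthology pipelines rely on much more than a single identity cutoff.

```r
# Hypothetical gene pairs with made-up percent identities (not real data)
pairs <- data.frame(
  pair     = c("geneA", "geneB", "geneC"),
  identity = c(92, 78, 70),
  stringsAsFactors = FALSE
)

# Two hypothetical calling rules that differ only in the cutoff
call_strict  <- pairs$identity > 80
call_lenient <- pairs$identity > 75

# Pairs on which the two rules disagree
disagree <- pairs$pair[call_strict != call_lenient]
print(disagree)
```

Neither rule is wrong here. The pair that falls between the two cutoffs is simply called differently, which is the kind of difference that shows up when comparing ortholog tables from two resources.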

written 8 weeks ago by James W. MacDonald