Standardizing gene symbols
Nithisha ▴ 10
@nithisha-14272
Last seen 6.8 years ago

Hello all,

I have one list of gene symbols obtained by querying ChEMBL, and a separate list obtained by querying GEO datasets and running limma on them.

When I try to compare the two lists, I cannot simply match gene symbols, because the same gene can appear under different synonyms. For example, for microtubule associated protein tau I have the gene symbols "TAU", "MTBT1" and "MAPTL". Is there any way to standardize the gene symbols so that I can count how frequently each gene occurs?

Any advice would be appreciated!

This depends on the organism (species). What is it?

Hi Simon,

Some lists are from mice and some are from humans. Would this cause problems?

@gordon-smyth
Last seen 11 hours ago
WEHI, Melbourne, Australia

The limma functions alias2Symbol() or alias2SymbolTable() do this conversion for you. For example, if you have a column called "Symbol" in your limma results, and you want to convert all the symbols to current official symbols, just use:

fit$genes$Official.Symbol <- alias2SymbolTable(fit$genes$Symbol, species="Hs")

For mouse symbols, use species="Mm".
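
Once both lists are in official symbols, counting how often each gene occurs is just a table() call. The sketch below uses the tau aliases from the question. To keep it runnable without annotation packages, a small hand-written lookup (alias_map, hypothetical) stands in for the output of alias2SymbolTable(), and the two input vectors are made-up examples rather than real query results.

```r
# Hypothetical alias-to-official lookup; in practice this mapping would come
# from alias2SymbolTable() rather than being typed in by hand.
alias_map <- c(TAU = "MAPT", MTBT1 = "MAPT", MAPTL = "MAPT", MAPT = "MAPT")

# Made-up input lists standing in for the ChEMBL and GEO/limma results.
chembl_symbols <- c("TAU", "MTBT1", "MAPT")
geo_symbols    <- c("MAPTL", "MAPT")

# Look each symbol up in the map; symbols not in the map become NA.
standardize <- function(symbols, map) unname(map[symbols])

chembl_official <- standardize(chembl_symbols, alias_map)
geo_official    <- standardize(geo_symbols, alias_map)

# Frequency of each official symbol across both lists (NA counted if present).
symbol_counts <- table(c(chembl_official, geo_official), useNA = "ifany")

# Official symbols that appear in both lists.
shared <- intersect(chembl_official, geo_official)

print(symbol_counts)
print(shared)
```

Symbols that cannot be mapped come back as NA, so it is worth checking how many NAs there are before trusting the counts.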

Thanks a lot Gordon. 

I do have a few questions though. In your code, I did not completely understand what fit$genes$Official.Symbol refers to. Is that pointing to a data frame within another data frame?

I used this:

df$Official_Symbol <- alias2SymbolTable(df$gene, species="Mm")

where df is a data frame containing my limma results and the column "gene" contains my gene symbols.

And I get the following warning:

Warning message:
In alias2SymbolTable(df$gene, species = "Mm") :
  Multiple symbols ignored for one or more aliases

I wanted to know if I could do something about the warning?

Thank you!
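
On the two questions above, a minimal base-R sketch may help. A limma fit object behaves like a list, and its genes component is a data frame, so fit$genes$Official.Symbol is a column of a data frame stored inside a list. The warning is not an error: the result is still returned, and as far as the limma documentation describes it, when an alias matches more than one official symbol only the first match is kept. Everything in the block is a hypothetical stand-in (the plain list fit, the alias_table data frame and its AliasA/Gene1 style names); it only illustrates the structure and one way to spot ambiguous aliases.

```r
# A plain list standing in for a limma fit object (hypothetical values).
fit <- list(
  coefficients = matrix(c(1.2, -0.4), ncol = 1),
  genes = data.frame(Symbol = c("Tau", "Actb"), stringsAsFactors = FALSE)
)

# fit$genes is a data frame held as one component of the list, so this
# assignment adds a new column to that data frame. The right-hand side is a
# placeholder for what alias2SymbolTable() would return.
fit$genes$Official.Symbol <- c("Mapt", "Actb")

# Hypothetical alias table in which one alias maps to two official symbols.
alias_table <- data.frame(
  alias  = c("AliasA", "AliasA", "AliasB"),
  symbol = c("Gene1", "Gene2", "Gene3"),
  stringsAsFactors = FALSE
)

# Group the official symbols by alias.
hits <- split(alias_table$symbol, alias_table$alias)

# Aliases with more than one candidate symbol are the ones behind the warning.
ambiguous <- names(hits)[lengths(hits) > 1]

# Keeping only the first candidate per alias mirrors the "first match" behaviour.
first_hit <- vapply(hits, function(s) s[1], character(1))

print(fit$genes)
print(ambiguous)
print(first_hit)
```

If the ambiguous aliases matter for the analysis, they can be looked up individually and resolved by hand; otherwise the warning can be left alone.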

@markusriester-9875
Last seen 2.5 years ago
United States

For human symbols, the HGNChelper package might work for you:

library(HGNChelper)
checkGeneSymbols(c("TAU", "MTBT1", "MAPTL"))

      x Approved Suggested.Symbol
1   TAU    FALSE             MAPT
2 MTBT1    FALSE             MAPT
3 MAPTL    FALSE             MAPT
Warning message:
In checkGeneSymbols(c("TAU", "MTBT1", "MAPTL")) :
  x contains non-approved gene symbols


Thank you! I shall try this out as well.
