Convert hamster gene symbols to mouse gene symbols
rykerklie7 • 0

I would like to convert hamster gene symbols to mouse gene symbols and tried the following:

hamster_genesymbol = annot_chinese_hamster$external_gene_name # gene list to lift over

genesV2_hamstertomouse = getLDS(attributes = c("CHO_symbol"),
                                filters = "CHO_symbol",
                                values = hamster_genesymbol,
                                mart = hamster,
                                attributesL = c("mgi_symbol"),
                                martL = mouse,
                                uniqueRows = TRUE)

Error in getLDS(attributes = c("CHO_symbol"), filters = "CHO_symbol",  : 
Invalid attribute(s): CHO_symbol 
Please use the function 'listAttributes' to get valid attribute names

I know the common symbols, such as mgi for mouse and rgd for rat, but I am not sure what the equivalent is for the Chinese hamster, and I was unable to find it using listAttributes.

biomart R ensembl • 2.5k views
Kevin Blighe ★ 3.9k

Hey,

You can search the biomaRt (Ensembl) datasets like this:

library(biomaRt)
listDatasets(useMart('ensembl'))

Set-up

There seem to be three Cricetulus griseus (C. griseus) datasets; I'll just use the first one:

datasets <- listDatasets(useMart('ensembl'))
datasets[grep('Chinese', datasets[,2]),]
                     dataset                                  description
27  cgchok1gshd_gene_ensembl Chinese hamster CHOK1GS genes (CHOK1GS_HDv1)
28     cgcrigri_gene_ensembl    Chinese hamster CriGri genes (CriGri_1.0)
30       cgpicr_gene_ensembl     Chinese hamster PICR genes (CriGri-PICR)
151   psinensis_gene_ensembl  Chinese softshell turtle genes (PelSin_1.0)
         version
27  CHOK1GS_HDv1
28    CriGri_1.0
30   CriGri-PICR
151   PelSin_1.0

hamster <- useMart('ensembl', dataset = 'cgchok1gshd_gene_ensembl')
mouse <- useMart('ensembl', dataset = 'mmusculus_gene_ensembl')
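
Because the pattern 'Chinese' also matches the softshell turtle, a narrower filter may be handy. The following is only a sketch: it works on a small data frame copied from the listing above so that it runs without a network connection, and `pick_datasets` is a hypothetical helper name, not part of biomaRt.

```r
# Mock of the relevant rows returned by listDatasets(), copied from the output above.
datasets <- data.frame(
  dataset = c('cgchok1gshd_gene_ensembl', 'cgcrigri_gene_ensembl',
              'cgpicr_gene_ensembl', 'psinensis_gene_ensembl'),
  description = c('Chinese hamster CHOK1GS genes (CHOK1GS_HDv1)',
                  'Chinese hamster CriGri genes (CriGri_1.0)',
                  'Chinese hamster PICR genes (CriGri-PICR)',
                  'Chinese softshell turtle genes (PelSin_1.0)'),
  version = c('CHOK1GS_HDv1', 'CriGri_1.0', 'CriGri-PICR', 'PelSin_1.0'),
  stringsAsFactors = FALSE)

# Hypothetical helper: keep rows whose description matches a pattern,
# ignoring case, and always return a data frame.
pick_datasets <- function(datasets, pattern) {
  keep <- grepl(pattern, datasets$description, ignore.case = TRUE)
  datasets[keep, , drop = FALSE]
}

hamster_sets <- pick_datasets(datasets, 'Chinese hamster')
print(hamster_sets$dataset)
```

With a live connection, the same helper could presumably be given the full table, e.g. pick_datasets(listDatasets(useMart('ensembl')), 'Chinese hamster').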

Create a Chinese Hamster (C. griseus) lookup table

For C. griseus, the standard headers (attributes) seem to be used:

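# Named hamster_table rather than 'table' to avoid masking base::table()
hamster_table <- getBM(
  attributes = c('ensembl_gene_id','external_gene_name'),
  mart = hamster)
# Show the first 30 genes that have a non-empty gene name
head(hamster_table[hamster_table$external_gene_name != '',], 30)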

      ensembl_gene_id external_gene_name
6  ENSCGRG00001000006                ND1
9  ENSCGRG00001000009              mt-Tm
10 ENSCGRG00001000010                ND2
16 ENSCGRG00001000016               COX1
19 ENSCGRG00001000019               COX2
21 ENSCGRG00001000021               ATP8
22 ENSCGRG00001000022               ATP6
23 ENSCGRG00001000023               COX3
25 ENSCGRG00001000025                ND3
27 ENSCGRG00001000027               ND4L
28 ENSCGRG00001000028                ND4
32 ENSCGRG00001000032                ND5
33 ENSCGRG00001000033                ND6
35 ENSCGRG00001000035               CYTB
41 ENSCGRG00001000041               Elp6
44 ENSCGRG00001000044             Zfp449
46 ENSCGRG00001000046               Utp6
47 ENSCGRG00001000047              Ccng2
48 ENSCGRG00001000048             Tespa1
49 ENSCGRG00001000049              Tcea2
50 ENSCGRG00001000050              Rad21
51 ENSCGRG00001000051              Ednrb
52 ENSCGRG00001000052             Tmem98
53 ENSCGRG00001000053              Prok1
54 ENSCGRG00001000054            Emilin3
55 ENSCGRG00001000055               Dna2
57 ENSCGRG00001000057                Lpp
59 ENSCGRG00001000059              Brcc3
61 ENSCGRG00001000061             Rmnd5a
62 ENSCGRG00001000062                Gfy
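
To check which attribute carries the gene symbol for a given dataset, one option is to search the attribute table by pattern. The sketch below assumes a data frame with `name` and `description` columns, as returned by listAttributes(). The four rows are an illustrative stand-in built from attributes used in this thread, not a real listing, and `find_attributes` is a hypothetical helper name.

```r
# Illustrative stand-in for listAttributes(hamster); the real table is much
# longer. These rows are an assumption so the example runs offline.
attrs <- data.frame(
  name = c('ensembl_gene_id', 'external_gene_name',
           'chromosome_name', 'gene_biotype'),
  description = c('Gene stable ID', 'Gene name',
                  'Chromosome/scaffold name', 'Gene type'),
  stringsAsFactors = FALSE)

# Hypothetical helper: return rows whose name or description matches the
# pattern (case-insensitive regular expression).
find_attributes <- function(attrs, pattern) {
  hit <- grepl(pattern, attrs$name, ignore.case = TRUE) |
    grepl(pattern, attrs$description, ignore.case = TRUE)
  attrs[hit, , drop = FALSE]
}

name_like <- find_attributes(attrs, 'symbol|gene name')
print(name_like)
```

With a live connection this could be called as find_attributes(listAttributes(hamster), 'symbol|gene name'). Recent versions of biomaRt also ship a searchAttributes() function that serves a similar purpose.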

Now map between Chinese Hamster (C. griseus) and Mouse (M. musculus)

So, now we can map to mouse:

getLDS(
  mart = hamster,
  attributes = c('ensembl_gene_id','external_gene_name','chromosome_name'),
  martL = mouse,
  attributesL = c('mgi_symbol','ensembl_gene_id','chromosome_name','gene_biotype'),
  filters = 'external_gene_name',
  values = c('COX1', 'COX2','Rad21','Dna2','Brcc3'))

      Gene.stable.ID Gene.name Chromosome.scaffold.name MGI.symbol
1 ENSCGRG00001000016      COX1                       MT     mt-Co1
2 ENSCGRG00001000019      COX2                       MT     mt-Co2
3 ENSCGRG00001000055      Dna2              scaffold_33       Dna2
4 ENSCGRG00001000050     Rad21              scaffold_34      Rad21
5 ENSCGRG00001000059     Brcc3              scaffold_11      Brcc3
    Gene.stable.ID.1 Chromosome.scaffold.name.1      Gene.type
1 ENSMUSG00000064351                         MT protein_coding
2 ENSMUSG00000064354                         MT protein_coding
3 ENSMUSG00000036875                         10 protein_coding
4 ENSMUSG00000022314                         15 protein_coding
5 ENSMUSG00000031201                          X protein_coding
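
For a full gene list, some hamster symbols may have no mouse match and others may have several, so it can help to summarise the result before using it. The sketch below assumes a getLDS()-style data frame with the `Gene.name` and `MGI.symbol` columns shown above. The rows are copied from that output, 'Xyz1' is a made-up symbol added only to show an unmapped gene, and `summarise_mapping` is a hypothetical helper name.

```r
# Result rows copied from the getLDS() output above (hamster -> mouse symbols).
lds <- data.frame(
  Gene.name  = c('COX1', 'COX2', 'Dna2', 'Rad21', 'Brcc3'),
  MGI.symbol = c('mt-Co1', 'mt-Co2', 'Dna2', 'Rad21', 'Brcc3'),
  stringsAsFactors = FALSE)

# Query list; 'Xyz1' is a made-up symbol used only to show an unmapped gene.
query <- c('COX1', 'COX2', 'Rad21', 'Dna2', 'Brcc3', 'Xyz1')

# Hypothetical helper: one row per queried symbol, with the first mouse
# symbol found (NA if none) and the number of distinct mouse matches.
summarise_mapping <- function(query, lds) {
  per_gene <- split(lds$MGI.symbol, lds$Gene.name)
  n_hits <- vapply(query, function(g) length(unique(per_gene[[g]])),
                   integer(1))
  first_hit <- vapply(query, function(g) {
    hits <- unique(per_gene[[g]])
    if (length(hits) == 0) NA_character_ else hits[1]
  }, character(1))
  data.frame(hamster_symbol = query,
             mouse_symbol = unname(first_hit),
             n_mouse_matches = unname(n_hits),
             stringsAsFactors = FALSE)
}

mapping <- summarise_mapping(query, lds)
print(mapping)
```

Rows with n_mouse_matches greater than 1 are the ones worth inspecting by hand, since picking the first match is an arbitrary choice in this sketch.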

Note the other solution via Orthology.eg.db, mentioned by James in the thread "biomart getLDS giving errors".

Kevin

Is using gene names preferable to looking up orthologues?

Good question
