Probe to gene id conversion
rafi A ▴ 20
@rafi-a-6336
Last seen 10.1 years ago
United States

Hi all,

I use the “annotate” package to convert Affymetrix 3’ probe IDs to gene IDs. For some probes it does not find a corresponding gene ID. For example, looking up “1368587_at” and “1385248_a_at” returns ‘NA’.

library("annotate")
library("rat2302.db")
getEG(c("1368587_at","1385248_a_at"),"rat2302")
getSYMBOL(c("1368587_at","1385248_a_at"),"rat2302")

But when I google "1385248_a_at", I find the gene symbol “Ogn” (gene ID: 291015) in the Rat Genome Database.

When I used biomaRt, "1385248_a_at" mapped to two gene IDs: “291015” and “100910855”. Gene ID 291015 (Ogn) seems more relevant; the other ID is LOC100910855.

I assume it is a common problem.

Is there an automated way to check for missing probe-to-gene mappings after using “annotate”?

(or) What is the best way to check whether a probe ID is really missing annotation?

I tried extracting the “NA” rows and passing them to biomaRt. biomaRt returns a list with many duplicates (often a second mapping to a LOC… ID) that still has to be checked manually. Is there an alternative?
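
For reference, the extraction step described here can be sketched in base R. This assumes the lookup result is a named character vector keyed by probe ID; the `eg` vector below is typed in by hand as a stand-in for the `getEG()` result above, and `na_probes` is only an illustrative name.

```r
# Stand-in for the getEG() result above: both probes came back as NA.
eg <- c("1368587_at" = NA_character_, "1385248_a_at" = NA_character_)

# Probe IDs with no Entrez Gene ID, ready to be passed to another resource.
na_probes <- names(eg)[is.na(eg)]
print(na_probes)
```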

Any suggestions would be helpful. Thanks for your time,

Rafi

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.20.0       rat2302.db_2.14.0    org.Rn.eg.db_2.14.0  RSQLite_0.11.4      
 [5] DBI_0.3.1            annotate_1.42.1      eisa_1.16.0          AnnotationDbi_1.26.1
 [9] GenomeInfoDb_1.0.2   Biobase_2.24.0       BiocGenerics_0.10.0  isa2_0.3.3          
[13] RankProd_2.36.0      BiocInstaller_1.14.3

loaded via a namespace (and not attached):
 [1] Category_2.30.0   genefilter_1.46.1 graph_1.42.0      grid_3.1.1        GSEABase_1.26.0  
 [6] IRanges_1.22.10   lattice_0.20-29   Matrix_1.1-4      RBGL_1.40.1       RCurl_1.95-4.3   
[11] splines_3.1.1     stats4_3.1.1      survival_2.37-7   tools_3.1.1       XML_3.98-1.1     
[16] xtable_1.7-4
@james-w-macdonald-5106
Last seen 10 hours ago
United States

You should use select(), rather than the (very old) methods in the annotate package.

> select(rat2302.db, c("1368587_at","1385248_a_at"), c("SYMBOL","ENTREZID", "GENENAME"))
       PROBEID       SYMBOL  ENTREZID                GENENAME
1   1368587_at        Apoc1     25292      apolipoprotein C-I
2   1368587_at LOC100911905 100911905 apolipoprotein C-I-like
3 1385248_a_at          Ogn    291015             osteoglycin
Warning message:
In .generateExtraRows(tab, keys, jointype) :
  'select' resulted in 1:many mapping between keys and return rows

The old way of mapping probesets to genes would not return anything if there was a one-to-many mapping. Now you get the results, and it is up to you to decide how to resolve the ambiguity.
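
How to resolve the ambiguity depends on the analysis. One possible rule, sketched below in base R, is to flag probes with more than one row and then keep a single row per probe, preferring a named gene over a placeholder LOC symbol. The data frame is re-entered by hand from the select() output above so the block runs without any annotation package. `resolve_probe` is a hypothetical helper, and preferring non-LOC symbols is an assumption for illustration, not a Bioconductor convention.

```r
# select() output from above, re-entered by hand.
res <- data.frame(
  PROBEID  = c("1368587_at", "1368587_at", "1385248_a_at"),
  SYMBOL   = c("Apoc1", "LOC100911905", "Ogn"),
  ENTREZID = c("25292", "100911905", "291015"),
  stringsAsFactors = FALSE
)

# Rows returned per probe; more than one row means a 1:many mapping.
hits <- table(res$PROBEID)
ambiguous <- names(hits)[hits > 1]

# Hypothetical helper: for one probe's rows, drop LOC placeholders when a
# named gene is also present, then keep the first remaining row.
resolve_probe <- function(rows) {
  is_loc <- grepl("^LOC[0-9]+$", rows$SYMBOL)
  if (any(!is_loc)) rows <- rows[!is_loc, , drop = FALSE]
  rows[1, , drop = FALSE]
}

resolved <- do.call(rbind, lapply(split(res, res$PROBEID), resolve_probe))
rownames(resolved) <- NULL

print(ambiguous)
print(resolved)
```

If simply taking the first match per probe is acceptable, recent versions of AnnotationDbi also provide mapIds(), whose multiVals argument controls how 1:many keys are handled.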

Thanks, very helpful -R
