Question: Probe to gene id conversion
rafi A (United States) wrote, 4.1 years ago:

Hi all,

I use the “annotate” package to convert Affymetrix 3’ probe IDs to gene IDs. For some probes it does not find a corresponding gene ID. For example, looking up “1368587_at” and “1385248_a_at” returns ‘NA’ for both.

library("annotate")
library("rat2302.db")
getEG(c("1368587_at","1385248_a_at"),"rat2302")
getSYMBOL(c("1368587_at","1385248_a_at"),"rat2302")

But when I google "1385248_a_at", the Rat Genome Database lists gene symbol “Ogn” (gene ID: 291015).

With biomaRt, "1385248_a_at" maps to two gene IDs: “291015” and “100910855”. Gene ID 291015 (Ogn) seems more relevant; the other ID is LOC100910855.

I assume this is a common problem.

Is there an automated way to check for missing probe-gene mappings after using “annotate”?

Or, what is the best way to check whether a probe ID really lacks annotation?

I tried extracting the “NA” rows and passing them to biomaRt. biomaRt returns a list with many duplicates (many with a second mapping to LOC…) that still has to be checked manually. Is there an alternative?

Any suggestions would be helpful. Thanks for your time,

Rafi

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.20.0       rat2302.db_2.14.0    org.Rn.eg.db_2.14.0  RSQLite_0.11.4      
 [5] DBI_0.3.1            annotate_1.42.1      eisa_1.16.0          AnnotationDbi_1.26.1
 [9] GenomeInfoDb_1.0.2   Biobase_2.24.0       BiocGenerics_0.10.0  isa2_0.3.3          
[13] RankProd_2.36.0      BiocInstaller_1.14.3

loaded via a namespace (and not attached):
 [1] Category_2.30.0   genefilter_1.46.1 graph_1.42.0      grid_3.1.1        GSEABase_1.26.0  
 [6] IRanges_1.22.10   lattice_0.20-29   Matrix_1.1-4      RBGL_1.40.1       RCurl_1.95-4.3   
[11] splines_3.1.1     stats4_3.1.1      survival_2.37-7   tools_3.1.1       XML_3.98-1.1     
[16] xtable_1.7-4
James W. MacDonald (United States) wrote, 4.1 years ago:

You should use select(), rather than the (very old) methods in the annotate package.

> select(rat2302.db, c("1368587_at","1385248_a_at"), c("SYMBOL","ENTREZID", "GENENAME"))
       PROBEID       SYMBOL  ENTREZID                GENENAME
1   1368587_at        Apoc1     25292      apolipoprotein C-I
2   1368587_at LOC100911905 100911905 apolipoprotein C-I-like
3 1385248_a_at          Ogn    291015             osteoglycin
Warning message:
In .generateExtraRows(tab, keys, jointype) :
  'select' resulted in 1:many mapping between keys and return rows

The old way of mapping probesets to genes would not return anything if there was a one-to-many mapping. Now you get the results, and it is up to you to decide how to resolve the ambiguity.
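
As a rough illustration of one way to triage the result, the sketch below separates probes that have no annotation at all from probes that map to several genes, and then applies a simple tie-break. It is only a sketch: the table is rebuilt by hand from the output above so that it runs without Bioconductor, the "hypothetical_at" row is an invented placeholder for an unannotated probe, and preferring named genes over "LOC..." entries is an assumption about what is biologically sensible, not a rule from any package.

```r
# Mapping table shaped like the select() output above, rebuilt by hand.
# In practice this would be: res <- select(rat2302.db, probes, c("SYMBOL", "ENTREZID"))
# "hypothetical_at" is an invented probe ID standing in for a probe with no
# mapping (assumption: select() keeps such keys as a row of NAs).
res <- data.frame(
  PROBEID  = c("1368587_at", "1368587_at", "1385248_a_at", "hypothetical_at"),
  SYMBOL   = c("Apoc1", "LOC100911905", "Ogn", NA),
  ENTREZID = c("25292", "100911905", "291015", NA),
  stringsAsFactors = FALSE
)

# Probes with no Entrez ID at all.
unannotated <- unique(res$PROBEID[is.na(res$ENTREZID)])

# Probes that map to more than one gene.
annotated <- res[!is.na(res$ENTREZID), ]
n_genes   <- table(annotated$PROBEID)
ambiguous <- names(n_genes)[n_genes > 1]

# Tie-break (an assumption, not a rule): drop placeholder "LOC<number>" symbols
# whenever the same probe also maps to a named gene.
is_loc       <- grepl("^LOC[0-9]+$", annotated$SYMBOL)
named_probes <- unique(annotated$PROBEID[!is_loc])
resolved     <- annotated[!(is_loc & annotated$PROBEID %in% named_probes), ]
```

Here `unannotated` lists the probes that really lack a mapping, `ambiguous` lists the ones that need a decision, and `resolved` keeps one named gene per probe where the tie-break applies. A probe that maps to two named genes would still appear twice in `resolved` and needs a manual call.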


Thanks, very helpful -R
