Question: Selecting few ids from reactome.db
Lluís Revilla Sancho (European Union) wrote 2.6 years ago:

I am experiencing some problems with select, while testing for a new package:

library("reactome.db")

genes.id <- as.character(c(52, 11342, 80895, 57654, 58493, 1164, 1163, 4150,  2130, 159))

select(reactome.db, keys = genes.id, keytype = "ENTREZID", columns = "REACTOMEID")

'select()' returned 1:many mapping between keys and columns
   ENTREZID REACTOMEID
1        52       <NA>
2     11342       <NA>

# Some data as expected

select(reactome.db, keys = genes.id[1:5], keytype = "ENTREZID", columns = "REACTOMEID")
Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.
> select(reactome.db, keys = genes.id[1:7], keytype = "ENTREZID", columns = "REACTOMEID")

'select()' returned 1:many mapping between keys and columns
   ENTREZID REACTOMEID
1        52       <NA>
2     11342       <NA>

# Some data as expected

> select(reactome.db, keys = genes.id[1:6], keytype = "ENTREZID", columns = "REACTOMEID")
Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.
> select(reactome.db, keys = genes.id[2:7], keytype = "ENTREZID", columns = "REACTOMEID")
'select()' returned 1:many mapping between keys and columns
   ENTREZID REACTOMEID
1     11342       <NA>
2     80895       <NA>
3     57654       <NA> # Some data

> select(reactome.db, keys = genes.id[3:6], keytype = "ENTREZID", columns = "REACTOMEID")
Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.
> traceback()
7: stop(msg)
6: .testForValidKeys(x, keys, keytype, fks)
5: testSelectArgs(x, keys = keys, cols = cols, keytype = keytype)
4: .selectReact(x, keys, columns, keytype)
3: .selectWarnReact(x, keys, columns, keytype, kt = kt, ...)
2: select(reactome.db, keys = genes.id[3:6], keytype = "ENTREZID",
       columns = "REACTOMEID")
1: select(reactome.db, keys = genes.id[3:6], keytype = "ENTREZID",
       columns = "REACTOMEID")
> sessionInfo()

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] reactome.db_1.58.0   AnnotationDbi_1.36.0 IRanges_2.8.1       
[4] S4Vectors_0.12.0     Biobase_2.34.0       BiocGenerics_0.20.0

loaded via a namespace (and not attached):
[1] DBI_0.5-1     memoise_1.0.0 Rcpp_0.12.8   RSQLite_1.1-1 digest_0.6.10

Is this the expected behavior?

annotationdbi bug reactome.db • 607 views
Answer: Selecting few ids from reactome.db
willem.ligtenberg (Netherlands) wrote 2.6 years ago:

I would say this is indeed expected behaviour.

A couple of the gene ids that you list apparently have no pathway associated with them (at least, none turns up in the queries I use to generate this package).

If that in itself is an error, could you please show me where you found a connection between that specific Entrez gene id and a specific pathway in Reactome? Then I can review my queries and adapt them if required.

I myself use mget normally in this case, and there you can specify what it should do if no mapping was found:

mget(x = genes.id, reactomeEXTID2PATHID, ifnotfound = NA)
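
If a table like the one `select` prints is preferred over the named list that `mget` returns, the list can be flattened. The following is only a sketch: `mapping_to_table` and its default column names are hypothetical (not part of reactome.db or AnnotationDbi), and the example input is a hand-written stand-in for an `mget` result with placeholder pathway names.

```r
# Hypothetical helper: flatten a named list (such as the result of
# mget(..., ifnotfound = NA)) into a two-column data.frame, one row per value.
mapping_to_table <- function(mapping, key_col = "ENTREZID", value_col = "PATHID") {
  n <- lengths(mapping)  # number of values stored under each key
  out <- data.frame(
    rep(names(mapping), n),
    as.character(unlist(mapping, use.names = FALSE)),
    stringsAsFactors = FALSE
  )
  names(out) <- c(key_col, value_col)
  out
}

# Stand-in for an mget() result; the pathway names are placeholders,
# not real Reactome identifiers.
example <- list("1163" = c("pathway_A", "pathway_B"), "52" = NA)
mapping_to_table(example)
```

Keys without a mapping keep a single row with NA in the value column, so no requested id disappears from the table.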

Indeed, some genes don't have a pathway associated with them. But for some subsets `select` fails with an error instead of returning NA, whereas with the whole set it returns NAs. Shouldn't it always return NA, whether I `select` a few of the ids or all of them?

Reply written 2.6 years ago by Lluís Revilla Sancho

There are two competing ideas here. You are saying that you put in real official Entrez Gene IDs, and so shouldn't reactome.db then return a data.frame with NA values?

An alternative viewpoint is to say that unless some of the IDs you have passed in appear to be valid Entrez Gene IDs, a more useful thing to do would be to tell you that none of the IDs appear to be valid Entrez Gene IDs, which is what happens. If we were to do what you suggest, then you could do something like

egids <- c("not", "really", "entrez","gene","IDs","at","all")
select(reactome.db, egids, "REACTOMEID","ENTREZID")
  ENTREZID   REACTOMEID
1   not           <NA>
2   really        <NA>
3   entrez        <NA>
4   gene          <NA>
5   IDs           <NA>
6   at            <NA>
7   all           <NA>

I suppose there are valid arguments for either behavior, but to me they boil down to two general ideas:

  1. We are all grownups here, and so the package should just process input data without too much error checking. If people want to query for random things, they should get NA values back.
  2. Not everybody is fully cognizant of how these annotation db packages work, and it might be helpful to have some error checking. If somebody inputs a set of keys for which there are no matches, that may well be an error on their part, and it would be helpful to let them know.

I tend towards #2, personally, because people do make mistakes and it is helpful to let them know when it appears they have done so.

Reply written 2.6 years ago by James W. MacDonald

I understand what you mean here, but just to clarify, neither mget, nor the select function is defined in an annotation package. This behaviour comes from AnnotationDbi::select and from BiocGenerics::mget.

Reply written 2.6 years ago by willem.ligtenberg
Answer: Selecting few ids from reactome.db
Martin Morgan (United States) wrote 2.6 years ago:

Although all the ids may be valid ENTREZ ids, most of them do not appear in the version of reactome.db available in the package:

> genes.id %in% keys(reactome.db, "ENTREZID")
 [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE

Filter the ids first:

genes.id <- genes.id[genes.id %in% keys(reactome.db, "ENTREZID")]

and then query

select(reactome.db, keys = genes.id, keytype = "ENTREZID", columns = "REACTOMEID")
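
Filtering drops the ids that are absent from the database. If one row per requested id is wanted regardless of which subset is passed in, the filter and the query can be wrapped so that absent ids come back as NA rows. This is a minimal sketch, not part of AnnotationDbi: `select_or_na` is a hypothetical helper, and its `keys_fun` and `select_fun` arguments exist only so the logic can be tried with stand-in functions; by default they point at `AnnotationDbi::keys` and `AnnotationDbi::select`.

```r
# Hypothetical wrapper: query only the ids present in the database and
# append an NA row for every id that is absent, instead of stopping.
select_or_na <- function(db, ids, keytype, column,
                         keys_fun = AnnotationDbi::keys,
                         select_fun = AnnotationDbi::select) {
  ids <- unique(as.character(ids))
  known <- ids %in% keys_fun(db, keytype = keytype)

  # One NA row per id that the database does not contain
  out <- data.frame(ids[!known], rep(NA_character_, sum(!known)),
                    stringsAsFactors = FALSE)
  names(out) <- c(keytype, column)

  # Only call select() when at least one id is present, which avoids the
  # "None of the keys entered are valid keys" error
  if (any(known)) {
    hits <- select_fun(db, keys = ids[known], keytype = keytype,
                       columns = column)
    out <- rbind(hits[, c(keytype, column), drop = FALSE], out)
  }
  rownames(out) <- NULL
  out
}
```

With reactome.db loaded, a call such as `select_or_na(reactome.db, genes.id[3:6], "ENTREZID", "REACTOMEID")` would be expected to return NA rows rather than an error, given the `%in%` result shown above. Whether silently returning NA is desirable is the trade-off discussed in the replies to the first answer.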