Question: converting UniProt to SGD ids using UniProt.ws
Joseph Barry (Dana-Farber Cancer Institute, Boston, USA) wrote 4.9 years ago:

I would like to use UniProt.ws to convert UniProt ids to SGD ids for Saccharomyces cerevisiae but my attempts so far have resulted in the error:

Error in .select(x, keys, columns, keytype) :
  No data is available for the keys provided.

Here is a minimal example, where I attempt to convert "I2HB52". The expected answer is "YBR056W-A" (see http://www.uniprot.org/uniprot/I2HB52 ).

library(UniProt.ws)
taxId(UniProt.ws) <- 4932
species(UniProt.ws)
res <- select(x=UniProt.ws, keys="I2HB52", columns="SGD", keytype="UNIPROTKB")

Is this particular conversion currently possible using UniProt.ws? Other choices for 'columns' result in the same error.

Thanks in advance,

Joseph Barry

Session Info:

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] en_IE.UTF-8/en_IE.UTF-8/en_IE.UTF-8/C/en_IE.UTF-8/en_IE.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] UniProt.ws_2.6.0 RCurl_1.95-4.5   bitops_1.0-6     RSQLite_1.0.0   
[5] DBI_0.3.1       

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.28.1 Biobase_2.26.0       BiocGenerics_0.12.1
[4] GenomeInfoDb_1.2.4   IRanges_2.0.1        parallel_3.1.2      
[7] S4Vectors_0.4.0      stats4_3.1.2         tools_3.1.2       

 

 

Tags: uniprot.ws, ensembl, uniprot
Answer: converting UniProt to SGD ids using UniProt.ws
James W. MacDonald (United States) wrote 4.9 years ago:

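The UniProt entry you linked is for a particular strain of yeast (ATCC 204508 / S288c, taxon ID 559292), but you are setting the taxon ID for the generic species entry (4932). Setting the strain-specific ID works, and switching back to 4932 reproduces the error: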

> library(UniProt.ws)
Loading required package: RSQLite
Loading required package: DBI
Loading required package: RCurl
Loading required package: bitops
> availableUniprotSpecies(pattern="cerevisiae")
   taxon ID                                                      Species name
1     11008                                Saccharomyces cerevisiae virus L-A
2     42478                               Saccharomyces cerevisiae virus L-BC
3     12450                          Saccharomyces cerevisiae killer virus M1
4    285006                         Saccharomyces cerevisiae (strain RM11-1a)
5    574961                          Saccharomyces cerevisiae (strain JAY291)
6    545124                        Saccharomyces cerevisiae (strain AWRI1631)
7    307796                          Saccharomyces cerevisiae (strain YJM789)
8    643680 Saccharomyces cerevisiae (strain Lalvin EC1118 / Prise de mousse)
9    764097                         Saccharomyces cerevisiae (strain AWRI796)
10   764102                        Saccharomyces cerevisiae (strain FostersB)
11   721032      Saccharomyces cerevisiae (strain Kyokai no. 7 / NBRC 101557)
12   764098                     Saccharomyces cerevisiae (strain Lalvin QA23)
13   764101                        Saccharomyces cerevisiae (strain FostersO)
14   559292             Saccharomyces cerevisiae (strain ATCC 204508 / S288c)
15   764099                          Saccharomyces cerevisiae (strain VIN 13)
16     4932                                          Saccharomyces cerevisiae
17   764100                   Saccharomyces cerevisiae (strain Zymaflore VL3)
> taxId(UniProt.ws) <- 559292
> res <- select(x=UniProt.ws, keys="I2HB52", columns="SGD", keytype="UNIPROTKB")
Getting mapping data for I2HB52 ... and SGD_ID
> res
  UNIPROTKB        SGD
1    I2HB52 S000028736
> taxId(UniProt.ws) <- 4932
> res <- select(x=UniProt.ws, keys="I2HB52", columns="SGD", keytype="UNIPROTKB")
Error in .select(x, keys, columns, keytype) :
  No data is available for the keys provided.

 

Answer: converting UniProt to SGD ids using UniProt.ws
Joseph Barry (Dana-Farber Cancer Institute, Boston, USA) wrote 4.9 years ago:

Hi James,

Great, thanks a lot. Works like a charm now.

Best, Joseph

Answer: converting UniProt to SGD ids using UniProt.ws
Joseph Barry (Dana-Farber Cancer Institute, Boston, USA) wrote 4.9 years ago:

On a related note, I found some strange NA behaviour. If "ENSEMBL", which returns NA for this key, is included in the "columns" vector, all other columns are also returned as NA. I guess this is not desirable behaviour for most users.

> taxId(UniProt.ws) <- 559292
> species(UniProt.ws)
[1] "Saccharomyces cerevisiae (strain ATCC 204508 / S288c)"
> res <- select(x=UniProt.ws, keys="I2HB52", columns=c("SGD", "SEQUENCE"), keytype="UNIPROTKB")
Getting mapping data for I2HB52 ... and SGD_ID
Getting extra data for I2HB52 NA NA etc
> print(res)
  UNIPROTKB        SGD
1    I2HB52 S000028736
                                                            SEQUENCE
1 MRHQYYQPQPMYYQPQPQPIYIQQGPPPPRNDCCCCCNCGDCCSAIANVLCCLCLIDLCCSCAGGM
> res <- select(x=UniProt.ws, keys="I2HB52", columns=c("SGD", "SEQUENCE", "ENSEMBL"), keytype="UNIPROTKB")
Getting mapping data for I2HB52 ... and ENSEMBL_ID
Getting mapping data for I2HB52 ... and SGD_ID
Getting extra data for I2HB52 NA NA etc
> print(res)
  UNIPROTKB  SGD SEQUENCE ENSEMBL
1    I2HB52 <NA>     <NA>    <NA>


 


I made a small change to the devel version of UniProt.ws, and it now works:

> res <- select(x=UniProt.ws, keys="I2HB52", columns=c("SGD", "SEQUENCE", "ENSEMBL"), keytype="UNIPROTKB")
Getting mapping data for I2HB52 ... and ENSEMBL_ID
Getting mapping data for I2HB52 ... and SGD_ID
Getting extra data for I2HB52 NA NA etc
> res
  UNIPROTKB        SGD
1    I2HB52 S000028736
                                                            SEQUENCE ENSEMBL
1 MRHQYYQPQPMYYQPQPQPIYIQQGPPPPRNDCCCCCNCGDCCSAIANVLCCLCLIDLCCSCAGGM    <NA>

I'll check with Marc Carlson about adding this change to the package.

Reply written 4.9 years ago by James W. MacDonald
Answer: converting UniProt to SGD ids using UniProt.ws
Marc Carlson (United States) wrote 4.9 years ago:

That is indeed a safe-looking change.

I have checked in the proposed change. Thanks for the bug fix!

You people are the best,

Marc


Joseph-

Note that Marc has checked the change into the devel repository, and this should propagate to the download server in the next day or so. If you want to use the updated version you will need to use a devel version of R and BioC.

A hypothetical alternative, if you don't want to use the devel version (hypothetical to me, that is, as I don't use MacOS for any real work), is to download the source tarball, unpack it, and open the file UniProt.ws/R/methods-select.R in your favorite editor.

Scroll down to the function .getUPMappdata():

.getUPMappdata <- function(colMappers, keys){
  ## get a list of mapping results (as data.frames)
  res <- lapply(colMappers, FUN=mapUniprot, from="ACC+ID", query=keys)
  ## Then merge all these mappings together based on UniProt.
  .mergeList(res, joinType="left")
}

And change that last line to read

.mergeList(res, joinType="all")

then save the file. Since this package doesn't have any compiled code, I believe you can then start R, change the working directory to the one that contains the unpacked UniProt.ws folder, and then do

install.packages("UniProt.ws", type = "source", repos = NULL)

and then you should be good to go.
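
To illustrate why the join type can matter, here is a minimal base-R sketch. It is not the package's .mergeList: the helper merge_all and the two small data frames are made up for the illustration, and it assumes that a key with no ENSEMBL mapping yields an empty table. Under that assumption, a left join that starts from the empty table has no rows to keep, whereas a full join retains the SGD row and fills ENSEMBL with NA.

```r
# Hypothetical per-column mapping tables (not real UniProt.ws output).
# Assumption: a key with no ENSEMBL mapping produces a zero-row table.
ensembl <- data.frame(UNIPROTKB = character(0), ENSEMBL = character(0),
                      stringsAsFactors = FALSE)
sgd <- data.frame(UNIPROTKB = "I2HB52", SGD = "S000028736",
                  stringsAsFactors = FALSE)

# Illustrative helper: merge a list of data frames pairwise on UNIPROTKB.
# keep_all = FALSE is a left join; keep_all = TRUE is a full outer join.
merge_all <- function(dfs, keep_all) {
  Reduce(function(a, b) merge(a, b, by = "UNIPROTKB",
                              all.x = TRUE, all.y = keep_all),
         dfs)
}

left_res <- merge_all(list(ensembl, sgd), keep_all = FALSE)
full_res <- merge_all(list(ensembl, sgd), keep_all = TRUE)

print(nrow(left_res))  # the left join keeps only rows of the (empty) first table
print(full_res)        # the full join keeps the SGD row; ENSEMBL is NA
```

The all-NA row reported earlier in the thread suggests the real code path differs in detail, so treat this only as an analogy for the joinType change.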

Reply written 4.9 years ago by James W. MacDonald