Question: UniProt.ws: getting Function [CC]
6 months ago, schmid10 wrote:

Hi,

I am using UniProt.ws, but I can't get all the information I would like to have.

I would like to retrieve a short description of the protein, which is called "Function [CC]" in the results table when I use UniProt's ID mapping.

Are there other possibilities than using ID mapping from UniProt directly?

Thanks a lot.


See column Function [CC] here.

https://www.uniprot.org/help/uniprotkb_column_names

written 6 months ago by l.nilse

6 months ago, James W. MacDonald (United States) wrote:

The UniProt.ws package only provides a subset of the data you can get directly from the UniProt website. You can see what's available by loading the package and then calling the columns function:

> library(UniProt.ws)
> ws <- UniProt.ws()
> columns(ws)
  [1] "3D"                         "AARHUS/GHENT-2DPAGE"       
  [3] "AGD"                        "ALLERGOME"                 
  [5] "ARACHNOSERVER"              "BIOCYC"                    
  [7] "CGD"                        "CITATION"                  
  [9] "CLEANEX"                    "CLUSTERS"                  
 [11] "COMMENTS"                   "CONOSERVER"                
 [13] "CYGD"                       "DATABASE(PDB)"             
 [15] "DATABASE(PFAM)"             "DICTYBASE"                 
 [17] "DIP"                        "DISPROT"                   
 [19] "DMDM"                       "DNASU"                     
 [21] "DOMAIN"                     "DOMAINS"                   
 [23] "DRUGBANK"                   "EC"                        
 [25] "ECHOBASE"                   "ECO2DBASE"                 
 [27] "ECOGENE"                    "EGGNOG"                    
 [29] "EMBL/GENBANK/DDBJ"          "EMBL/GENBANK/DDBJ_CDS"     
 [31] "ENSEMBL"                    "ENSEMBL_GENOMES"           
 [33] "ENSEMBL_GENOMES PROTEIN"    "ENSEMBL_GENOMES TRANSCRIPT"
 [35] "ENSEMBL_PROTEIN"            "ENSEMBL_TRANSCRIPT"        
 [37] "ENTREZ_GENE"                "ENTRY-NAME"                
 [39] "EUHCVDB"                    "EUPATHDB"                  
 [41] "EXISTENCE"                  "FAMILIES"                  
 [43] "FEATURES"                   "FLYBASE"                   
 [45] "GENECARDS"                  "GENEFARM"                  
 [47] "GENES"                      "GENETREE"                  
 [49] "GENOLIST"                   "GENOMERNAI"                
 [51] "GERMONLINE"                 "GI_NUMBER*"                
 [53] "GO"                         "GO-ID"                     
 [55] "HGNC"                       "H-INVDB"                   
 [57] "HOGENOM"                    "HPA"                       
 [59] "HSSP"                       "ID"                        
 [61] "INTERACTOR"                 "INTERPRO"                  
 [63] "KEGG"                       "KEYWORD-ID"                
 [65] "KEYWORDS"                   "KO"                        
 [67] "LAST-MODIFIED"              "LEGIOLIST"                 
 [69] "LENGTH"                     "LEPROMA"                   
 [71] "MAIZEGDB"                   "MEROPS"                    
 [73] "MGI"                        "MIM"                       
 [75] "MINT"                       "NEXTBIO"                   
 [77] "NEXTPROT"                   "OMA"                       
 [79] "ORGANISM"                   "ORGANISM-ID"               
 [81] "ORPHANET"                   "ORTHODB"                   
 [83] "PATHWAY"                    "PATRIC"                    
 [85] "PDB"                        "PEROXIBASE"                
 [87] "PHARMGKB"                   "PHOSSITE"                  
 [89] "PIR"                        "POMBASE"                   
 [91] "PPTASEDB"                   "PROTCLUSTDB"               
 [93] "PROTEIN-NAMES"              "PSEUDOCAP"                 
 [95] "REACTOME"                   "REBASE"                    
 <snip>

The link in l.nilse's comment gives the column name you should be looking for, which in this case is FUNCTION. I don't believe any of the data under the Function header are available through UniProt.ws.
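As a quick way to confirm whether a column is offered, the check below compares the names you want against the available ones. This is only a sketch: `missing_columns` is a hypothetical helper, and `available` is a hand-copied subset of the listing above rather than the full set. With the package loaded you would pass `columns(ws)` instead, and the names it returns may differ between package versions.

```r
# Hypothetical helper: report which wanted columns are not on offer.
missing_columns <- function(available, wanted) {
  setdiff(wanted, available)
}

# A few names copied by hand from the columns(ws) listing above
# (an illustrative subset, not the complete output).
available <- c("COMMENTS", "DOMAIN", "EC", "GO", "KEYWORDS",
               "PATHWAY", "PROTEIN-NAMES")

missing_columns(available, c("FUNCTION", "COMMENTS", "GO"))

# With a live session, assuming UniProt.ws is installed:
# missing_columns(columns(ws), c("FUNCTION", "COMMENTS", "GO"))
```

Anything the helper returns is a column that `select()` cannot be asked for in that session.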


Thanks, James. I filed a feature request: https://github.com/Bioconductor/UniProt.ws/issues/1

modified 6 months ago • written 6 months ago by l.nilse

I'll take a look at adding that. Right now you can return the COMMENTS column, but it only tells you which types of comments exist, without returning the individual comment columns themselves:

> ws <- UniProt.ws()
> select(ws, "P23434", c("COMMENTS"), "UNIPROTKB")
Getting extra data for P23434
'select()' returned 1:1 mapping between keys and columns
  UNIPROTKB
1    P23434
                                                                                                                            COMMENTS
1 Cofactor (1); Function (1); Involvement in disease (1); Sequence similarities (1); Subcellular location (1); Subunit structure (1)
> 
written 6 months ago by James W. MacDonald

6 months ago, James W. MacDonald (United States) wrote:
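To test programmatically whether an entry has a Function comment at all, the COMMENTS string can be split into comment types and counts. This is a minimal base R sketch that assumes the "Type (n); Type (n)" layout seen in the output above; `parse_comment_types` is a hypothetical helper, not part of UniProt.ws.

```r
# Hypothetical helper: turn "Cofactor (1); Function (1)" into a named
# integer vector of counts, keyed by comment type.
parse_comment_types <- function(x) {
  parts <- trimws(strsplit(x, ";", fixed = TRUE)[[1]])
  counts <- as.integer(sub("^.*\\(([0-9]+)\\)$", "\\1", parts))
  names(counts) <- sub("\\s*\\([0-9]+\\)$", "", parts)
  counts
}

# The COMMENTS value returned for P23434 in the transcript above.
comments <- paste0(
  "Cofactor (1); Function (1); Involvement in disease (1); ",
  "Sequence similarities (1); Subcellular location (1); ",
  "Subunit structure (1)"
)

types <- parse_comment_types(comments)
"Function" %in% names(types)
```

The names of `types` can then be used to decide which entries are worth a follow-up query.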

OK, fixed now. This is in the devel version of Bioconductor, as the release version is frozen. The updates will propagate through the build system in the next couple of days.

> select(ws, "P23434", "FUNCTION","UNIPROTKB")
Getting extra data for P23434
'select()' returned 1:1 mapping between keys and columns
  UNIPROTKB
1    P23434
                                                                                                                                                                                                                     FUNCTION
1 FUNCTION: The glycine cleavage system catalyzes the degradation of glycine. The H protein (GCSH) shuttles the methylamine group of glycine from the P protein (GLDC) to the T protein (GCST). {ECO:0000269|PubMed:1671321}.
> sessionInfo()
R Under development (unstable) (2018-02-01 r74194)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /data/oldR/R-devel/lib64/R/lib/libRblas.so
LAPACK: /data/oldR/R-devel/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] UniProt.ws_2.19.2   BiocGenerics_0.25.3 RCurl_1.95-4.10    
[4] bitops_1.0-6        RSQLite_2.1.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16         AnnotationDbi_1.41.4 magrittr_1.5        
 [4] bindr_0.1.1          rappdirs_0.3.1       IRanges_2.13.28     
 [7] bit_1.1-12           R6_2.2.2             rlang_0.2.0         
[10] httr_1.3.1           blob_1.1.1           dplyr_0.7.4         
[13] tools_3.5.0          Biobase_2.39.2       DBI_0.8             
[16] dbplyr_1.2.1         bit64_0.9-7          digest_0.6.15       
[19] assertthat_0.2.0     tibble_1.4.2         bindrcpp_0.2.2      
[22] S4Vectors_0.17.41    glue_1.2.0           memoise_1.1.0       
[25] BiocFileCache_1.3.42 pillar_1.2.1         compiler_3.5.0      
[28] stats4_3.5.0         pkgconfig_2.0.1     
> 
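The returned text carries a leading "FUNCTION: " label and a trailing evidence tag in braces. If only the description is wanted, something like the following base R sketch could strip both. It assumes every evidence tag has the "{ECO:...}" form shown above, and `clean_function_text` is a hypothetical helper, not part of UniProt.ws.

```r
# Hypothetical helper: drop the "FUNCTION: " label and any "{ECO:...}"
# evidence tags (plus the period that follows each tag).
clean_function_text <- function(x) {
  x <- sub("^FUNCTION:\\s*", "", x, perl = TRUE)
  x <- gsub("\\s*\\{ECO:[^}]*\\}\\.?", "", x, perl = TRUE)
  trimws(x)
}

# The FUNCTION value returned for P23434 in the transcript above.
raw <- paste(
  "FUNCTION: The glycine cleavage system catalyzes the degradation of glycine.",
  "The H protein (GCSH) shuttles the methylamine group of glycine from the",
  "P protein (GLDC) to the T protein (GCST). {ECO:0000269|PubMed:1671321}."
)

clean_function_text(raw)
```

Because `sub` and `gsub` are vectorised, the same helper can be applied to a whole FUNCTION column of a `select()` result.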

Great. Thanks for the fix, James. We will test and then close the GitHub issue.

written 6 months ago by l.nilse

5 months ago, schmid10 wrote:

Hi everyone,

Thank you very much for all your help and for adding the "Function" column to the package.

Regards
