Question: Error while making a database using the AnnotationForge package
2.8 years ago by kritikamish99 (India):

Hello All,

Please bear with me, I have a basic question. I am using the clusterProfiler package for a gene ontology study. Because an annotation database for my organism is not available for that package, I need to build one myself. To build the database I am using the AnnotationForge library; this is my command:

setwd(dir = "/home/Psuedo/tmp/")

makeOrgPackageFromNCBI(version="NC_002516.2",author="floyd@scripps.edu",maintainer= "floyd@scripps.edu",outputDir = ".",tax_id = "208964", genus="Pseudomonas" ,species = "Pseudomonas aeruginosa PAO1" )

It runs for 3-4 hours and then gives me this error:

Creating package in ./org.PPseudomonas aeruginosa PAO1.eg.db
chmod: cannot access ‘./org.PPseudomonas’: No such file or directory
chmod: cannot access ‘aeruginosa’: No such file or directory
chmod: cannot access ‘PAO1.eg.db/inst/extdata/org.PPseudomonas’: No such file or directory
chmod: cannot access ‘aeruginosa’: No such file or directory
chmod: cannot access ‘PAO1.eg.sqlite’: No such file or directory
Now deleting temporary database file
complete!
[1] "org.PPseudomonas aeruginosa PAO1.eg.sqlite"
Warning message:
In .makeAnnDbPkg(x, dbfile, dest_dir = dest_dir, no.man = no.man,  :
  chmod 444 ./org.PPseudomonas aeruginosa PAO1.eg.db/inst/extdata/org.PPseudomonas aeruginosa PAO1.eg.sqlite failed

The directory has full permissions, but I still get the error.

These are the output files that were generated:

gene2accession.gz           gene_info.gz
gene2go.gz                       idmapping_selected.tab.gz
gene2pubmed.gz              NCBI.sqlite
gene2refseq.gz                 org.PPseudomonas aeruginosa PAO1.eg.db

but the run still ends with the error shown above.

Please help me out, as I need this urgently.

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C               LC_TIME=en_IN.UTF-8       
 [4] LC_COLLATE=en_IN.UTF-8     LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8   
 [7] LC_PAPER=en_IN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GO.db_3.3.0            RSQLite_1.0.0          DBI_0.4-1              httr_1.2.1            
 [5] biomaRt_2.28.0         RCurl_1.95-4.8         bitops_1.0-6           AnnotationForge_1.14.2
 [9] AnnotationDbi_1.34.4   IRanges_2.6.1          S4Vectors_0.10.2       Biobase_2.32.0        
[13] BiocGenerics_0.18.0   

loaded via a namespace (and not attached):
[1] XML_3.98-1.4       GenomeInfoDb_1.8.3 R6_2.1.2           tools_3.3.1       
>

 

Answer: Error while making a database using the AnnotationForge package
2.8 years ago by Guido Hooiveld (Wageningen University, Wageningen, the Netherlands):

Hi,

I am not an expert on bacteria, and this is not really an answer to your question (though I noticed you have spaces in the species name), but please note that an EntrezGene-based (NCBI) annotation library for this strain is already available through the AnnotationHub infrastructure.

 

> library(AnnotationHub)
> hub = AnnotationHub()
snapshotDate(): 2016-07-14
>
> # Query for Pseudomonas
> query(hub, c("Pseudomonas"))
AnnotationHub with 35 records
# snapshotDate(): 2016-07-14
# $dataprovider: NCBI, ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, Inparanoid8
# $species: Pseudomonas aeruginosa, Pseudomonas aeruginosa_PAO1, Pseudomonas...
# $rdataclass: OrgDb, Inparanoid8Db
# additional mcols(): taxonomyid, genome, description, tags, sourceurl,
#   sourcetype
# retrieve records with, e.g., 'object[["AH10565"]]'

            title                                                
  AH10565 | hom.Pseudomonas_aeruginosa.inp8.sqlite               
  AH12818 | org.Pseudomonas_mendocina_NK-01.eg.sqlite            
  AH12869 | org.Pseudomonas_putida_KT2440.eg.sqlite              
  AH12938 | org.Pseudomonas_syringae_pv._syringae_B728a.eg.sqlite
  AH12940 | org.Pseudomonas_fluorescens_Pf0-1.eg.sqlite          
  ...       ...                                                  
  AH48510 | org.Pseudomonas_aeruginosa_PAO1.eg.sqlite            
  AH48516 | org.Pseudomonas_putida_KT2440.eg.sqlite              
  AH48538 | org.Pseudomonas_syringae_pv._syringae_B728a.eg.sqlite
  AH48582 | org.Pseudomonas_mendocina_ymp.eg.sqlite              
  AH48621 | org.Pseudomonas_stutzeri_A1501.eg.sqlite             
>
>
> # Now see that: AH48510 | org.Pseudomonas_aeruginosa_PAO1.eg.sqlite
>
> org.PA01.eg.db <- hub[["AH48510"]]
>
> # check
> columns(org.PA01.eg.db)#which annotation data can be retrieved?
 [1] "ACCNUM"      "ALIAS"       "ENTREZID"    "EVIDENCE"    "EVIDENCEALL"
 [6] "GENENAME"    "GID"         "GO"          "GOALL"       "ONTOLOGY"   
[11] "ONTOLOGYALL" "PMID"        "REFSEQ"      "SYMBOL"     
> keytypes(org.PA01.eg.db)#which identifiers can be queried with?
 [1] "ACCNUM"      "ALIAS"       "ENTREZID"    "EVIDENCE"    "EVIDENCEALL"
 [6] "GENENAME"    "GID"         "GO"          "GOALL"       "ONTOLOGY"   
[11] "ONTOLOGYALL" "PMID"        "REFSEQ"      "SYMBOL"     
>
> select(org.PA01.eg.db, head(keys(org.PA01.eg.db)), c("SYMBOL", "GENENAME", "GO"))
'select()' returned 1:many mapping between keys and columns
      GID SYMBOL                  GENENAME         GO
1  877569  pyoS5                 pyocin S5 GO:0016021
2  877569  pyoS5                 pyocin S5 GO:0019835
3  877569  pyoS5                 pyocin S5 GO:0050829
4  877570 PA1021       enoyl-CoA hydratase GO:0016853
5  877571 PA0123 transcriptional regulator GO:0003677
6  877571 PA0123 transcriptional regulator GO:0003700
7  877571 PA0123 transcriptional regulator GO:0006351
8  877572 PA0891      hypothetical protein GO:0016788
9  877572 PA0891      hypothetical protein GO:0046872
10 877573 PA1020    acyl-CoA dehydrogenase GO:0050660
11 877573 PA1020    acyl-CoA dehydrogenase GO:0016627
12 877574 PA0440            oxidoreductase GO:0005737
13 877574 PA0440            oxidoreductase GO:0051536
14 877574 PA0440            oxidoreductase GO:0016491
>
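Since the goal is a gene ontology analysis, a small sketch of reshaping the select() result may help. It rebuilds the first rows of the output above as a plain data frame (in practice you would use the object returned by select() directly) and reduces them to a two-column GO-to-gene table. As far as I know, this is the layout that generic enrichment functions such as clusterProfiler's enricher() accept through the TERM2GENE argument. The names res, term2gene and go_per_gene are only illustrative.

```r
# Rows copied from the select() output shown above (first three genes only).
res <- data.frame(
  GID    = c("877569", "877569", "877569", "877570",
             "877571", "877571", "877571"),
  SYMBOL = c("pyoS5", "pyoS5", "pyoS5", "PA1021",
             "PA0123", "PA0123", "PA0123"),
  GO     = c("GO:0016021", "GO:0019835", "GO:0050829", "GO:0016853",
             "GO:0003677", "GO:0003700", "GO:0006351"),
  stringsAsFactors = FALSE
)

# Two-column term-to-gene table: GO term first, gene identifier second.
# Genes without any GO annotation (NA) are dropped, duplicates removed.
term2gene <- unique(res[!is.na(res$GO), c("GO", "GID")])

# Number of distinct GO terms annotated to each gene.
go_per_gene <- table(term2gene$GID)
print(go_per_gene)
```

Keeping the table in long format (one row per term-gene pair) avoids having to parse concatenated GO strings later.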
>
>
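On the spaces in the species name mentioned above: judging from the error output, the package directory name appears to be assembled as "org." + first letter of genus + species + ".eg.db", so any space in species ends up in the path and the unquoted chmod call splits it into several arguments. The sketch below only mimics that naming pattern to show the effect; org_pkg_name and has_space are hypothetical helpers, not AnnotationForge functions. The commented call at the end is an untested suggestion, and version = "0.1" is an assumed value (the argument is the version of the generated package, not a sequence accession).

```r
# Hypothetical helper: mimics the naming pattern visible in the error output.
# It is NOT the AnnotationForge implementation.
org_pkg_name <- function(genus, species) {
  paste0("org.", substr(genus, 1, 1), species, ".eg.db")
}

# Values from the original call: the spaces end up in the directory name.
bad <- org_pkg_name("Pseudomonas", "Pseudomonas aeruginosa PAO1")

# Species epithet only, no spaces.
good <- org_pkg_name("Pseudomonas", "aeruginosa")

# TRUE if the name contains a space that an unquoted shell command would split on.
has_space <- function(x) grepl(" ", x, fixed = TRUE)
print(c(bad = has_space(bad), good = has_space(good)))

# A corresponding call might look like this (not run here: it needs
# AnnotationForge, network access and several hours):
# makeOrgPackageFromNCBI(version = "0.1",              # assumed package version
#                        author = "floyd@scripps.edu",
#                        maintainer = "floyd@scripps.edu",
#                        outputDir = ".",
#                        tax_id = "208964",
#                        genus = "Pseudomonas",
#                        species = "aeruginosa")
```

With the original arguments the helper reproduces the directory name from the error message, which is why I suspect the species value rather than file permissions.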

Hi Guido, thanks for your help.

I have one more question. Suppose I have de novo data, for which no reference is available. Does that mean I can't use clusterProfiler for my de novo samples, or do I need to customise an AnnotationHub resource specifically for my sample?

written 2.8 years ago by kritikamish99

Mmm, I am sorry but I don't understand your question...

written 2.8 years ago by Guido Hooiveld