Hi all,
Is there an easy way to turn an OrgDb object from AnnotationHub into a package? These objects work with the nifty select(), keytypes(), etc. accessor functions, and some functions like goana() can use them fine, but other functions that attempt to load the annotation as a package end up throwing errors. I did some searching and there are brief mentions of this issue here (A: Error in makeOrgPackageFromNCBI for Medicago truncatula) and here (how to use "non-standard" species for KEGG / GO analysis in limma?) but no answers. Is the best option currently to use AnnotationForge and makeOrgPackageFromNCBI()?
Thanks,
Jenny
> library(AnnotationHub)
Loading required package: BiocGenerics
Loading required package: parallel
#lines removed
> library(pathview)
Loading required package: org.Hs.eg.db
Loading required package: AnnotationDbi
#lines removed
> ah <- AnnotationHub()
snapshotDate(): 2016-10-11
> query(ah, "Nannospalax")
AnnotationHub with 1 record
# snapshotDate(): 2016-10-11
# names(): AH52167
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Nannospalax galili
# $rdataclass: OrgDb
# $title: org.Nannospalax_galili.eg.sqlite
# $description: NCBI gene ID based annotations about Nannospalax g...
# $taxonomyid: 1026970
# $genome: NCBI genomes
# $sourcetype: NCBI/UniProt
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.uni...
# $sourcelastmodifieddate: NA
# $sourcesize: NA
# $tags: c("NCBI", "Gene", "Annotation")
# retrieve record with 'object[["AH52167"]]'
> org.Ng.eg.db <- ah[["AH52167"]]
loading from cache ‘C:/Users/drnevich/Documents/AppData/.AnnotationHub/58905’
Warning message:
vfs customization not available on this platform. Ignoring value: vfs = unix-none
> data(korg)
> #Need to add spalax to pathview's korg database cause it's not in for some reason
> korg <- rbind(korg, c("ngi","Nannospalax galili", "spalax", "1", "103724393","103724393"))
> pathview(gene.data = keys(org.Ng.eg.db, keytype = "ENTREZID")[1:1000],
+ pathway.id = "04080", kegg.dir = "BasePathwayMaps",
+ species = "ngi", out.suffix = "test", kegg.native = T,
+ same.layer = F, gene.annotpkg = org.Ng.eg.db)
Info: Downloading xml files for ngi04080, 1/1 pathways..
Info: Downloading png files for ngi04080, 1/1 pathways..
Error in !pkg.on : invalid argument type
In addition: Warning message:
In is.na(gene.annotpkg) : is.na() applied to non-(list or vector) of type 'S4'
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils
[7] datasets methods base

other attached packages:
[1] pathview_1.14.0 org.Hs.eg.db_3.4.0 AnnotationDbi_1.36.1
[4] IRanges_2.8.1 S4Vectors_0.12.1 Biobase_2.34.0
[7] AnnotationHub_2.6.4 BiocGenerics_0.20.0

loaded via a namespace (and not attached):
[1] graph_1.52.0 Rcpp_0.12.9
[3] KEGGgraph_1.32.0 XVector_0.14.0
[5] zlibbioc_1.20.0 xtable_1.8-2
[7] R6_2.2.0 httr_1.2.1
[9] tools_3.3.2 grid_3.3.2
[11] png_0.1-7 DBI_0.5-1
[13] htmltools_0.3.5 yaml_2.1.14
[15] digest_0.6.12 interactiveDisplayBase_1.12.0
[17] shiny_1.0.0 Rgraphviz_2.18.0
[19] curl_2.3 KEGGREST_1.14.0
[21] memoise_1.0.0 RSQLite_1.1-2
[23] mime_0.5 BiocInstaller_1.24.0
[25] Biostrings_2.42.1 XML_3.98-1.5
[27] httpuv_1.3.3
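For readers landing on this question, here is a minimal sketch of one way to wrap a hub-retrieved OrgDb in an installable package, using AnnotationForge's AnnDbPkgSeed class and makeAnnDbPkg(). It is an illustration under assumptions, not necessarily the answer the later replies refer to: the helper name build_orgdb_package, the package name org.Ng.eg.db, the version number and the author/maintainer strings are all placeholders, and the NOSCHEMA.DB template is assumed to suit an NCBI-derived OrgDb.

```r
# Sketch: wrap an OrgDb retrieved from AnnotationHub in an installable package.
# The function name, package name, version and author details are placeholders.
build_orgdb_package <- function(hub_id = "AH52167",
                                pkg_name = "org.Ng.eg.db",
                                organism = "Nannospalax galili",
                                author = "Your Name",
                                maintainer = "Your Name <you@example.org>",
                                dest_dir = ".") {
  library(AnnotationHub)
  library(AnnotationDbi)
  library(AnnotationForge)

  # Retrieve the OrgDb object from the hub (downloads it on first use).
  ah <- AnnotationHub()
  orgdb <- ah[[hub_id]]

  # Copy the cached SQLite file to a file named after the package prefix.
  prefix <- sub("\\.db$", "", pkg_name)
  sqlite_file <- file.path(dest_dir, paste0(prefix, ".sqlite"))
  file.copy(dbfile(orgdb), sqlite_file, overwrite = TRUE)

  # Describe the package to build; "NOSCHEMA.DB" is an assumed template choice.
  seed <- new("AnnDbPkgSeed",
              Package = pkg_name,
              Version = "0.0.1",
              Author = author,
              Maintainer = maintainer,
              PkgTemplate = "NOSCHEMA.DB",
              AnnObjPrefix = prefix,
              organism = organism,
              species = organism,
              biocViews = "annotation",
              manufacturerUrl = "none",
              manufacturer = "none",
              chipName = "none")

  # Write the package source tree into dest_dir, then install it from source.
  makeAnnDbPkg(seed, sqlite_file, dest_dir = dest_dir)
  pkg_path <- file.path(dest_dir, pkg_name)
  install.packages(pkg_path, type = "source", repos = NULL)
  invisible(pkg_path)
}
```

Once the package is installed, functions that expect a package name rather than an object can probably be given the name as a string, for example gene.annotpkg = "org.Ng.eg.db" in the pathview() call above.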
Thanks, Jim - it worked!
When I followed this code, I got the error below:
Installing package into ‘/home/bioinfo/R/x86_64-pc-linux-gnu-library/4.2’
(as ‘lib’ is unspecified)
Warning: invalid package ‘./org.Oz.eg.db’
Error: ERROR: no packages specified
Warning message:
In install.packages("./org.Oz.eg.db", type = "source", repos = NULL) :
  installation of package ‘./org.Oz.eg.db’ had non-zero exit status
It would help if you showed the exact code you used...
Hello Jim,
I am working with the species Plasmopara halstedii and would like to build a package for it, i.e. org.Ph.eg.db, from AnnotationHub, but I ran into the following:
> ah <- AnnotationHub()
snapshotDate(): 2017-10-27
> query(ah, "halstedii")
AnnotationHub with 0 records
# snapshotDate(): 2017-10-27
Warning message:
call dbDisconnect() when finished working with a connection
Could you suggest a way forward?
You got a warning, because evidently you are using an old R/Bioc installation. I don't get the warning:
But the fact that I get zero records indicates that there isn't an OrgDb on AnnotationHub for this species. In addition, NCBI says there are only 19 genes for this virus, and about are just partial cds. So not a well annotated organism, so far as NCBI is concerned.
Thanks!
So if I have transcriptomic data for this species and use its genome and annotation as the reference, would that generate errors because of the partial CDS?
Please let me know what I should use as the reference genome for Plasmopara halstedii.
There are different levels of annotation... NCBI does have a genome for Plasmopara halstedii and it has 15,469 predicted proteins (https://www.ncbi.nlm.nih.gov/genome/?term=Plasmopara+halstedii). So there should be the locations for the exons making up these proteins in the .gff file, which is one level of annotation. If you click on the "protein count: 15469" link (https://www.ncbi.nlm.nih.gov/genome/proteins/42828?genome_assembly_id=263864) there is some information on what these proteins are, so that level of annotation is also available, to a small degree. What is not available is sequenced cDNAs, although 15,469 genes in a virus seem way too high - I have no idea what is going on! Regardless, an OrgDb package for this species is not available through AnnotationHub. Hopefully the protein names are contained in the gff file and you can pull them out from there.
Hello Jim,
AnnotationForge successfully created the database for P. halstedii from NCBI, using the latest R (3.5.1) together with the biomaRt and GenomeInfoDb libraries.
Thanks for the help!
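For anyone wanting to repeat this, a minimal sketch of the AnnotationForge route might look like the following. The exact call used above was not posted, so this is an assumption-laden illustration: the wrapper name build_orgdb_from_ncbi, the version string and the author/maintainer details are placeholders, and the taxonomy ID has to be looked up in NCBI Taxonomy.

```r
# Sketch: build an OrgDb package directly from NCBI data with AnnotationForge.
# The wrapper name, version string and author details are placeholders.
build_orgdb_from_ncbi <- function(tax_id, genus, species,
                                  author = "Your Name <you@example.org>",
                                  maintainer = author,
                                  version = "0.1",
                                  output_dir = ".") {
  library(AnnotationForge)

  # On first use this downloads several large NCBI files, so it can take
  # a long time and needs a fair amount of disk space.
  makeOrgPackageFromNCBI(version = version,
                         author = author,
                         maintainer = maintainer,
                         outputDir = output_dir,
                         tax_id = tax_id,
                         genus = genus,
                         species = species)
}
```

A call would then look like build_orgdb_from_ncbi(tax_id = "<NCBI taxonomy ID>", genus = "Plasmopara", species = "halstedii"), after which the package directory written to output_dir can be installed with install.packages(<path to that directory>, type = "source", repos = NULL).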