I'm trying to create a custom annotation for human genes using the AnnotationForge package and then run an enrichment analysis with clusterProfiler. I am able to construct the necessary data frames for makeOrgPackage(), retrieving the GO information I'm interested in from the Homo.sapiens package. My problem comes when I run makeOrgPackage(), which returns the following error:
Populating genes table:
genes table filled
Populating gene_info table:
gene_info table filled
Populating go table:
Error in rsqlite_send_query(conn@ptr, statement) :
NOT NULL constraint failed: go.GO
In addition: There were 15 warnings (use warnings() to see them)
I have also noticed that some of the GO terms that I'm trying to retrieve from the Homo.sapiens package don't appear (for example GO:0099536).
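The `NOT NULL constraint failed: go.GO` message means that at least one row written to the go table has a missing value in its GO column. A minimal sketch of how one might check for that before calling makeOrgPackage(), assuming a hypothetical data frame `go_df` with the GID, GO and EVIDENCE columns (the toy values below are made up for illustration and are not from the original post):

```r
# Hypothetical GO data frame in the shape used for a "go" table:
# GID (gene id), GO (term id), EVIDENCE (evidence code).
go_df <- data.frame(
  GID      = c("1", "2", "3", "3"),
  GO       = c("GO:0008150", NA, "GO:0003674", "GO:0003674"),
  EVIDENCE = c("IEA", NA, "IDA", "IDA"),
  stringsAsFactors = FALSE
)

# Count rows that would violate the NOT NULL constraint on go.GO.
n_missing <- sum(is.na(go_df$GO))

# Drop rows with a missing GO id, then drop exact duplicate rows.
go_clean <- unique(go_df[!is.na(go_df$GO), ])
```

Genes with no GO annotation typically come back from a lookup with NA in the GO column, so filtering them out first is one plausible way to get past this error; whether it is the cause here is an assumption.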
Thank you all.
sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.6
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] org.Hsapiens2.eg.db_0.1
[2] AnnotationForge_1.16.1
[3] RamiGO_1.20.0
[4] gsubfn_0.6-6
[5] proto_1.0.0
[6] BiocInstaller_1.24.0
[7] Homo.sapiens_1.3.1
[8] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[9] GO.db_3.4.0
[10] OrganismDbi_1.16.0
[11] GenomicFeatures_1.26.4
[12] GOSemSim_2.0.4
[13] GOplot_1.0.2
[14] gridExtra_2.2.1
[15] ggrepel_0.6.5
[16] sva_3.22.0
[17] genefilter_1.56.0
[18] mgcv_1.8-18
[19] nlme_3.1-131
[20] reshape2_1.4.2
[21] ggdendro_0.1-20
[22] gplots_3.0.1
[23] RColorBrewer_1.1-2
[24] ggpubr_0.1.4
[25] magrittr_1.5
[26] ReactomePA_1.18.1
[27] pathview_1.14.0
[28] org.Hs.eg.db_3.4.0
[29] AnnotationDbi_1.36.2
[30] clusterProfiler_3.2.14
[31] DOSE_3.0.10
[32] biomaRt_2.30.0
[33] oce_0.9-21
[34] gsw_1.0-4
[35] testthat_1.0.2
[36] regionReport_1.8.2
[37] pheatmap_1.0.8
[38] ggplot2_2.2.1
[39] ape_4.1
[40] DESeq2_1.14.1
[41] SummarizedExperiment_1.4.0
[42] Biobase_2.34.0
[43] GenomicRanges_1.26.4
[44] GenomeInfoDb_1.10.3
[45] IRanges_2.8.2
[46] S4Vectors_0.12.2
[47] BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] RSQLite_2.0 htmlwidgets_0.9
[3] grid_3.3.3 BiocParallel_1.8.2
[5] munsell_0.4.3 codetools_0.2-15
[7] colorspace_1.3-2 knitr_1.16
[9] KEGGgraph_1.32.0 bit64_0.9-7
[11] rprojroot_1.2 R6_2.2.2
[13] markdown_0.8 locfit_1.5-9.1
[15] bitops_1.0-6 fgsea_1.0.2
[17] assertthat_0.2.0 scales_0.4.1
[19] nnet_7.3-12 derfinder_1.8.5
[21] gtable_0.2.0 rlang_0.1.1
[23] splines_3.3.3 rtracklayer_1.34.2
[25] lazyeval_0.2.0 acepack_1.4.1
[27] checkmate_1.8.3 backports_1.1.0
[29] qvalue_2.6.0 Hmisc_4.0-3
[31] RBGL_1.50.0 tools_3.3.3
[33] tcltk_3.3.3 RCytoscape_1.24.1
[35] Rcpp_0.12.12 plyr_1.8.4
[37] base64enc_0.1-3 zlibbioc_1.20.0
[39] RCurl_1.95-4.8 rpart_4.1-11
[41] bumphunter_1.14.0 GenomicFiles_1.10.3
[43] cluster_2.0.6 data.table_1.10.4
[45] DO.db_2.9 reactome.db_1.58.0
[47] matrixStats_0.52.2 evaluate_0.10.1
[49] xtable_1.8-2 XML_3.98-1.9
[51] tibble_1.3.3 KernSmooth_2.23-15
[53] crayon_1.3.2 htmltools_0.3.6
[55] Formula_1.2-2 tidyr_0.6.3
[57] geneplotter_1.52.0 lubridate_1.6.0
[59] DBI_0.7 MASS_7.3-47
[61] rappdirs_0.3.1 XMLRPC_0.3-0
[63] Matrix_1.2-10 gdata_2.18.0
[65] derfinderHelper_1.8.1 bindr_0.1
[67] igraph_1.1.2 pkgconfig_2.0.1
[69] GenomicAlignments_1.10.1 registry_0.3
[71] RefManageR_0.14.12 foreign_0.8-69
[73] xml2_1.1.1 foreach_1.4.3
[75] annotate_1.52.1 rngtools_1.2.4
[77] DEFormats_1.2.0 pkgmaker_0.22
[79] XVector_0.14.1 bibtex_0.4.2
[81] knitcitations_1.0.8 doRNG_1.6.6
[83] stringr_1.2.0 VariantAnnotation_1.20.3
[85] digest_0.6.12 graph_1.52.0
[87] Biostrings_2.42.1 rmarkdown_1.6
[89] fastmatch_1.1-0 htmlTable_1.9
[91] edgeR_3.16.5 Rsamtools_1.26.2
[93] gtools_3.5.0 graphite_1.20.1
[95] knitrBootstrap_1.0.1 jsonlite_1.5
[97] bindrcpp_0.2 limma_3.30.13
[99] BSgenome_1.42.0 lattice_0.20-35
[101] KEGGREST_1.14.1 httr_1.2.1
[103] survival_2.41-3 glue_1.1.1
[105] png_0.1-7 iterators_1.0.8
[107] bit_1.1-12 Rgraphviz_2.18.0
[109] stringi_1.1.5 blob_1.1.0
[111] latticeExtra_0.6-28 caTools_1.17.1
[113] memoise_1.1.0 dplyr_0.7.2
It seems you are trying to do some fancy stuff, but it's not clear why. Is there some reason that the existing annotation packages are not useful to you?
I want to remove certain genes from the annotation to see whether the signal I have is due to specific genes or to the group as a whole.
Rather than trying to rebuild a new package, why don't you use the data.frame format that is described in the clusterProfiler vignette? It would be simple to do what you want that way.
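A rough sketch of what the data.frame approach could look like, assuming clusterProfiler's enricher() function with its TERM2GENE argument. The term names, gene ids and the `genes_to_drop` vector are hypothetical placeholders, not data from this thread:

```r
# Hypothetical term-to-gene mapping: first column is the term,
# second column is the gene (the layout TERM2GENE expects).
term2gene <- data.frame(
  term = c("TERM_A", "TERM_A", "TERM_A", "TERM_B", "TERM_B"),
  gene = c("g1", "g2", "g3", "g3", "g4"),
  stringsAsFactors = FALSE
)

# Genes to exclude from the annotation (hypothetical).
genes_to_drop <- c("g3")
term2gene_reduced <- term2gene[!term2gene$gene %in% genes_to_drop, ]

# Hypothetical gene list to test for enrichment.
genes_of_interest <- c("g1", "g2")

# Run the enrichment only if clusterProfiler is installed.
# The cutoffs are relaxed because the toy data set is tiny.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    gene         = setdiff(genes_of_interest, genes_to_drop),
    TERM2GENE    = term2gene_reduced,
    pvalueCutoff = 1,
    qvalueCutoff = 1,
    minGSSize    = 1
  )
}
```

Running the same call once with `term2gene` and once with `term2gene_reduced` would let the two results be compared without building a new annotation package.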