Hi all, I'm trying to create an organism package (I've done it in the past with no problems), but now I'm experiencing a weird error.
I've used the example in the vignette in order to explain everything as clearly as possible.
I copied and pasted the code from the vignette:
library("AnnotationForge")
## Makes an organism package for Zebra Finch data.frames:
finchFile <- system.file("extdata","finch_info.txt",package="AnnotationForge")
finch <- read.table(finchFile,sep="\t")
## not that this is how it should always be, but that it *could* be this way.
fSym <- finch[,c(2,3,9)]
fSym <- fSym[fSym[,2]!="-",]
fSym <- fSym[fSym[,3]!="-",]
colnames(fSym) <- c("GID","SYMBOL","GENENAME")
fChr <- finch[,c(2,7)]
fChr <- fChr[fChr[,2]!="-",]
colnames(fChr) <- c("GID","CHROMOSOME")
finchGOFile <- system.file("extdata","GO_finch.txt",package="AnnotationForge")
fGO <- read.table(finchGOFile,sep="\t")
fGO <- fGO[fGO[,2]!="",]
fGO <- fGO[fGO[,3]!="",]
colnames(fGO) <- c("GID","GO","EVIDENCE")
makeOrgPackage(gene_info=fSym, chromosome=fChr, go=fGO,
               version="0.1",
               maintainer="Some One <so@someplace.org>",
               author="Some One <so@someplace.org>",
               outputDir = ".",
               tax_id="59729",
               genus="Taeniopygia",
               species="guttata",
               goTable="go",
               verbose=T)
Once it finished, I obtained the following:
Creating package in ./org.Tguttata.eg.db 
Now deleting temporary database file
[1] "./org.Tguttata.eg.db"
There were 50 or more warnings (use warnings() to see the first 50)
The warnings are:
Warning messages:
1: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
2: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
3: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
4: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
5: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
6: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
7: In result_fetch(res@ptr, n = n) :
Then I load the package and try some keytypes:
library(org.Tguttata.eg.db)
head( keys(org.Tguttata.eg.db) )
ls("package:org.Tguttata.eg.db")
columns(org.Tguttata.eg.db)
head(keys(org.Tguttata.eg.db, keytype="CHROMOSOME"))
head(keys(org.Tguttata.eg.db, keytype="GID"))
But once I use GO or EVIDENCE as the keytype, I obtain the following errors:
> columns(org.Tguttata.eg.db)
 [1] "CHROMOSOME"  "EVIDENCE"    "EVIDENCEALL" "GENENAME"    "GID"         "GO"          "GOALL"       "ONTOLOGY"    "ONTOLOGYALL" "SYMBOL"     
> head(keys(org.Tguttata.eg.db, keytype="GO"))
Error in .deriveTableNameFromField(field = keytype, x) : 
  Two fields in the source DB have the same name.
> head(keys(org.Tguttata.eg.db, keytype="EVIDENCE"))
Error in .deriveTableNameFromField(field = keytype, x) : 
  Two fields in the source DB have the same name.
So I check the sqlite db:
>bash$>sqlite3 ./org.Psp.PAO1.eg.db/inst/extdata/org.Psp.PAO1.eg.sqlite
SQLite version 3.24.0 2018-06-04 14:10:15
Enter ".help" for usage hints.
sqlite> .tables
chromosome    go            go_bp_all     go_mf         map_metadata
gene_info     go_all        go_cc         go_mf_all     metadata    
genes         go_bp         go_cc_all     map_counts  
sqlite> .schema go
CREATE TABLE go  (
            _id INTEGER NOT NULL,                         -- REFERENCES genes
         GO  VARCHAR( 25 ) NOT NULL,    -- data
       EVIDENCE  VARCHAR( 25 ) NOT NULL,    -- data
       ONTOLOGY  VARCHAR( 25 ) NOT NULL,    -- data 
        FOREIGN KEY (_id)
        REFERENCES genes (_id));
CREATE INDEX go_GO_ind ON go (GO);
CREATE INDEX go_EVIDENCE_ind ON go (EVIDENCE);
CREATE INDEX go_ONTOLOGY_ind ON go (ONTOLOGY);
CREATE INDEX go__id_ind ON go (_id);
sqlite> select * from go limit 5;
1|GO:0003677|IEA|MF
1|GO:0003688|IEA|MF
1|GO:0006260|IEA|BP
1|GO:0043565|IEA|MF
2|GO:0006271|IEA|BP
sqlite>
I've checked the code and it seems that the error is related to this function:
## Keys method ##
.deriveTableNameFromField <- function(field, x){
  con <- dbconn(x)
  tables <- .getDataTables(con)
  colTabs <- lapply(tables, FUN=RSQLite::dbListFields, con=con)
  m <- unlist2(lapply(colTabs, match, field))
  tab <- names(m)[!is.na(m)]
  if(length(tab) > 1){stop("Two fields in the source DB have the same name.")}
  if(length(tab) == 0){stop("Did not find a field in the source DB.")}
  tab
}
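To see which fields would trigger that stop() call, one option is to list the fields of every table and look for names that occur in more than one table. The following is only a minimal sketch, not AnnotationDbi's own code: it builds a toy in-memory database whose `go` table mirrors the schema shown above, while the columns given to `go_bp`, `go_all` and `chromosome` are assumptions made for illustration, and `find_shared_fields` is a hypothetical helper name.

```r
library(DBI)
library(RSQLite)

# Hypothetical helper: for every field name that occurs in more than one of
# the given tables, return the names of the tables that contain it.
find_shared_fields <- function(con, tables = dbListTables(con)) {
  fields <- lapply(tables, function(tb) dbListFields(con, tb))
  owner <- rep(tables, lengths(fields))      # table name repeated once per field
  by_field <- split(owner, unlist(fields))   # field name -> tables containing it
  by_field[lengths(by_field) > 1]
}

# Toy database. Only the layout of `go` is taken from the schema above;
# the other tables and their columns are assumed for illustration.
con <- dbConnect(SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE go (_id INTEGER, GO TEXT, EVIDENCE TEXT, ONTOLOGY TEXT)")
dbExecute(con, "CREATE TABLE go_bp (_id INTEGER, GO TEXT, EVIDENCE TEXT)")
dbExecute(con, "CREATE TABLE go_all (_id INTEGER, GOALL TEXT, EVIDENCEALL TEXT, ONTOLOGYALL TEXT)")
dbExecute(con, "CREATE TABLE chromosome (_id INTEGER, CHROMOSOME TEXT)")

shared <- find_shared_fields(con)
print(shared)

dbDisconnect(con)
```

On the real package the connection would presumably come from `dbconn(org.Tguttata.eg.db)`. The `_id` key is expected to show up in every table; any keytype such as GO or EVIDENCE that is listed under more than one data table is a candidate for the "Two fields in the source DB have the same name" error.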
But I'm not sure how to proceed ...
So I hope that you guys can help me, because I've done everything I could with no success :(
Thanks in advance !!!!
Here is my sessionInfo():
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] org.Tguttata.eg.db_0.1 AnnotationForge_1.28.0 AnnotationDbi_1.48.0   IRanges_2.20.2         S4Vectors_0.24.4       Biobase_2.46.0         BiocGenerics_0.32.0   
loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3      GO.db_3.10.0    XML_3.99-0.3    digest_0.6.23   bitops_1.0-6    DBI_1.1.0       RSQLite_2.2.0   rlang_0.4.4     blob_1.2.1      vctrs_0.2.2    
[11] tools_3.6.2     bit64_0.9-7     RCurl_1.98-1.1  bit_1.1-15.1    yaml_2.2.0      compiler_3.6.2  pkgconfig_2.0.3 memoise_1.1.0
                    
                
                
This is remarkably similar to another recent issue - are you the same person? - https://support.bioconductor.org/p/130238/#130265
I provide an indirect solution in my answer.
Hi, thanks for your reply. I'm not Cei. I was searching for a while for similar errors and I could not find anything ....
The problem with your solution is that you "remove" the GO column, which is the one I need. I created the db package in order to use clusterProfiler or similar to perform GO enrichment in an RNA-seq study.
You could find a combination of columns to retain / remove such that GO is retained. This is just a simple fix, though.
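If only the gene-to-GO mapping is needed for an enrichment tool, another possible route is to skip keys()/select() and read the pairs straight from the SQLite tables. The sketch below runs on a toy in-memory database. The `go` table follows the schema posted above, but the layout of `genes` (`_id`, `GID`) is an assumption, since the thread does not show it, and the gene identifiers `geneA` and `geneB` are made up.

```r
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")

# Assumed layout of the genes table; the thread only shows the go table.
dbExecute(con, "CREATE TABLE genes (_id INTEGER PRIMARY KEY, GID TEXT)")
dbExecute(con, "CREATE TABLE go (_id INTEGER, GO TEXT, EVIDENCE TEXT, ONTOLOGY TEXT)")

# Made-up gene identifiers, paired with a few GO rows like those shown above.
dbExecute(con, "INSERT INTO genes VALUES (1, 'geneA'), (2, 'geneB')")
dbExecute(con, "INSERT INTO go VALUES
  (1, 'GO:0003677', 'IEA', 'MF'),
  (1, 'GO:0006260', 'IEA', 'BP'),
  (2, 'GO:0006271', 'IEA', 'BP')")

# Join the go table to genes on the internal _id to get GO-term / gene pairs.
term2gene <- dbGetQuery(con, "
  SELECT DISTINCT go.GO AS GO, genes.GID AS GID
  FROM go JOIN genes ON go._id = genes._id
  ORDER BY go.GO, genes.GID")
print(term2gene)

dbDisconnect(con)
```

With the real package, the same query could presumably be run against `dbconn(org.Tguttata.eg.db)` once the `genes` layout has been checked with `.schema genes`. A two-column data frame of term and gene is the shape that clusterProfiler's `enricher()` takes as its `TERM2GENE` argument.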