Question

makeOrgPackage Error in .deriveTableNameFromField(field = keytype, x) : Two fields in the source DB have the same name

tzaquin • 0

Hi all, I hope you can help me. I'm not very computer savvy, but I tried to create an org.db package since it didn't look too hard. In fact, 3 days ago I was able to build one, but when I tried to build another I got this error: .deriveTableNameFromField(field = keytype, x) : Two fields in the source DB have the same name. I thought I might have mistakenly used non-unique column names, but I couldn't find any such problem. Then I went back to the authors' example, which fails the same way:

> setwd("~/test/")
> library(AnnotationForge)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply,
    parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval,
    evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid


Attaching package: ‘IRanges’

The following object is masked from ‘package:grDevices’:

    windows

> 
> finchFile <- system.file("extdata","finch_info.txt",
+                          package="AnnotationForge")
> finch <- read.table(finchFile,sep="\t")
> 
> fSym <- finch[,c(2,3,9)]
> fSym <- fSym[fSym[,2]!="-",]
> fSym <- fSym[fSym[,3]!="-",]
> colnames(fSym) <- c("GID","SYMBOL","GENENAME")
> 
> fChr <- finch[,c(2,7)]
> fChr <- fChr[fChr[,2]!="-",]
> colnames(fChr) <- c("GID","CHROMOSOME")
> 
> finchGOFile <- system.file("extdata","GO_finch.txt",
+                            package="AnnotationForge")
> fGO <- read.table(finchGOFile,sep="\t")
> fGO <- fGO[fGO[,2]!="",]
> fGO <- fGO[fGO[,3]!="",]
> colnames(fGO) <- c("GID","GO","EVIDENCE")
> 
> makeOrgPackage(gene_info=fSym, chromosome=fChr, go=fGO,
+                version="0.1",
+                maintainer="Some One <so@someplace.org>",
+                author="Some One <so@someplace.org>",
+                outputDir = ".",
+                tax_id="59729",
+                genus="Taeniopygia",
+                species="guttata",
+                goTable="go")
Populating genes table:
genes table filled
Populating gene_info table:
gene_info table filled
Populating chromosome table:
chromosome table filled
Populating go table:
go table filled
table metadata filled

'select()' returned many:1 mapping between keys and columns
Dropping GO IDs that are too new for the current GO.db
Populating go table:
go table filled
Populating go_bp table:
go_bp table filled
Populating go_cc table:
go_cc table filled
Populating go_mf table:
go_mf table filled
'select()' returned many:1 mapping between keys and columns
Populating go_bp_all table:
go_bp_all table filled
Populating go_cc_all table:
go_cc_all table filled
Populating go_mf_all table:
go_mf_all table filled
Populating go_all table:
go_all table filled
Creating package in ./org.Tguttata.eg.db 
Now deleting temporary database file
[1] "./org.Tguttata.eg.db"
There were 50 or more warnings (use warnings() to see the first 50)
> install.packages("./org.Tguttata.eg.db", repos=NULL, type="source")
* installing *source* package 'org.Tguttata.eg.db' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
  converting help for package 'org.Tguttata.eg.db'
    finding HTML links ... done
    org.Tguttata.egBASE                     html  
    org.Tguttata.egORGANISM                 html  
    org.Tguttata.eg_dbconn                  html  
** building package indices
** testing if installed package can be loaded from temporary location
*** arch - i386
*** arch - x64
** testing if installed package can be loaded from final location
*** arch - i386
*** arch - x64
** testing if installed package keeps a record of temporary installation path
* DONE (org.Tguttata.eg.db)
> library(org.Tguttata.eg.db)

> keytypes(org.Tguttata.eg.db)
 [1] "CHROMOSOME"  "EVIDENCE"    "EVIDENCEALL" "GENENAME"    "GID"         "GO"          "GOALL"       "ONTOLOGY"   
 [9] "ONTOLOGYALL" "SYMBOL"     
> keys(org.Tguttata.eg.db,"GO")
Error in .deriveTableNameFromField(field = keytype, x) : 
  Two fields in the source DB have the same name.

If anyone can help, I will very much appreciate it.

Tal

software error go AnnotationForge AnnotationDbi • 799 views

ADD COMMENT • link updated 4.2 years ago by James W. MacDonald 65k • written 4.2 years ago by tzaquin • 0

I forgot to add the traceback:

> traceback()
7: stop("Two fields in the source DB have the same name.")
6: .deriveTableNameFromField(field = keytype, x)
5: .noSchemaKeys(x, keytype)
4: .keys(x, keytype)
3: smartKeys(x = x, keytype = keytype, ..., FUN = .keys)
2: keys(org.Tguttata.eg.db, "GO")
1: keys(org.Tguttata.eg.db, "GO")

So I'm guessing there is some kind of duplication, but where does it come from?

Tal

ADD REPLY • link 4.2 years ago tzaquin • 0

score 0 · Answer 1 · 2020-01-27

You are creating a package that already exists, so I would recommend checking AnnotationHub first:

> library(AnnotationHub)

> hub <- AnnotationHub()
> query(hub, c("guttata","orgdb"))

AnnotationHub with 4 records
# snapshotDate(): 2019-10-29 
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Taenopygia guttata, Taeniopygia guttata, Poephila guttata, Eryth...
# $rdataclass: OrgDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH76121"]]' 

            title                            
  AH76121 | org.Erythranthe_guttata.eg.sqlite
  AH76438 | org.Poephila_guttata.eg.sqlite   
  AH76439 | org.Taeniopygia_guttata.eg.sqlite
  AH76440 | org.Taenopygia_guttata.eg.sqlite  ## Typo in the name, hence two of these?

> orgdb <- hub[["AH76440"]]

> orgdb
OrgDb object:
| DBSCHEMAVERSION: 2.1
| DBSCHEMA: NOSCHEMA_DB
| ORGANISM: Taenopygia guttata
| SPECIES: Taenopygia guttata
| CENTRALID: GID
| Taxonomy ID: 59729
| Db type: OrgDb
| Supporting package: AnnotationDbi

Please see: help('select') for usage information
> columns(orgdb)
 [1] "ACCNUM"      "ALIAS"       "CHR"         "ENSEMBL"     "ENTREZID"   
 [6] "EVIDENCE"    "EVIDENCEALL" "GENENAME"    "GID"         "GO"         
[11] "GOALL"       "ONTOLOGY"    "ONTOLOGYALL" "PMID"        "REFSEQ"     
[16] "SYMBOL"

There is a problem getting the keys for the 'GO' column: an internal function (.deriveTableNameFromField, going by your traceback) tries to figure out which table you want and fails. That needs to be fixed, but I am not sure right now how to do that easily. But maybe you were just checking something and don't need the GO keys?
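If you want to see what that lookup might be tripping over, you can inspect the underlying SQLite database yourself. The following is only a rough sketch: tablesWithField is a hypothetical helper name (it is not part of AnnotationDbi), and it simply lists every table that has a column with the given name, using DBI's dbListTables() and dbListFields(). Whether more than one table turns up is something to check, not something I am asserting here.

```r
library(DBI)

# Hypothetical helper (not part of AnnotationDbi): return the names of all
# tables reachable through a DBI connection that have a column called `field`.
tablesWithField <- function(con, field) {
    tabs <- dbListTables(con)
    # TRUE for each table whose column list contains `field`
    hit <- vapply(tabs,
                  function(tb) field %in% dbListFields(con, tb),
                  logical(1))
    tabs[hit]
}
```

With the OrgDb object from above you would call it as tablesWithField(dbconn(orgdb), "GO"), where dbconn() is the AnnotationDbi accessor for the database connection. For the package you built yourself, org.Tguttata.eg_dbconn() should give you the same kind of connection.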

You can get the GO keys using GOALL:

> head(keys(orgdb, "GOALL"))
[1] "GO:0000003" "GO:0000009" "GO:0000012" "GO:0000015" "GO:0000018"
[6] "GO:0000022"

And if you are doing any GO hypergeometric analyses you want GOALL rather than GO anyway.
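If you really do need only the directly annotated GO IDs, one possible workaround is to bypass keys() and query the SQLite database directly. This is a sketch under assumptions, not a tested fix: directKeys is a hypothetical helper, and the default table name "go" and column name "GO" are guesses based on the makeOrgPackage output in your post, so check them with dbListTables() and dbListFields() first.

```r
library(DBI)

# Hypothetical helper: distinct values of one column, read straight from the
# SQLite database. The defaults ("go" / "GO") are assumptions about how the
# table built from the go= argument is named; verify them before relying on it.
directKeys <- function(con, table = "go", field = "GO") {
    sql <- sprintf("SELECT DISTINCT %s FROM %s",
                   as.character(dbQuoteIdentifier(con, field)),
                   as.character(dbQuoteIdentifier(con, table)))
    # dbGetQuery returns a one-column data.frame; pull out the vector
    dbGetQuery(con, sql)[[1]]
}
```

For example, head(directKeys(dbconn(orgdb))) would be the analogue of the keys() call that fails. Note that these are not the same set as the GOALL keys, which also include ancestor terms.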