Question: topGO::annFUN.org() problem due to case-sensitivity of OrgDb field names
7 months ago, psutton wrote:

topGO::annFUN.org() has the following lines:

## function annFUN.org() to work with the "org.XX.eg" annotations
annFUN.org <- function(whichOnto, feasibleGenes = NULL, mapping, ID = "entrez") {

     # [some lines have been omitted]

    geneID <- keyName[tolower(ID)]
    .sql <- paste("SELECT DISTINCT ", geneID, ", go_id FROM ", tableName[tolower(ID)],
                  " INNER JOIN ", paste("go", tolower(whichOnto), sep = "_"),
                  " USING(_id)", sep = "")
    retVal <- dbGetQuery(get(paste(mapping, "dbconn", sep = "_"))(), .sql)

    ## restrict to the set of feasibleGenes
    if(!is.null(feasibleGenes))
        retVal <- retVal[retVal[[geneID]] %in% feasibleGenes, ]

    ## split the table into a named list of GOs
    return(split(retVal[[geneID]], retVal[["go_id"]]))
}

I created a custom OrgDb (for a non-model organism) using AnnotationForge, which I wanted to use with topGO.

The custom OrgDb didn't have tables like go_bp, so topGO wouldn't work at first (which I posted about in https://support.bioconductor.org/p/118713/), but I was able to create that table and get around that problem.

Using my custom OrgDb, topGO then failed on the last line of annFUN.org(), at the call to split(), because retVal[[geneID]] contained no data (it was NULL).

I debugged this and found out that custom OrgDb files created using AnnotationForge seem to have uppercase field names like SYMBOL and ENSEMBL in the SQLite tables. In contrast, the standard OrgDb packages like org.Hs.eg.db have uppercase columns(), but lowercase field names in the SQLite tables.

The odd thing is that SQL identifiers are case-insensitive, so the query above returns data whether the field name stored in geneID is uppercase or lowercase. R, however, matches column names case-sensitively: geneID, set by the line geneID <- keyName[tolower(ID)], holds a lowercase name, so retVal[[geneID]] returns NULL when the SQLite table field names are uppercase. That is why my custom OrgDb failed here.
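The mismatch can be reproduced without a database. This is a minimal base R sketch, assuming the query result comes back with the uppercase column name stored in the table. The data frame and its values are made up for illustration.

```r
# Hypothetical stand-in for the result of dbGetQuery() on a custom OrgDb,
# assuming the column keeps the uppercase name stored in the SQLite table.
retVal <- data.frame(
    SYMBOL = c("geneA", "geneB", "geneC"),
    go_id  = c("GO:0000001", "GO:0000001", "GO:0000002"),
    stringsAsFactors = FALSE
)

# annFUN.org() looks the column up by a lowercase name.
geneID <- "symbol"

# R matches data frame column names case-sensitively,
# so the lowercase lookup finds nothing.
lowercase_lookup <- retVal[[geneID]]
uppercase_lookup <- retVal[[toupper(geneID)]]

print(is.null(lowercase_lookup))
print(uppercase_lookup)
```

With the lowercase name the lookup yields NULL, and that NULL is what split() receives on the last line of annFUN.org().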

My question is: would it be possible for topGO::annFUN.org to handle upper and lowercase field names more gracefully?
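One possible way to make the lookup tolerant of case is sketched below in base R. The helper name get_column_ci is hypothetical and is not part of topGO; it only illustrates the idea on a made-up data frame.

```r
# Hypothetical helper (not part of topGO): fetch a data frame column
# by name, ignoring the case of the column names.
get_column_ci <- function(df, name) {
    hit <- which(tolower(names(df)) == tolower(name))
    if (length(hit) != 1L)
        stop("expected exactly one column matching '", name,
             "', found ", length(hit))
    df[[hit]]
}

# Made-up query result with an uppercase gene column, as in a custom OrgDb.
retVal <- data.frame(
    SYMBOL = c("geneA", "geneB", "geneC"),
    go_id  = c("GO:0000001", "GO:0000001", "GO:0000002"),
    stringsAsFactors = FALSE
)

# Same split as the last line of annFUN.org(), but with
# case-insensitive column lookup.
go2genes <- split(get_column_ci(retVal, "symbol"),
                  get_column_ci(retVal, "go_id"))
print(go2genes)
```

A simpler alternative with the same effect on this example would be to lowercase the result's column names once, with names(retVal) <- tolower(names(retVal)), before the existing lookups run.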

Sorry, I am new to Bioconductor and annotation packages, and it is overwhelming at times, since working with a non-model organism seems a lot more difficult. It took me a long time to figure out why split() was throwing an error.
