distinguish DataSource from Provider and ProviderVersion in makeTxDbPackage
@timotheeflutre-6727
France

In the current GenomicFeatures package (Bioc 3.2), makeTxDbPackage uses the "Data source" metadata to set the Provider and ProviderVersion fields of the txdb template, even though these should sometimes (always?) be different. Could this be fixed for Bioc 3.3?

Moreover, it would help if makeTxDbFromGFF had arguments allowing the user to specify this provider metadata.

Finally, in makeTxDbPackage, the SOURCEURL field of the txdb template should be filled from "Resource URL" whenever that metadata entry exists, whether or not the txdb object was made from UCSC or BioMart.

ps: here is a link to the relevant code on Bioc's GitHub mirror

genomicfeatures txdb maketxdbfromgff metadata • 1.5k views
@valerie-obenchain-4275
United States

Looking at these 2 lines from GenomicFeatures::makeTxDbPackage:

    PROVIDER=.getMetaDataValue(txdb,'Data source')
    PROVIDERVERSION=.getTxDbVersion(txdb)


and this helper:

    .getTxDbVersion <- function(txdb){
        type <- .getMetaDataValue(txdb,'Data source')

        if (type=="UCSC") {
            version <- paste(.getMetaDataValue(txdb,'Genome'),
                             "genome based on the",
                             .getMetaDataValue(txdb,'UCSC Table'), "table")
        } else if (type=="BioMart") {
            version <- .getMetaDataValue(txdb,'BioMart database version')
        } else {
            version <- .getMetaDataValue(txdb,'Data source')
        }
        version
    }


PROVIDER is set from 'Data source' in the metadata, whereas PROVIDERVERSION falls back to 'Data source' only when the data source is neither 'UCSC' nor 'BioMart'. It sounds like you really want the option to specify these fields yourself. In that case they should be passed to makeTxDbPackage(), not makeTxDbFromGFF(), because the 'provider' and 'provider_version' fields are specific to the package DESCRIPTION and not to the metadata table of a TxDb. I've added 'provider' and 'providerVersion' args to makeTxDbPackage().
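
To make the fallback concrete, here is a minimal base-R sketch that does not load GenomicFeatures. Everything in it is a stand-in: `gff_meta` is a hypothetical metadata table, `get_meta_value()` and `provider_version()` only mimic the roles of `.getMetaDataValue` and `.getTxDbVersion`, and `resolve_provider_fields()` is an assumption about how user-supplied values could take precedence over the metadata defaults, not the package's actual code.

```r
# Hypothetical stand-in for the metadata table of a TxDb built from a GFF file.
gff_meta <- data.frame(
  name  = c("Data source", "Genome", "Resource URL"),
  value = c("my_annotations.gff3", NA, "http://example.org/annotations"),
  stringsAsFactors = FALSE
)

# Look up one metadata value by name (plays the role of .getMetaDataValue).
get_meta_value <- function(meta, name) {
  meta$value[meta$name == name]
}

# Same branching as the helper quoted above, applied to the mock table.
provider_version <- function(meta) {
  type <- get_meta_value(meta, "Data source")
  if (type == "UCSC") {
    paste(get_meta_value(meta, "Genome"), "genome based on the",
          get_meta_value(meta, "UCSC Table"), "table")
  } else if (type == "BioMart") {
    get_meta_value(meta, "BioMart database version")
  } else {
    type  # fallback: the "version" is just the data source again
  }
}

# Assumed override logic: explicit arguments win, metadata supplies defaults.
resolve_provider_fields <- function(meta, provider = NULL,
                                    providerVersion = NULL) {
  list(
    PROVIDER = if (is.null(provider)) {
      get_meta_value(meta, "Data source")
    } else {
      provider
    },
    PROVIDERVERSION = if (is.null(providerVersion)) {
      provider_version(meta)
    } else {
      providerVersion
    },
    SOURCEURL = get_meta_value(meta, "Resource URL")
  )
}

# Without overrides, both fields collapse to the data source.
defaults <- resolve_provider_fields(gff_meta)

# With overrides, the two fields can be set independently.
custom <- resolve_provider_fields(gff_meta,
                                  provider = "ExampleConsortium",
                                  providerVersion = "v1.0")
```

For a GFF-derived table, `defaults$PROVIDER` and `defaults$PROVIDERVERSION` come out identical, which is the behaviour the original question objects to. Passing the two values explicitly separates them.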

I agree SOURCEURL should always come from 'Resource URL' and have made that change.

We've just had the 3.3 release and now changes to that branch are limited to bug fixes. These modifications have been checked into the devel branch, GenomicFeatures 1.25.3.

Valerie

great, thanks!
