tximeta remote versus local reference files
igor ▴ 40

BioC 3.8 was just released, and it includes a great new package, tximeta (at least if you work with Salmon). I am still trying to understand how it works. The vignette contains the following section:

However, to avoid downloading remote GTF files during this vignette, we will point to a GTF file saved locally (in the tximportData package). We link the transcriptome of the Salmon index to its locally saved GTF. The standard recommended usage of tximeta would be the code chunk above, or to specify a remote GTF source, not a local one. The following code is therefore not recommended for a typical workflow, but is particular to the vignette code.

Why is it not recommended to point to the locally saved GTF? I assume this is to allow for more reproducible analysis, but the primary inputs for the tool are Salmon files that were already generated with a locally saved index, and the same index is passed to makeLinkedTxome() as well. If certain reference files are already local anyway, why not just keep them all local?


Hi Igor,

The first tximeta question!

That's a good point. I guess my reasoning was this:

If the GTFs are downloaded programmatically (by tximeta) and saved in a local cache (maintained by BiocFileCache), the user never has to choose a file by hand, so that source of error is removed. By contrast, a user could accidentally pick the wrong GTF in a directory. If I look in the "annotation" directory on my cluster, I have dozens of GTF files, and some of them look very similar, differing only slightly in the file ending (version number, or cDNA vs ncRNA vs all). So as long as it's a txome with a match in my hash table (which will expand as we develop the software and hook up to external resources), I can guarantee that the correct metadata gets attached. Also, the GTF files are relatively small (typically <100 MB) and so download in a few seconds on a good connection.
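To make the hash-table idea concrete, here is a minimal, language-neutral sketch in Python (tximeta itself is an R package, and this is not its implementation). It assumes the transcriptome is identified by a SHA-256 digest of its sequences and that a table maps digests to annotation metadata. The names `digest_sequences`, `KNOWN_TXOMES`, and `lookup_annotation`, the toy sequences, and the `example.org` URL are all hypothetical.

```python
import hashlib


def digest_sequences(seqs):
    """Return a SHA-256 hex digest over the transcript sequences, in order."""
    h = hashlib.sha256()
    for s in seqs:
        h.update(s.encode("ascii"))
    return h.hexdigest()


# Toy transcriptome standing in for the sequences used to build an index.
toy_txome = ["ACGTACGT", "GGGTTTAA"]

# Hypothetical lookup table: digest -> metadata, including where the GTF lives.
KNOWN_TXOMES = {
    digest_sequences(toy_txome): {
        "source": "ExampleDB",
        "release": "1",
        "gtf": "ftp://example.org/annotation.v1.gtf.gz",
    }
}


def lookup_annotation(seqs, table=KNOWN_TXOMES):
    """Find metadata by sequence digest; file names play no role."""
    return table.get(digest_sequences(seqs))


# Exact match on the sequences: the metadata is found.
match = lookup_annotation(toy_txome)

# A single-base difference gives a different digest, so nothing is attached.
no_match = lookup_annotation(["ACGTACGT", "GGGTTTAC"])
```

The point of the design is that the lookup key comes from the sequences themselves, so two GTF files with nearly identical names cannot be confused: either the digest matches an entry or no metadata is attached.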

But we provide the linkedTxome mechanism (makeLinkedTxome()) as a way for users to connect txomes that aren't found in the hash table for whatever reason.
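As a rough illustration of that fallback, the sketch below (Python, purely conceptual, not tximeta's actual code) lets a user register metadata for a transcriptome whose digest is missing from the built-in table. The names `make_link` and `resolve`, the lookup order (built-in table first, then user links), and the local path are assumptions made for this example.

```python
import hashlib


def digest(seqs):
    """SHA-256 hex digest over the transcript sequences, in order."""
    h = hashlib.sha256()
    for s in seqs:
        h.update(s.encode("ascii"))
    return h.hexdigest()


BUILTIN = {}  # stands in for the shipped hash table (empty here)
LINKED = {}   # user-registered entries, keyed by the same kind of digest


def make_link(seqs, source, release, gtf, registry=LINKED):
    """Register user-supplied metadata for a transcriptome (hypothetical helper)."""
    registry[digest(seqs)] = {"source": source, "release": release, "gtf": gtf}


def resolve(seqs, builtin=BUILTIN, linked=LINKED):
    """Look in the built-in table first, then in user links (assumed order)."""
    key = digest(seqs)
    if key in builtin:
        return builtin[key]
    return linked.get(key)


custom_txome = ["AAAACCCC", "TTTTGGGG"]

# Not in the built-in table and not yet linked: nothing is found.
before = resolve(custom_txome)

# After linking, the same sequences resolve to the user's metadata.
make_link(custom_txome, source="LocalBuild", release="custom",
          gtf="/data/annotation/custom.gtf")
after = resolve(custom_txome)
```

Even with a user link, the entry is still keyed by the sequence digest, so the only manual step is choosing the GTF once, when the link is created.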

Does that help explain the motivation?

