Question: tximeta remote versus local reference files
igor20 wrote:

BioC 3.8 was just released and there is a great new package, tximeta (at least if you work with Salmon). I am still trying to properly understand how it works. The vignette contains the following section:

However, to avoid downloading remote GTF files during this vignette, we will point to a GTF file saved locally (in the tximportData package). We link the transcriptome of the Salmon index to its locally saved GTF. The standard recommended usage of tximeta would be the code chunk above, or to specify a remote GTF source, not a local one. This following code is therefore not recommended for a typical workflow, but is particular to the vignette code.

Why is it not recommended to point to the locally saved GTF? I assume this is to allow for more reproducible analysis, but the primary inputs for the tool are Salmon output files that were already generated with a locally saved index, and the same index is passed to makeLinkedTxome() as well. If certain reference files are already local anyway, why not just keep them all local?

Answer: tximeta remote versus local reference files
Michael Love23k wrote:

Hi Igor,

The first tximeta question!

That's a good point. I guess my reasoning was this:

If the GTFs are downloaded programmatically (by tximeta) and saved in a local cache (maintained by BiocFileCache), there is no chance of picking the wrong file. By contrast, a user could accidentally pick the wrong GTF in a directory, which would introduce a point of error. If I look in my "annotation" directory on my cluster, I have dozens of GTF files, and some of them look pretty similar, with small differences in the file ending (either version number, or cDNA vs ncRNA vs all). So as long as it's a txome with a match in my hash table (which will be expanding as we develop the software and hook up to external resources), I can guarantee the correct metadata gets attached. Also, the GTF files are relatively small (typically <100 MB) and so download in a few seconds on a good connection.

But we provide the linkedTxome mechanism (via makeLinkedTxome()) as a way for users to connect txomes that aren't found in the hash table for whatever reason.
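To make the lookup order concrete, here is a rough sketch in Python of the idea described above: check a table of known txomes first, then fall back to user-linked txomes. This is only an illustration, not tximeta's actual code. The function names, the table layout, the way the digest is computed, and the example URL are all assumptions made up for this sketch.

```python
import hashlib


def sequence_digest(sequences):
    """Return a SHA-256 hex digest over a list of transcript sequences.

    Hypothetical helper: how the real digest is computed is an assumption here.
    """
    h = hashlib.sha256()
    for seq in sequences:
        h.update(seq.encode("ascii"))
    return h.hexdigest()


def resolve_annotation(digest, known_txomes, linked_txomes):
    """Look up an annotation source for a transcriptome digest.

    The built-in table of known txomes is checked first; user-linked
    txomes are only consulted when there is no match there.
    """
    if digest in known_txomes:
        return ("known", known_txomes[digest])
    if digest in linked_txomes:
        return ("linked", linked_txomes[digest])
    return ("unknown", None)


# Toy data: two made-up transcript sequences and a placeholder remote GTF URL.
toy_digest = sequence_digest(["ACGT", "GGCA"])
known = {toy_digest: "ftp://example.org/annotation.gtf.gz"}
linked = {}

print(resolve_annotation(toy_digest, known, linked))
```

The point of the sketch is that the user never chooses a file by name when the digest is recognized, so a similar-looking GTF in the same directory cannot be picked by mistake. The linked table only comes into play for txomes that are not recognized.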

Does that help explain the motivation?
