tximport from RSEM expected counts: is it possible to import from a table rather than a file?
Ahdee ▴ 50
@ahdee-8938

Hi, based on this tutorial, https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#rsem it looks like it's possible to import expected counts from RSEM. However, the per-sample files are not always available; for example, the GTEx set from UCSC Xena only provides expected counts as a single table, with genes in rows and samples in columns. Is there a way to use tximport with this table instead?

thanks!

tximport limma rnaseq • 2.5k views
@mikelove

No, it's designed to import per-sample files only. Why not just use read.delim()? What do you need from tximport?
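As a minimal sketch of the read.delim() route: the file, gene IDs, and sample names below are made-up stand-ins for a downloaded genes-by-samples table, and the real file's layout should be checked before assuming the first column holds the gene IDs.

```r
# Toy stand-in for a downloaded genes-by-samples table (hypothetical values).
tsv <- tempfile(fileext = ".tsv")
writeLines(c(
  "sample\tS1\tS2",
  "ENSG0001\t10.5\t0",
  "ENSG0002\t3\t7.25"
), tsv)

# Assume the first column holds gene IDs; check.names = FALSE keeps the
# sample names exactly as they appear in the header.
tab <- read.delim(tsv, row.names = 1, check.names = FALSE)
counts <- as.matrix(tab)
dim(counts)
```

This yields a plain numeric matrix with genes in rows and samples in columns; it carries no length or abundance information.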

Yes, thanks; I usually just do that. I was just wondering whether tximport does something differently.

So am I correct in hearing that tximport doesn't do anything with transcript lengths at this stage?

Per Kevin Blighe, the log2(x+1) transformation still needs to be reversed before running DESeq2 on, e.g., https://dev.xenabrowser.net/datapages/?dataset=gtex_Kallisto_est_counts&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443
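For illustration, reversing such a transform could look like the sketch below. It assumes the values are stored as log2(x + 1), which should be confirmed against the unit stated on the dataset page; the matrix is toy data, not taken from Xena. It only shows how to undo the transform and does not settle whether the result is suitable input for DESeq2.

```r
# Toy matrix standing in for values stored as log2(x + 1).
# The log2(x + 1) unit is an assumption; confirm it on the dataset page.
log_counts <- matrix(log2(c(0, 9.6, 100, 3.2) + 1), nrow = 2,
                     dimnames = list(c("tx1", "tx2"), c("S1", "S2")))

# Invert the transform: if y = log2(x + 1), then x = 2^y - 1.
est_counts <- 2^log_counts - 1

# A DESeq2 count matrix must hold non-negative integers, so round the
# estimated counts and store them as integers.
int_counts <- round(est_counts)
storage.mode(int_counts) <- "integer"
```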

I don't know what you're asking. In the above thread, we decided tximport cannot be used if the sample data is not available. What case are you referring to?

If you have access to aggregated, scaled, transformed data only, I wouldn't recommend tximport or DESeq2, as these are explicitly designed for when per-sample count data is available.

Ah, sorry. That was a bit unclear. I -think- that the Xena dataset that I linked is actually per-sample data, just aggregated into a single file.

As I understand it, that link is the result of 'cbind'-ing the first column (estimated counts) from Kallisto's h5dump.

Question 1: Assuming that the values are actually per-sample Kallisto "estimated counts" fields, would read.delim and DESeq2 be appropriate?

Question 2: is there some feature (normalization, etc?) in tximport to be gained by splitting this merged data.frame into one-file-per-sample TSVs in the format: "ENST+est_counts+feature length" and using tximport, over just using read.delim?

1) If you want to collapse to gene-level and perform testing, you should account for gene length, either through an offset or through scaledTPM. So if you have counts but not length or abundance, I wouldn't recommend this as input to DESeq2 (or, at least, it's not the tximport-recommended input).
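To make the scaledTPM idea concrete, here is a rough base-R sketch on toy data. It is not tximport's own code; it only illustrates, as I understand it, the idea behind tximport's countsFromAbundance = "scaledTPM" option: sum abundances (TPM) to gene level, then rescale each sample to its library size. All matrices and the tx2gene mapping are hypothetical.

```r
# Toy transcript-level inputs (hypothetical): estimated counts and TPM
# abundances for three transcripts from two genes, in two samples.
counts <- matrix(c(10, 30, 60, 40, 40, 120), nrow = 3,
                 dimnames = list(c("tx1", "tx2", "tx3"), c("S1", "S2")))
tpm <- matrix(c(2e5, 3e5, 5e5, 4e5, 2e5, 4e5), nrow = 3,
              dimnames = dimnames(counts))
tx2gene <- c(tx1 = "geneA", tx2 = "geneA", tx3 = "geneB")

# Collapse abundances to gene level by summing TPM within each gene.
gene_tpm <- rowsum(tpm, group = tx2gene[rownames(tpm)])

# Rescale each sample's gene-level TPMs so that the column sum equals
# that sample's total estimated counts (its library size).
lib_size <- colSums(counts)
scaled_counts <- sweep(gene_tpm, 2, lib_size / colSums(gene_tpm), `*`)
```

The sketch needs the abundance matrix, which is why a table of counts alone is not enough for this approach.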

2) Yes, read the tximport paper.

Okay, seems like it should work. I'll go and do it correctly. I did skim the paper, but obviously I should read the entirety.
