Obtaining transcript names in RSEM
deena ▴ 10
@deena-7415
Last seen 4.1 years ago
Germany

Hi,

I have performed abundance estimation using RSEM, which produced the genes.results and isoforms.results files. I would like to import the isoform results into the DESeq2 pipeline. I followed the tximport pipeline for RSEM as shown below, but when I checked the row names, I got gene names instead of transcript names:

rsem.files = list.files(".", pattern = "\\.isoforms\\.results$")

txi.rsem = tximport(rsem.files, type = "rsem", txOut = TRUE)

Could anyone please tell me what mistake I made?

tximport rsem • 912 views
@mikelove
Last seen 7 hours ago
United States

The tximport to DESeq2 pipeline described in the tximport vignette is designed for gene level analysis.


Hi Michael,

Thank you very much for your reply. I am working on a non-model organism that has a de novo assembled transcriptome. I use the Trinity suite to align reads and estimate counts. Trinity can estimate counts using either RSEM or kallisto, depending on the user's choice. With kallisto it generates abundance.tsv.isoforms and abundance.tsv.genes, and with RSEM it generates genes.results and isoforms.results. The estimated/expected counts in the isoform and gene files are almost the same, except for genes that have more than one isoform.

So my question is: can I use these isoform files in the tximport and DESeq2 pipeline for further analysis? Also, after importing the kallisto or RSEM raw counts into DESeq2, is it advisable to apply rlog to them and use the transformed values for generating plots?


You can do whatever you like with the quantifications. Variance stabilization is a good idea for calculating sample distances or ordination plots like PCA or MDS. You can read in the matrix from the isoforms table using base R functions for RSEM, and txOut with tximport.
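
For anyone who wants to try the base R route mentioned above, here is a minimal sketch. It assumes the isoforms.results files are tab-delimited with the columns listed elsewhere in this thread (transcript_id, expected_count, TPM, and so on). The demo files, their values, and the helper names make_demo_file and read_column are made up for illustration and are not part of tximport or RSEM.

```r
# Hypothetical demo data: two tiny files laid out like RSEM isoforms.results.
make_demo_file <- function(path, counts) {
  demo <- data.frame(
    transcript_id = c("tx1", "tx2", "tx3"),
    gene_id = c("g1", "g1", "g2"),
    effective_length = c(900, 1100, 700),
    expected_count = counts,
    TPM = c(10, 20, 30),
    FPKM = c(5, 10, 15),
    IsoPct = c(40, 60, 100)
  )
  write.table(demo, path, sep = "\t", quote = FALSE, row.names = FALSE)
}

demo_dir <- tempfile("rsem_demo")
dir.create(demo_dir)
files <- file.path(demo_dir,
                   c("t_0h_1.isoforms.results", "t_0h_2.isoforms.results"))
names(files) <- c("t_0h_1", "t_0h_2")
make_demo_file(files[1], c(5, 0, 12.5))
make_demo_file(files[2], c(7, 3, 9))

# Read one column from every file and bind the columns into a matrix with
# transcript IDs as row names and sample names as column names.
read_column <- function(files, column) {
  tables <- lapply(files, read.delim, stringsAsFactors = FALSE)
  ids <- tables[[1]]$transcript_id
  # Every file must list the same transcripts in the same order.
  same_order <- vapply(tables,
                       function(x) identical(x$transcript_id, ids),
                       logical(1))
  stopifnot(all(same_order))
  mat <- do.call(cbind, lapply(tables, function(x) x[[column]]))
  dimnames(mat) <- list(ids, names(files))
  mat
}

counts_mat <- read_column(files, "expected_count")
tpm_mat <- read_column(files, "TPM")
print(counts_mat)
```

Note that expected_count values are not integers, so they would need rounding before being passed to something like DESeqDataSetFromMatrix.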


Hi

I tried importing the RSEM isoforms.results files into R as described in the vignette. My isoforms.results file has the columns transcript_id, gene_id, effective_length, expected_count, TPM, FPKM and IsoPct. When I tried to import it with the following code, I got an error:

rsem_isoform = list.files("~/Trinity_kallisto_RSEM/RSEM/", "*.isoforms.results")

names(rsem_isoform) = "t_0h_1"

tximport(rsem_isoform, type = "rsem", txOut = TRUE)

1 Error: all(c(geneIdCol, abundanceCol, lengthCol) %in% names(raw)) is not TRUE
When I went through the tximport code, I saw that it is written in such a way that, for the "rsem" option, the only ID column it recognizes is "gene_id". So when I deleted the gene_id column and renamed transcript_id to gene_id, it worked.

Am I doing the right thing? I also tried txOut=T, but I still get the same error.


Instead of changing the column names, you should use the tximport arguments: geneIdCol, txIdCol, abundanceCol, countsCol, and lengthCol. If txOut=TRUE, then geneIdCol will be ignored so you can put anything.

We currently only support RSEM's gene-level counts with type="rsem". It would take more effort to support both, and I haven't had time to write it so that the function does this automatically. The user can always specify the above columns so that it works, though. Note that it's simply cbind'ing the columns into matrices for txOut=TRUE, so there's not much to it.
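
A possible way to specify the columns yourself is sketched below. It rests on an assumption taken from my reading of the tximport help page: the column arguments are only honored with type="none", while any other type fills them in automatically. Please check ?tximport for your installed version. The demo file and its values are invented, and the call is skipped if tximport is not installed.

```r
# Hypothetical demo file laid out like an RSEM isoforms.results table.
demo_file <- tempfile(fileext = ".isoforms.results")
write.table(
  data.frame(
    transcript_id = c("tx1", "tx2", "tx3"),
    gene_id = c("g1", "g1", "g2"),
    effective_length = c(900, 1100, 700),
    expected_count = c(5, 0, 12.5),
    TPM = c(10, 20, 30)
  ),
  demo_file, sep = "\t", quote = FALSE, row.names = FALSE
)
files_isoform <- c(t_0h_1 = demo_file)

if (requireNamespace("tximport", quietly = TRUE)) {
  # Assumption: with type = "none" the column names below are used as given.
  txi <- tximport::tximport(
    files_isoform,
    type = "none",
    txIn = TRUE,
    txOut = TRUE,
    txIdCol = "transcript_id",
    geneIdCol = "gene_id",   # ignored when txOut = TRUE
    abundanceCol = "TPM",
    countsCol = "expected_count",
    lengthCol = "effective_length"
  )
  # Row names should be the transcript IDs, not the gene IDs.
  print(rownames(txi$counts))
}
```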


Hi, I tried the suggested method of stating the column names:

txi.rsem_isoform <- tximport(files = files_isoform,tx2gene = tx2gene,type = "rsem", txOut = TRUE, geneIdCol = "gene_id", txIdCol = "transcript_id", countsCol = "expected_count", lengthCol = "effective_length", abundanceCol = "TPM")

Still, I face the same problem as Deena. In fact, I observed that it doesn't even matter which column I assign to which argument. The result is always the same.

For example, the command below gave the same result, which is a bit strange.

txi.rsem_isoform <- tximport(files = files_isoform,tx2gene = tx2gene,type = "rsem", txOut = TRUE, geneIdCol = "expected_count",txIdCol = "transcript_id", countsCol = "gene_id",lengthCol = "effective_length", abundanceCol = "TPM")
