Question: Obtaining transcript names in RSEM
29 days ago, deena (Germany) wrote:

Hi,

I have performed abundance estimation using RSEM, which output the genes.results and isoforms.results files. I would like to import the isoform results into the DESeq2 pipeline. I followed the tximport pipeline for RSEM as follows, but when I checked the row names, I got gene names instead of transcript names:

library(tximport)
rsem.files = list.files(".", pattern = "\\.isoforms\\.results$")

txi.rsem = tximport(rsem.files, type = "rsem", txOut = TRUE)

Could anyone please point out what mistake I made?

Thanks in advance.

29 days ago, Michael Love (United States) wrote:

The tximport to DESeq2 pipeline described in the tximport vignette is designed for gene level analysis.


Hi Michael,

Thank you very much for your reply. I am working on a non-model organism that has a de novo assembled transcriptome. I use the Trinity suite to align reads and estimate the count data. Trinity can estimate counts using RSEM or kallisto, depending on the user's choice. With kallisto it generates abundance.tsv.isoforms and abundance.tsv.genes, and with RSEM it generates genes.results and isoforms.results. The estimated/expected counts in the isoform and gene files are almost the same, except for those transcripts that have isoforms.

So my question is: can I use these isoform files in the tximport and DESeq2 pipelines for further analysis? Also, after importing the kallisto or RSEM raw counts into DESeq2, is it advisable to rlog-transform them and use the result for generating plots?

written 29 days ago by deena

You can do whatever you like with the quantifications. Variance stabilization is a good idea for calculating sample distances or ordination plots like PCA or MDS. You can read in the matrix from the isoforms table using base R functions for RSEM, and txOut with tximport.
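As a rough illustration of the base R route mentioned above, here is a minimal sketch that reads several isoforms.results tables and binds their expected_count columns into one transcript-by-sample matrix. The toy files, sample names and transcript IDs are made up so that the example runs on its own; only the column names are taken from RSEM's output.

```r
# Create two tiny, made-up isoforms.results files so the sketch is self-contained.
toy_dir <- tempfile("rsem_")
dir.create(toy_dir)
write_toy <- function(sample, counts) {
  toy <- data.frame(
    transcript_id    = c("TR1_i1", "TR1_i2", "TR2_i1"),  # hypothetical IDs
    gene_id          = c("TR1", "TR1", "TR2"),
    effective_length = c(1000, 700, 1500),
    expected_count   = counts,
    TPM              = counts / sum(counts) * 1e6
  )
  write.table(toy, file.path(toy_dir, paste0(sample, ".isoforms.results")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}
write_toy("t_0h_1", c(10, 5, 20))
write_toy("t_0h_2", c(12, 7, 18))

# Locate the files and name them by sample.
files <- list.files(toy_dir, pattern = "\\.isoforms\\.results$", full.names = TRUE)
names(files) <- sub("\\.isoforms\\.results$", "", basename(files))

# Read every table with base R.
tabs <- lapply(files, read.delim, stringsAsFactors = FALSE)

# All files must list the transcripts in the same order before binding columns.
same_order <- vapply(tabs, function(x) identical(x$transcript_id, tabs[[1]]$transcript_id),
                     logical(1))
stopifnot(all(same_order))

# Bind the expected counts into a transcript x sample matrix.
counts <- do.call(cbind, lapply(tabs, function(x) x$expected_count))
rownames(counts) <- tabs[[1]]$transcript_id
counts
```

RSEM's expected counts are generally not integers, so a matrix like this would need to be rounded (for example with round(counts)) before being passed to DESeqDataSetFromMatrix.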

written 29 days ago by Michael Love

Hi

I tried importing the RSEM isoforms.results files into R as described in the vignette. My RSEM isoforms.results file has the columns transcript_id, gene_id, effective_length, expected_count, TPM, FPKM and IsoPct. When I tried to import it, I got the following error:

rsem_isoform = list.files("~/Trinity_kallisto_RSEM/RSEM/", pattern = "\\.isoforms\\.results$", full.names = TRUE)

names(rsem_isoform) = "t_0h_1"

tximport(rsem_isoform, type = "rsem", txOut = TRUE)

reading in files with read_tsv
1 Error: all(c(geneIdCol, abundanceCol, lengthCol) %in% names(raw)) is not TRUE
When I went through the tximport code, I saw that it is written in such a way that, for the "rsem" option, it only recognizes the column "gene_id". When I deleted the gene_id column and renamed transcript_id to gene_id, it worked.

Am I doing the right thing? I also tried txOut=T, but I still get the same error.

written 29 days ago by deena

Instead of changing the column names, you should use the tximport arguments: geneIdCol, txIdCol, abundanceCol, countsCol, and lengthCol. If txOut=TRUE, then geneIdCol will be ignored so you can put anything.

We currently only support RSEM's gene-level counts with type="rsem". It would take more effort to support both, and I haven't had time to write it so that the function handles this automatically. The user can always specify the above columns so that it works, though. Note that it's simply cbind'ing the columns into matrices for txOut=TRUE, so there's not much to it.
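For illustration, a call using those column arguments might look like the sketch below. It assumes that type = "none" is used so that tximport takes the columns from the arguments rather than from the RSEM gene-level defaults, and that the column names match the isoforms.results header listed earlier in the thread; behaviour may differ between tximport versions. The toy file and its values are made up, and the tximport call is skipped if the package is not installed.

```r
# Write a tiny, made-up isoforms.results file so the sketch is self-contained.
toy <- data.frame(
  transcript_id    = c("TR1_i1", "TR1_i2"),  # hypothetical transcript IDs
  gene_id          = c("TR1", "TR1"),
  effective_length = c(1000, 700),
  expected_count   = c(10, 5),
  TPM              = c(3.2, 1.6),
  FPKM             = c(2.9, 1.4),
  IsoPct           = c(66.7, 33.3)
)
tmp <- file.path(tempdir(), "t_0h_1.isoforms.results")
write.table(toy, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# Named vector of files; the names become the sample (column) names.
files <- c(t_0h_1 = tmp)

if (requireNamespace("tximport", quietly = TRUE)) {
  # Assumption: type = "none" lets us point tximport at the transcript-level columns.
  txi <- tximport::tximport(
    files,
    type         = "none",
    txOut        = TRUE,              # keep transcript-level rows
    txIdCol      = "transcript_id",
    abundanceCol = "TPM",
    countsCol    = "expected_count",
    lengthCol    = "effective_length"
  )
  print(rownames(txi$counts))         # row names should now be transcript IDs
}
```

With real data, files would instead be the named vector of paths to the isoforms.results files from the RSEM output directory.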

written 29 days ago by Michael Love