Gene TPM validity with STAR only quantification
SciencyUsagi ▴ 10

Hi all,

I aligned reads with STAR against the Ensembl genome using --quantMode GeneCounts. I reorganised ReadsPerGene.out.tab and extracted the unstranded counts to build a count matrix. I used this count matrix for DEG analysis with DESeq2, but I also wanted to generate TPMs as input for ssGSEA analyses. To generate the TPMs, I used the formula:

t( t(counts.mat / gene.length) * 1e6 / colSums(counts.mat / gene.length) )
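For anyone checking the arithmetic, here is a minimal sketch of the same calculation on toy data. The gene names, counts and lengths are made up for illustration, and it assumes gene.length is a vector in the same order as the rows of counts.mat. It only shows what the one-liner computes; it does not settle whether union-exon lengths are the right choice.

```r
# Toy data: 3 genes x 2 samples (illustrative values, not from the post)
counts.mat <- matrix(c(10, 20, 30,
                       5, 5, 10),
                     nrow = 3,
                     dimnames = list(c("geneA", "geneB", "geneC"),
                                     c("s1", "s2")))
gene.length <- c(geneA = 1000, geneB = 2000, geneC = 3000)  # in bp

# Align the lengths to the rows of the count matrix before dividing;
# a silent mismatch in order would give wrong TPMs without any error.
gene.length <- gene.length[rownames(counts.mat)]
stopifnot(!anyNA(gene.length))

# Length-normalised rate: row i is divided by gene.length[i]
rate <- counts.mat / gene.length

# Scale each sample (column) so that it sums to one million
tpm <- sweep(rate, 2, colSums(rate), "/") * 1e6

# Same result as the one-liner in the post
tpm.oneliner <- t(t(rate) * 1e6 / colSums(rate))

print(colSums(tpm))
print(all.equal(tpm, tpm.oneliner))
```

A quick sanity check on real data is that every column of the TPM matrix sums to 1e6 and that no gene has a missing or zero length.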

I estimated gene length with the ensembldb::lengthOf function, where:

"the length is the sum of the lengths of all exons of a transcript or a gene. In the latter case the exons are first reduced so that the length corresponds to the part of the genomic sequence covered by the exons."
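To make the "reduced exons" definition concrete, here is a small base-R sketch that merges overlapping exons of one gene and sums the merged widths. The coordinates and the helper name union_exon_length are hypothetical; this is not the ensembldb implementation, only an illustration of the idea described in the quote.

```r
# Hypothetical exon coordinates for one gene (1-based, inclusive)
exons <- data.frame(start = c(100, 150, 400),
                    end   = c(200, 300, 500))

# Merge overlapping or adjacent intervals, then sum their widths
union_exon_length <- function(start, end) {
  ord <- order(start)
  start <- start[ord]
  end <- end[ord]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      # Overlapping or directly adjacent: extend the current block
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap: close the current block and start a new one
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

reduced.length <- union_exon_length(exons$start, exons$end)
naive.length <- sum(exons$end - exons$start + 1)
print(c(reduced = reduced.length, naive = naive.length))
```

Here the first two exons overlap, so the reduced length counts the shared bases once, while the naive sum of exon widths counts them twice.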

The ssGSEA results from these TPMs were quite consistent with observations in the literature, but my question is whether the approach I took can be considered valid.

Thanks.

RNASeq • 849 views

Are all exons expressed equally in your samples? I don't get how people generate TPM without knowing the proportion of transcripts for their samples.


I highly doubt that. We ran the pipeline our bioinformatician gave us, and it is clearly wrong. Glad I doubted it.
