We have RNA-seq data (three biological replicates in a single condition) and we want to calculate TPM with the sole purpose of ranking genes by expression level (in quantiles).

I understand that, while TPM uses the total read count to normalize for sequencing depth and library size, differential expression (DE) analysis takes a more sophisticated approach (TMM in edgeR or the "median ratio method" in DESeq2).

I have two questions:

1) Should I normalize my counts with a "DE method" instead of relying on total-read-count scaling?


2) Which is the best way to collapse the three replicates? 

I was thinking of taking the median TPM for each gene, but there might be some smarter way.


Thank you


TPM is for comparing the expression strength of different genes, which is what you want. The DESeq2 normalisation method is for differential expression between groups, not for comparing expression strength between genes (it does not account for gene length). I'd just take the mean across the replicates, but the median is reasonable too.
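
For what it's worth, here is a minimal sketch of that workflow in plain Python: compute TPM per replicate, collapse the replicates with the mean (or median), then bin genes by rank. The gene names, lengths and counts are made-up toy values, and the helper names (`tpm`, `collapse`, `quantile_bins`) are hypothetical, not from any package. In practice you would likely get TPM straight from your quantification tool.

```python
from statistics import mean, median

def tpm(counts, lengths_kb):
    """Convert raw counts for one sample into TPM."""
    # Length-normalise first: reads per kilobase for each gene.
    rpk = [c / l for c, l in zip(counts, lengths_kb)]
    # Then scale so the values of the sample sum to one million.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]

def collapse(replicate_tpms, how=mean):
    """Collapse per-replicate TPM vectors into one value per gene."""
    return [how(values) for values in zip(*replicate_tpms)]

def quantile_bins(values, n_bins=4):
    """Assign each gene a bin from 1 (lowest) to n_bins (highest) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values) + 1
    return bins

# Toy data (assumed values): four genes, three replicates.
genes = ["geneA", "geneB", "geneC", "geneD"]
lengths_kb = [1.0, 2.0, 0.5, 4.0]
raw_counts = [
    [100, 400, 10, 40],  # replicate 1
    [120, 380, 12, 50],  # replicate 2
    [90, 420, 8, 44],    # replicate 3
]

per_replicate = [tpm(c, lengths_kb) for c in raw_counts]
mean_tpm = collapse(per_replicate, how=mean)
median_tpm = collapse(per_replicate, how=median)
bins = quantile_bins(mean_tpm, n_bins=4)

for gene, value, b in zip(genes, mean_tpm, bins):
    print(f"{gene}\t{value:.1f}\tquartile {b}")
```

Swapping `how=mean` for `how=median` in `collapse` is the only change needed to try the median instead; with well-behaved replicates the resulting ranking should be very similar.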


