I am testing salmon and kallisto for RNA-seq. both tools outputs ESTIMATED counts and TPM. I have read around and put my notes here https://github.com/crazyhottommy/RNA-seq-analysis/blob/master/salmon_kalliso_STAR_compare.md#counts-versus-tpmrpkmfpkm
My questions are:
1. from the help of tximport function:
character, either "no" (default), "scaledTPM", or "lengthScaledTPM", for whether to generate estimated counts using abundance estimates scaled up to library size (scaledTPM) or additionally scaled using the average transcript length over samples and the library size (lengthScaledTPM). if using scaledTPM or lengthScaledTPM, then the counts are no longer correlated with average transcript length, and so the length offset matrix should not be used.
To my understanding, TPM is a unit that scaled by (effective) feature length first and then sequencing depth. So, what are scaledTPM and lengthScaled TPM? does tximport use the estimate counts to get the TPM?
2. what's the difference among the TPM output by salmon/kallisto and the TPM returned by tximport function?
3. How does tximport mathematically convert counts to TPM if use the estimated counts to get the TPM?
Thanks very much!