I have a question concerning the "new" scaling method offered by
dtuScaledTPM and how it affects DTU analysis.
So far I have used
scaledTPM, which, as I understand it (correct me if I'm wrong), scales the TPM values to library size by multiplying each transcript's TPM in a sample by the column sum of the count matrix, and thus brings them back onto the count scale?
dtuScaledTPM additionally brings transcript length into the library-size scaling (dividing the count-based library size by a library size calculated from TPM * transcript length). The transcript length used is the median of the transcript lengths of all transcripts in a gene (where each transcript's length is itself the average across all samples). Is this correct? And if so, why is this beneficial for DTU analysis? I understand that using
lengthScaledTPM is not advantageous, but I can't wrap my head around why this method is better "just" because the transcript length value is the median instead of the mean.
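To make my reading concrete, here is a minimal Python sketch of the three scalings as I understand them. It is not the tximport code itself: the toy matrices, the gene labels, and all helper names are made up for illustration, and I am assuming that every variant ends by rescaling each sample column so that it sums to the count-based library size.

```python
from statistics import mean, median

# Hypothetical toy data: 3 transcripts (rows) x 2 samples (columns).
genes = ["geneA", "geneA", "geneB"]  # transcript-to-gene map (made up)
tpm = [[100.0, 300.0], [300.0, 100.0], [600.0, 600.0]]
eff_len = [[500.0, 500.0], [2000.0, 2000.0], [1000.0, 1000.0]]
counts = [[50.0, 150.0], [600.0, 200.0], [600.0, 600.0]]


def col_sums(mat):
    """Sum each sample column of a row-major matrix."""
    return [sum(col) for col in zip(*mat)]


def rescale_to_library_size(new_counts, counts):
    """Rescale each column of new_counts to the column sum of counts."""
    target = col_sums(counts)
    current = col_sums(new_counts)
    return [[v * t / c for v, t, c in zip(row, target, current)]
            for row in new_counts]


def scaled_tpm(tpm, counts):
    """Assumed scaledTPM: TPM rescaled to library size, no length term."""
    return rescale_to_library_size(tpm, counts)


def length_scaled_tpm(tpm, eff_len, counts):
    """Assumed lengthScaledTPM: TPM times per-transcript average length."""
    avg_len = [mean(row) for row in eff_len]
    weighted = [[v * l for v in row] for row, l in zip(tpm, avg_len)]
    return rescale_to_library_size(weighted, counts)


def dtu_scaled_tpm(tpm, eff_len, counts, genes):
    """Assumed dtuScaledTPM: TPM times per-gene median of average lengths."""
    avg_len = [mean(row) for row in eff_len]
    med_by_gene = {
        g: median([l for l, gg in zip(avg_len, genes) if gg == g])
        for g in set(genes)
    }
    weighted = [[v * med_by_gene[g] for v in row]
                for row, g in zip(tpm, genes)]
    return rescale_to_library_size(weighted, counts)


def proportion(mat, tx, gene_rows, sample):
    """Share of transcript tx within its gene for one sample."""
    return mat[tx][sample] / sum(mat[r][sample] for r in gene_rows)


s = scaled_tpm(tpm, counts)
l = length_scaled_tpm(tpm, eff_len, counts)
d = dtu_scaled_tpm(tpm, eff_len, counts, genes)

# Within-gene share of the first geneA transcript in sample 1.
for name, mat in [("scaledTPM", s), ("lengthScaledTPM", l), ("dtuScaledTPM", d)]:
    print(name, proportion(mat, 0, [0, 1], 0))
```

If I have this right, in this sketch the scaledTPM and dtuScaledTPM versions give the same within-gene proportions (the length factor is constant within a gene), while the lengthScaledTPM version shifts them towards the longer transcript.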
Sorry if this is a rather confusing question. I'm trying to understand why this method should be used for DTU. It would be great to get a worst-case toy example where
lengthScaledTPM would not work, and also one where the lack of length scaling in
scaledTPM would not work well.
Thanks in advance