Cittaro Davide
Hi Simon,
On May 29, 2013, at 11:46 AM, Simon Anders <anders at embl.de> wrote:
> The notion of "calculating cpm on normalized counts" is hence a
> contradiction in terms.
Would you mind expanding on this sentence? As far as I can see, it is
not uncommon to express counts in cpm after normalization. I'm thinking
of edgeR and limma (which normalize by TMM)...
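
To make the question concrete, here is a minimal sketch in plain Python of what "cpm after normalization" could mean: each library size is rescaled by a per-sample normalization factor before the counts are converted to counts per million. The function name `cpm`, the toy counts, and the factor values are all made up for illustration; the factors are simply given here, not estimated by TMM, and this is not edgeR's or limma's actual code.

```python
def cpm(counts, norm_factors=None):
    """Counts per million using effective library sizes.

    counts: list of rows (features), each a list of per-sample counts.
    norm_factors: optional per-sample scaling factors (assumed given,
    e.g. from a TMM-like procedure that is not implemented here).
    """
    n_samples = len(counts[0])
    # Raw library size = column sum of the count matrix
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    if norm_factors is None:
        norm_factors = [1.0] * n_samples
    # Effective library size = raw library size * normalization factor
    eff_sizes = [size * f for size, f in zip(lib_sizes, norm_factors)]
    return [[row[j] / eff_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]

# Hypothetical toy data: 3 features x 2 samples
counts = [[10, 20], [30, 60], [60, 120]]
norm_factors = [1.0, 0.8]  # illustrative values only
cpm_values = cpm(counts, norm_factors)
```

In this sketch the normalization enters only through the library-size denominator; the raw counts themselves are never rescaled first.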
Moreover, I would like to use this thread to raise another point that
is still not clear to my simple mind: normalizing counts (either by
TMM or by the geometric mean) makes comparison at the feature level
possible; that's why we all trust DESeq (and edgeR and limma::voom) and
agree that RPKM is evil for that purpose :-) But once you have
normalized counts, how would you rank features according to their
abundance "within" a sample? How can you tell that feature A is more
represented than feature B in the same sample? Can you just use
normalized counts for that?
I'm asking because I'm dealing with some experimental data (not
RNA-seq) where the features are huge genomic domains (megabases long,
detected by ChIP-seq) that change between conditions (in terms of
abundance, position and enrichment). I can describe the differences in
terms of domain length (and genomic association with genes, for
example), but what about their "height"? I cannot use classical peak
height as with normal ChIP-seq data, because that makes no sense at
all for domains this large, so I'm forced to use RPKM.
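
As a rough illustration of why length matters when ranking domains within one sample, here is a small Python sketch of the standard RPKM formula (reads per kilobase of feature per million mapped reads). The domain names, read counts, lengths, and total read number are invented for the example and do not come from real data.

```python
def rpkm(count, length_bp, total_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return count / (length_bp / 1e3) / (total_reads / 1e6)

# Hypothetical domains in a single sample: name -> (read count, length in bp)
domains = {
    "A": (5000, 2_000_000),  # long domain with many reads
    "B": (3000, 500_000),    # shorter domain with fewer reads
}
total_reads = 20_000_000  # assumed number of mapped reads in the sample

values = {name: rpkm(c, length, total_reads)
          for name, (c, length) in domains.items()}

# Rank domains by length-normalized signal, highest first
ranked = sorted(values, key=values.get, reverse=True)
```

With these made-up numbers, domain A has more reads than B, yet B ranks first after dividing by length, which is the kind of within-sample "height" comparison that counts alone cannot give.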
/me confused
thanks
d