I am sorry for this naive question, but it is critically related to a basic step in RNA-seq analysis. I am trying different normalization approaches, mainly CPM and TMM. Assume the following:
- my raw count matrix is named 'counts.keep'
- I am using a DGEList called 'dgeObj'
First, I took the log2 of just the raw counts and created an object called M.
Second, I computed log-CPM values from dgeObj, which at this point is not TMM normalized. This creates logcounts (I assume this is the log2 of the CPM, since log = TRUE is passed to cpm):
logcounts <- cpm(dgeObj, log = TRUE)
Third, I tried to get the TMM-normalized counts, and here is my question. I used this code:
dgeObj <- calcNormFactors(dgeObj)
logCPM <- cpm(dgeObj, log = TRUE)
I first ran TMM normalization on dgeObj with calcNormFactors, then called cpm with log = TRUE on the same dgeObj.
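To make the three steps above reproducible, here is a minimal sketch on simulated counts. The simulated matrix, the +1 pseudocount in step 1, and the names eff.lib and manual.cpm are my own assumptions for illustration and are not part of my real data. The last lines are a check I would expect to hold if cpm divides by lib.size * norm.factors once calcNormFactors has been run, which is my reading of the edgeR documentation rather than something I have confirmed.

```r
# Minimal sketch with simulated counts; all values are made up for illustration.
library(edgeR)

set.seed(1)
# Hypothetical stand-in for my filtered raw counts: 200 genes x 4 samples
counts.keep <- matrix(rnbinom(800, mu = 50, size = 5), nrow = 200,
                      dimnames = list(paste0("gene", 1:200), paste0("s", 1:4)))

dgeObj <- DGEList(counts = counts.keep)

# Step 1: log2 of the raw counts (the +1 pseudocount is an assumption, to avoid log2(0))
M <- log2(counts.keep + 1)

# Step 2: log-CPM before TMM; norm.factors are all 1 at this point
logcounts <- cpm(dgeObj, log = TRUE)

# Step 3: calcNormFactors() fills in dgeObj$samples$norm.factors (TMM is the default
# method); the count matrix itself is left unchanged
dgeObj <- calcNormFactors(dgeObj)
logCPM <- cpm(dgeObj, log = TRUE)

# Check: non-log CPM recomputed by hand using lib.size * norm.factors
eff.lib <- dgeObj$samples$lib.size * dgeObj$samples$norm.factors
manual.cpm <- t(t(counts.keep) / eff.lib) * 1e6
all.equal(manual.cpm, cpm(dgeObj), check.attributes = FALSE)
```

Comparing boxplot(logcounts) with boxplot(logCPM) on this toy data is how I have been trying to see what changes between steps 2 and 3.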
What is cpm actually doing here? Is it computing CPM on top of the TMM-normalized reads, or only applying TMM?
In other words, I would like to know whether, if I plot the logCPM object, it will reflect only the TMM normalization or CPM plus TMM normalization. So my confusion is: can one apply CPM on top of the TMM approach, or should one normalize using only one of them to assess the normalization?