Hi all,
I'm having trouble understanding the numbers I get when running the TMM normalisation on my data set from S. cerevisiae.
I have several samples, but one of them is much smaller than the others (X_97_1h_1, see the table below).
From reading some of the posts in this forum, I understand that edgeR does not use norm.factors on its own, but rather the product norm.factors * lib.size. This product is the effective library size, which is then used in downstream analysis.
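To check that I have the definition right, here is a minimal base R sketch of the effective library size for two samples, using the values from the table below. The vector names are my own; with the DGEList itself I assume the equivalent would be y$samples$lib.size * y$samples$norm.factors.

```r
# Library sizes and TMM factors copied from y$samples for two samples
lib.size     <- c(X_97_1h_1 = 340190,    ctrl_248_4h_1 = 7834379)
norm.factors <- c(X_97_1h_1 = 0.9584462, ctrl_248_4h_1 = 1.0270228)

# Effective library size = lib.size * norm.factors (element-wise, per sample)
eff.lib.size <- lib.size * norm.factors
print(eff.lib.size)
```

So the small library still ends up with a much smaller effective library size, even though its norm.factor is close to 1.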
What I don't understand is how the norm.factor is calculated. Even though the library size is taken into account when calculating differential gene expression, how come the norm.factors are still similar, when the libraries are not?
Another question in my case is which value is then used later on: the effective library size, or the norm.factors given in the table (y$samples$norm.factors)?
This matters because I am not only using edgeR to normalize the data, but also using the result outside of edgeR for visualization. I calculate the overlap of reads with the genome (using summarizeOverlaps), counting reads in bins of 500 bases. To create the wig files for the browser, I take the raw read counts of each library and multiply them by the given norm.factor (e.g. for X_97_1h_1 I multiply every row of that sample's column in the count table by 0.9584462).
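To make the two options concrete, here is a small base R sketch. The bin counts are made-up toy values (not my real data) and the object names are hypothetical. The first scaling is what I currently do. The second divides by the effective library size instead, and is only my guess at what the alternative would look like.

```r
# Toy bin counts (hypothetical values): 3 bins x 2 samples
binCounts <- matrix(c(10, 20, 30, 100, 200, 300), nrow = 3,
                    dimnames = list(paste0("bin", 1:3),
                                    c("X_97_1h_1", "ctrl_248_4h_1")))

# Values taken from y$samples for these two samples
lib.size     <- c(X_97_1h_1 = 340190,    ctrl_248_4h_1 = 7834379)
norm.factors <- c(X_97_1h_1 = 0.9584462, ctrl_248_4h_1 = 1.0270228)

# Option 1 (what I do now): multiply each sample's column by its norm.factor
scaledByFactor <- sweep(binCounts, 2, norm.factors, "*")

# Option 2 (assumption): counts per million of effective library size
eff.lib.size <- lib.size * norm.factors
cpmByEffLib  <- sweep(binCounts, 2, eff.lib.size, "/") * 1e6
```

With option 1 the library size is never used, so the tracks for the small library stay on a much lower scale than the others.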
This is how I ran the normalisation (I only have duplicates):
> y <- DGEList(counts = countTable, group = rep(1:18, each = 2))
> y <- calcNormFactors(y, method = "TMM")
> y$samples
group lib.size norm.factors
51_248_0h_1 51_248_0h_1 4801445 1.0390857
51_248_0h_2 51_248_0h_2 1644252 1.0393724
51_248_1h_1 51_248_1h_1 3297504 1.0464985
51_248_1h_2 51_248_1h_2 2222688 1.0171469
51_248_4h_1 51_248_4h_1 4679946 1.0074098
51_248_4h_2 51_248_4h_2 3024524 0.9885031
ctrl_110_0h_1 ctrl_110_0h_1 3769422 1.0582047
ctrl_110_0h_2 ctrl_110_0h_2 3650192 1.0630055
ctrl_110_1h_1 ctrl_110_1h_1 4275661 1.0542222
ctrl_110_1h_2 ctrl_110_1h_2 4348709 1.0602291
ctrl_110_4h_1 ctrl_110_4h_1 3507238 1.0648724
ctrl_110_4h_2 ctrl_110_4h_2 4324472 1.0700604
ctrl_248_0h_1 ctrl_248_0h_1 4819007 0.9628215
ctrl_248_0h_2 ctrl_248_0h_2 4573513 1.0564647
ctrl_248_1h_1 ctrl_248_1h_1 4834486 0.9610896
ctrl_248_1h_2 ctrl_248_1h_2 4297190 1.0468209
ctrl_248_4h_1 ctrl_248_4h_1 7834379 1.0270228
ctrl_248_4h_2 ctrl_248_4h_2 5017690 1.0747524
ctrl_97_0h_1 ctrl_97_0h_1 4025521 1.0027374
ctrl_97_0h_2 ctrl_97_0h_2 3803086 1.0271279
ctrl_97_1h_1 ctrl_97_1h_1 4124150 1.0060742
ctrl_97_1h_2 ctrl_97_1h_2 4114575 1.0235497
ctrl_97_4h_1 ctrl_97_4h_1 3468361 1.0699728
ctrl_97_4h_2 ctrl_97_4h_2 4065669 1.0654684
X_110_0h_1 X_110_0h_1 2763927 0.9538789
X_110_0h_2 X_110_0h_2 2882729 0.9265470
X_110_1h_1 X_110_1h_1 3059491 0.9551635
X_110_1h_2 X_110_1h_2 3208547 0.9368711
X_110_4h_1 X_110_4h_1 2862389 0.9656174
X_110_4h_2 X_110_4h_2 2984518 0.8909318
X_97_0h_1 X_97_0h_1 2811017 0.9374170
X_97_0h_2 X_97_0h_2 2134669 0.8990688
X_97_1h_1 X_97_1h_1 340190 0.9584462
X_97_1h_2 X_97_1h_2 2108722 0.9048725
X_97_4h_1 X_97_4h_1 3646497 0.9503031
X_97_4h_2 X_97_4h_2 2934569 0.9445557
