Hi all,
I'm having trouble understanding the numbers I get when running the TMM normalisation on my data set from S. cerevisiae.
I have several samples, but one of them is a lot smaller than the others (X_97_1h_1, see table below).
From reading some of the posts in this forum, I understand that edgeR does not use the norm.factors on their own, but rather the product norm.factors * lib.size. This product gives the effective library size, which is then used in downstream analysis.
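To make sure I have this right, here is how I think the effective library size would be computed. This is only a sketch: the `samples` data frame is a hypothetical stand-in holding two rows copied from my table below; with the real object the equivalent would be y$samples$lib.size * y$samples$norm.factors.

```r
# Hypothetical stand-in for y$samples, with two rows taken from the table below
samples <- data.frame(
  lib.size     = c(4801445, 340190),
  norm.factors = c(1.0390857, 0.9584462),
  row.names    = c("51_248_0h_1", "X_97_1h_1")
)

# Effective library size = raw library size * TMM normalisation factor
eff.lib.size <- samples$lib.size * samples$norm.factors
names(eff.lib.size) <- rownames(samples)

eff.lib.size
```

If this is correct, the small library still ends up with a much smaller effective library size than the others, even though its norm.factor is close to 1.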
What I don't understand is how the norm.factors are calculated. Even though the library size is taken into account when calculating differential gene expression, how come the norm.factors are still similar when the library sizes are not?
Another question in my case is which value is then used later on: the effective library size, or the norm.factors given in the table (y$samples$norm.factors)?
This is important in my case, as I am not just using edgeR to normalise the data, but also use the normalised data outside of edgeR for visualisation. I am calculating the overlap of reads over the genome (using summarizeOverlaps) and counting reads in bins 500 bases long. To create the wig files for the browser, I take the raw read counts of each library and multiply them by the given norm.factor (e.g. for X_97_1h_1 I use 0.9584462 as the norm.factor by which I multiply each row in the count table).
This is how I ran the normalisation (I only have duplicates):
> y <- DGEList(counts=countTable, group=rep(1:18, each=2))
> y <- calcNormFactors(y, method="TMM")
> y$samples
                      group lib.size norm.factors
51_248_0h_1     51_248_0h_1  4801445    1.0390857
51_248_0h_2     51_248_0h_2  1644252    1.0393724
51_248_1h_1     51_248_1h_1  3297504    1.0464985
51_248_1h_2     51_248_1h_2  2222688    1.0171469
51_248_4h_1     51_248_4h_1  4679946    1.0074098
51_248_4h_2     51_248_4h_2  3024524    0.9885031
ctrl_110_0h_1 ctrl_110_0h_1  3769422    1.0582047
ctrl_110_0h_2 ctrl_110_0h_2  3650192    1.0630055
ctrl_110_1h_1 ctrl_110_1h_1  4275661    1.0542222
ctrl_110_1h_2 ctrl_110_1h_2  4348709    1.0602291
ctrl_110_4h_1 ctrl_110_4h_1  3507238    1.0648724
ctrl_110_4h_2 ctrl_110_4h_2  4324472    1.0700604
ctrl_248_0h_1 ctrl_248_0h_1  4819007    0.9628215
ctrl_248_0h_2 ctrl_248_0h_2  4573513    1.0564647
ctrl_248_1h_1 ctrl_248_1h_1  4834486    0.9610896
ctrl_248_1h_2 ctrl_248_1h_2  4297190    1.0468209
ctrl_248_4h_1 ctrl_248_4h_1  7834379    1.0270228
ctrl_248_4h_2 ctrl_248_4h_2  5017690    1.0747524
ctrl_97_0h_1   ctrl_97_0h_1  4025521    1.0027374
ctrl_97_0h_2   ctrl_97_0h_2  3803086    1.0271279
ctrl_97_1h_1   ctrl_97_1h_1  4124150    1.0060742
ctrl_97_1h_2   ctrl_97_1h_2  4114575    1.0235497
ctrl_97_4h_1   ctrl_97_4h_1  3468361    1.0699728
ctrl_97_4h_2   ctrl_97_4h_2  4065669    1.0654684
X_110_0h_1       X_110_0h_1  2763927    0.9538789
X_110_0h_2       X_110_0h_2  2882729    0.9265470
X_110_1h_1       X_110_1h_1  3059491    0.9551635
X_110_1h_2       X_110_1h_2  3208547    0.9368711
X_110_4h_1       X_110_4h_1  2862389    0.9656174
X_110_4h_2       X_110_4h_2  2984518    0.8909318
X_97_0h_1         X_97_0h_1  2811017    0.9374170
X_97_0h_2         X_97_0h_2  2134669    0.8990688
X_97_1h_1         X_97_1h_1   340190    0.9584462
X_97_1h_2         X_97_1h_2  2108722    0.9048725
X_97_4h_1         X_97_4h_1  3646497    0.9503031
X_97_4h_2         X_97_4h_2  2934569    0.9445557