edgeR calcNormFactors or normalization factors with voom?
XTR5 • 0

Is it considered best practice to use the calcNormFactors function in edgeR, or to rely on voom alone for normalization across samples?

Using calcNormFactors:

voom(calcNormFactors(DGEList(counts)), design)

I understand that any normalization factors stored in the DGEList will still be used by voom even if normalize.method="none".

Alternatively, just using voom:

voom(counts, design)

I've tried both approaches on my datasets and they give very similar results, but I am wondering whether one approach is considered best practice here.

edgeR limma • 1.2k views

See https://f1000research.com/articles/5-1408, which recommends calcNormFactors.


The calcNormFactors function doesn't normalize anything itself. It calculates normalization factors that are intended to do a better job than the raw library size for performing the scale normalization that voom does by default. In other words, if you use calcNormFactors first, it will use the TMM method to estimate a scaling factor for each sample, and then store those factors in the 'norm.factors' column of the samples data.frame in your DGEList object. The effective library size is then lib.size multiplied by norm.factors. By default the 'norm.factors' column is just all 1s. As an example:

> library(edgeR)
> y <- matrix(rpois(1000, 5), 200)
> dge <- DGEList(y)
> dge$samples
        group lib.size norm.factors
Sample1     1      977            1
Sample2     1     1031            1
Sample3     1      968            1
Sample4     1      965            1
Sample5     1      969            1
> dge <- calcNormFactors(dge)
> dge$samples
        group lib.size norm.factors
Sample1     1      977    0.9916904
Sample2     1     1031    1.0017576
Sample3     1      968    0.9842965
Sample4     1      965    1.0405360
Sample5     1      969    0.9828296

And then when you compute the logCPM values (which is what voom does internally), those norm.factors are used to adjust the library size.
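To make the adjustment concrete, here is a minimal sketch of how the effective library size enters the logCPM calculation. It assumes voom's usual transformation, log2((count + 0.5) / (effective library size + 1) * 1e6); the seed and the intercept-only design are placeholders chosen just for this toy data.

```r
library(edgeR)  # also attaches limma, which provides voom

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
y <- matrix(rpois(1000, 5), 200)  # 200 genes x 5 samples of fake counts
dge <- calcNormFactors(DGEList(y))

# Effective library size = raw library size * TMM normalization factor
eff.lib <- dge$samples$lib.size * dge$samples$norm.factors

# logCPM computed by hand, using the effective library size
logcpm.manual <- t(log2((t(dge$counts) + 0.5) / (eff.lib + 1) * 1e6))

# Intercept-only design: an assumption for this toy data with a single group
design <- matrix(1, ncol(y), 1)
v <- voom(dge, design)

# Compare the hand-computed values with what voom stores in v$E
all.equal(unname(logcpm.manual), unname(v$E))
```

If the comparison holds, the only place the norm.factors enter is the library size in the denominator.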

You should pretty much always use calcNormFactors because it is designed to account for compositional bias. If that's not actually a problem for your data (like this fake data I just made up) then it won't really change things. But if it is a problem, you will account for the bias.
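As a rough illustration of compositional bias, the sketch below inflates a handful of genes in two of five samples. Everything here (the gene and sample counts, the 100-fold inflation, the seed) is made up for the demonstration. One would expect the TMM factors for the inflated samples to fall below 1, shrinking their effective library sizes so the unchanged genes are not pushed down artificially.

```r
library(edgeR)

set.seed(1)  # arbitrary seed for reproducibility
counts <- matrix(rpois(5000, 50), 1000)  # 1000 genes x 5 samples, no real differences

# Hypothetical compositional shift: 20 genes become 100-fold higher in samples 4 and 5,
# which inflates the raw library sizes of those two samples
counts[1:20, 4:5] <- counts[1:20, 4:5] * 100L

dge <- calcNormFactors(DGEList(counts))

# Raw library sizes and TMM factors side by side
dge$samples[, c("lib.size", "norm.factors")]
```

Without the norm.factors, the 980 unchanged genes would look lower in samples 4 and 5 purely because the raw library size of those samples is larger.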

If you use the normalize.method argument in voom, then it will additionally normalize the logCPM values using normalizeBetweenArrays. You could hypothetically use an additional normalization method like that, and there are instances where I thought it was a reasonable thing to do, but that's a pretty rare event. For probably well over 90% of analyses you should just use calcNormFactors and leave normalize.method at its default of "none".
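For comparison, the two options might look like this. The two-group layout and the choice of "quantile" are assumptions for illustration only, not a recommendation.

```r
library(edgeR)  # also attaches limma, which provides voom

set.seed(1)  # arbitrary seed for reproducibility
counts <- matrix(rpois(1000, 5), 200)        # fake counts, 200 genes x 5 samples
group <- factor(c("A", "A", "A", "B", "B"))  # hypothetical two-group layout
design <- model.matrix(~ group)

dge <- calcNormFactors(DGEList(counts))

# Usual route: TMM factors only (normalize.method defaults to "none")
v.tmm <- voom(dge, design)

# Rare case: TMM factors plus an extra between-sample normalization of the logCPM values
v.extra <- voom(dge, design, normalize.method = "quantile")
```

Both calls use the norm.factors stored in dge; the second one simply applies a further step on top.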


I understand now what calcNormFactors is doing. Thank you for clarifying that it is "intended to do a better job than the raw library size for performing the scale normalization that voom does by default."

