Hello everyone:
I have recently been working on RNA-seq data with edgeR. I tried to find which function normalizes read counts between samples and found two candidates: "betweenLaneNormalization" and "calcNormFactors".
I read the description of "betweenLaneNormalization". It says, "Between-lane normalization for sequencing depth and possibly other distributional differences between lanes.", but my input is only the gene counts per sample, with no information about lanes, so I don't know what "betweenLaneNormalization" is normalizing for. Does the term "lane" mean a lane on the sequencing chip?
Sorry, I put this question in the wrong place.
We want to compare CPM values between samples, so we want to remove unwanted variation, such as differences in library size and distortion caused by outliers.
As far as I know, betweenLaneNormalization() is a function from EDASeq that scales the counts, so I think it can deal with the variation due to different library sizes.
I still don't understand what calcNormFactors() does. I found that the cpm() output differs before and after calling calcNormFactors(), but I don't know how it works.
The standard recommended way to obtain CPM or log-CPM values in edgeR is by
and this will normalize between samples and account for different library sizes. There is no need for any other functions.
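A minimal sketch of the usual edgeR sequence, assuming a matrix of raw counts with genes in rows and samples in columns. The counts here are simulated, and the object names (counts, dge, CPM, logCPM, eff_lib, manual_cpm) are made up for illustration. The last lines also show why cpm() changes after calcNormFactors(): cpm() divides by the effective library size, which is lib.size multiplied by norm.factors.

```r
library(edgeR)

# Simulated raw counts (hypothetical data): 1000 genes x 4 samples
set.seed(1)
counts <- matrix(rnbinom(4000, mu = 50, size = 5), nrow = 1000, ncol = 4,
                 dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:4)))

# Build the DGEList and estimate scaling factors (TMM is the default method)
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)

# CPM and log2-CPM, both using the normalized (effective) library sizes
CPM    <- cpm(dge)
logCPM <- cpm(dge, log = TRUE)

# Manual check: effective library size = lib.size * norm.factors
eff_lib    <- dge$samples$lib.size * dge$samples$norm.factors
manual_cpm <- t(t(counts) / eff_lib) * 1e6
all.equal(CPM, manual_cpm, check.attributes = FALSE)
```

Before calcNormFactors() is called, all norm.factors are 1, so cpm() divides by the raw library sizes only, which is why the two outputs differ.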
Type
help("calcNormFactors")
to read what calcNormFactors() does.
Thanks for your advice!!