Question: CQN and EdgeR Library Size for Normalization
18 months ago by
shankasal0 wrote:

I'm performing quantile normalization with CQN and then running edgeR on some ATAC-seq samples, and I'm trying to determine the following:

When setting the values for library size, should I use the sum of the read counts that fall within peaks (computed separately for each sample), or should I use the total aligned reads per sample?

Thanks

Answer: CQN and EdgeR Library Size for Normalization
2
18 months ago by
Aaron Lun25k
Cambridge, United Kingdom
Aaron Lun25k wrote:

I have tended to use the total aligned reads per sample for edgeR's lib.size when performing differential binding analyses, because it is easier to interpret as sequencing depth. By contrast, any global increase or decrease in binding (or in this case, accessibility) between conditions would alter the proportion of reads in peaks, so a library size based on reads in peaks would conflate technical differences in sequencing depth with actual biological differences in chromatin structure.
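
A toy calculation may make this concrete. The numbers below are invented for illustration (two hypothetical samples sequenced to the same depth, with accessibility doubling genome-wide in the second), and `cpm` and `fold_change` are helpers defined here, not edgeR functions.

```python
# Toy illustration with made-up numbers; not real data and not edgeR code.

def cpm(count, lib_size):
    """Counts per million for one peak, given a library size."""
    return count / lib_size * 1e6

# Two hypothetical samples sequenced to the same depth.
total_aligned = {"control": 10_000_000, "treated": 10_000_000}

# Accessibility doubles genome-wide in the treated sample, so every peak
# (and hence the sum over all peaks) receives twice as many reads.
reads_in_peaks = {"control": 2_000_000, "treated": 4_000_000}
peak_count = {"control": 100, "treated": 200}

def fold_change(lib_sizes):
    """Treated/control ratio of CPMs for the example peak."""
    treated = cpm(peak_count["treated"], lib_sizes["treated"])
    control = cpm(peak_count["control"], lib_sizes["control"])
    return treated / control

fc_total = fold_change(total_aligned)    # depth-based library size
fc_peaks = fold_change(reads_in_peaks)   # reads-in-peaks library size

print(f"fold change using total aligned reads: {fc_total:.2f}")
print(f"fold change using reads in peaks:      {fc_peaks:.2f}")
```

With total aligned reads as the library size, the doubling shows up as a two-fold change. With reads in peaks, the library size doubles along with the signal and the fold change collapses to one, so the global change is absorbed as if it were a depth difference.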

For the actual differential analysis, though, it barely matters. The CQN offsets will override any library size specification. More generally, if you computed TMM normalization factors, they would also compensate for any differences in the library size specification. A different set of library sizes will alter the calculation of the average log-CPMs and predicted log-fold changes, but this should be a very modest effect.
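
The override can be sketched with a stripped-down log-linear fit for one peak in two samples. This is only an illustration of the general idea, assuming a saturated two-sample model, made-up counts and offsets, and the convention that supplied offsets replace the log library sizes. It is not edgeR's or cqn's code, real CQN offsets are computed per feature and per sample, and `fitted_log2_fc` and `avg_log2_cpm` are hypothetical helpers.

```python
import math

def fitted_log2_fc(counts, lib_sizes, offsets=None):
    """log2 fold change (sample 2 vs sample 1) for one peak under a saturated
    log-linear model log(mu_i) = offset_i + b0 + b1 * x_i.

    Assumed convention for this toy: supplied offsets replace the log
    library sizes; lib_sizes are used only when no offsets are given.
    """
    if offsets is None:
        offsets = [math.log(n) for n in lib_sizes]
    y1, y2 = counts
    o1, o2 = offsets
    return ((math.log(y2) - o2) - (math.log(y1) - o1)) / math.log(2)

def avg_log2_cpm(counts, lib_sizes):
    """Mean log2 counts-per-million across samples (no prior count)."""
    vals = [math.log2(y / n * 1e6) for y, n in zip(counts, lib_sizes)]
    return sum(vals) / len(vals)

counts = (100, 260)                        # made-up counts for one peak
offsets = (math.log(1.0), math.log(1.3))   # made-up offsets (natural log)

lib_total = (10_000_000, 13_000_000)       # e.g. total aligned reads
lib_peaks = (2_000_000, 3_000_000)         # e.g. reads in peaks

# With offsets supplied, the choice of library size makes no difference.
lfc_total = fitted_log2_fc(counts, lib_total, offsets)
lfc_peaks = fitted_log2_fc(counts, lib_peaks, offsets)

# Without offsets, the library sizes drive the fit and the results differ.
lfc_total_no_offset = fitted_log2_fc(counts, lib_total)
lfc_peaks_no_offset = fitted_log2_fc(counts, lib_peaks)

# The log-CPM summary still depends on the library sizes either way.
a_total = avg_log2_cpm(counts, lib_total)
a_peaks = avg_log2_cpm(counts, lib_peaks)

print(lfc_total, lfc_peaks)
print(lfc_total_no_offset, lfc_peaks_no_offset)
print(a_total, a_peaks)
```

In this toy version the average log-CPM does shift between the two library size choices, but the size of the shift is set by the library sizes alone, so it is the same for every peak.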

Answer: CQN and EdgeR Library Size for Normalization
0
18 months ago by
shankasal0 wrote:

Thanks Aaron, that's a satisfying answer. I had been using the total aligned reads and will continue to do so.