Is there any reason to think that normalization (e.g. TMM) doesn't work well with samples that have very different raw counts?
Lucas • 0
@1286bb8c
Last seen 13 months ago
France

Like the title says, I would just like to know whether there are any problems with using samples that have very different raw counts (e.g. 1M raw reads vs 20M raw reads), even if I apply appropriate normalization.

limma RNASeqData edgeR • 635 views
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

TMM will not work well for samples where the library size is so small that most of the counts become zero. If the library sizes are sufficiently large to have reasonable coverage of expressed genes, then differences in library size do not pose any problems and have little effect on the effectiveness of normalization.
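To get a feel for when "most of the counts become zero", here is a minimal sketch in plain Python. It assumes Poisson sampling and a made-up expression profile (10,000 genes with relative abundances spanning six orders of magnitude); both are illustrative assumptions, not real data, and the function name is hypothetical.

```python
import math

def expected_zero_fraction(proportions, library_size):
    """Expected share of genes with a zero count, assuming Poisson sampling."""
    # For a gene with mean m = proportion * library_size, P(count == 0) = exp(-m).
    return sum(math.exp(-p * library_size) for p in proportions) / len(proportions)

# Hypothetical expression profile (an assumption, not real data):
# 10,000 genes whose relative abundances span six orders of magnitude.
n_genes = 10_000
weights = [10 ** (-6 * i / (n_genes - 1)) for i in range(n_genes)]
total = sum(weights)
proportions = [w / total for w in weights]

library_sizes = [10_000, 100_000, 1_000_000, 20_000_000]
zero_fractions = [expected_zero_fraction(proportions, n) for n in library_sizes]

for n, z in zip(library_sizes, zero_fractions):
    print(f"library size {n:>10,}: expected zero fraction {z:.2f}")
```

The exact numbers depend entirely on the assumed profile; the point is only that the zero fraction falls as depth rises, so the question to ask of a 1M-read library is how many expressed genes still have usable counts.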

A library size of 1 million is on the small side, but is probably ok.
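As a rough illustration of why a depth difference by itself is harmless, the sketch below computes a simplified TMM-style factor in plain Python. It is not edgeR's implementation: the real TMM also trims on average abundance and uses precision weights, while this version only takes an unweighted trimmed mean of log ratios. The function name and the toy counts are hypothetical.

```python
import math

def trimmed_log_ratio_factor(counts_ref, counts_obs, trim=0.3):
    """Simplified TMM-style scaling factor of counts_obs relative to counts_ref."""
    n_ref, n_obs = sum(counts_ref), sum(counts_obs)
    # Per-gene log2 ratio of library-size-scaled counts.
    # Genes with a zero in either sample are skipped.
    m_values = sorted(
        math.log2((o / n_obs) / (r / n_ref))
        for r, o in zip(counts_ref, counts_obs)
        if r > 0 and o > 0
    )
    # Drop the most extreme log ratios on both sides, then average the rest.
    k = int(len(m_values) * trim)
    kept = m_values[k:len(m_values) - k]
    if not kept:
        raise ValueError("too few genes with non-zero counts in both samples")
    return 2 ** (sum(kept) / len(kept))

# Toy data (an assumption): the deep sample is the shallow one at 20x depth.
shallow = [(i % 50) + 1 for i in range(1000)]
deep = [20 * c for c in shallow]
factor_depth_only = trimmed_log_ratio_factor(shallow, deep)

# Same depth difference, plus 20 genes strongly up-regulated in the deep sample.
deep_shifted = [c * 50 if i < 20 else c for i, c in enumerate(deep)]
factor_shifted = trimmed_log_ratio_factor(shallow, deep_shifted)

print(f"depth difference only: factor = {factor_depth_only:.3f}")
print(f"with composition shift: factor = {factor_shifted:.3f}")
```

In the first case the 20-fold depth difference is fully absorbed by the library sizes, so the factor stays at about 1. Only the composition shift in the second case moves it, which is the situation TMM is meant to correct.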

Hilary ▴ 30
@hilary-8596
Last seen 13 months ago
United States

It may help to visualize the results and check whether replicate samples that are expected to cluster together actually do, or whether the samples with extreme differences in raw counts appear as outliers on plots. For example, you may want to try the plotMDS function in edgeR.
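If you want a quick numeric version of the same check before plotting, the sketch below compares samples by the root-mean-square difference of their log-CPM values in plain Python. This is a simplified stand-in and not what plotMDS computes; the sample names, the toy counts and the fixed prior count are all assumptions made for illustration.

```python
import math
from itertools import combinations

def log_cpm(counts, prior=0.5):
    """log2 counts-per-million with a small prior count to avoid log(0)."""
    lib_size = sum(counts)
    return [math.log2((c + prior) / lib_size * 1e6) for c in counts]

def rms_distance(x, y):
    """Root-mean-square difference between two log-CPM profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Toy data (an assumption): two groups, each with a shallow and a 20x deeper replicate.
base_a = [(i % 40) + 5 for i in range(200)]
base_b = [c * 8 if i < 40 else c for i, c in enumerate(base_a)]  # 40 genes changed
samples = {
    "A_shallow": base_a,
    "A_deep": [20 * c for c in base_a],
    "B_shallow": base_b,
    "B_deep": [20 * c for c in base_b],
}

profiles = {name: log_cpm(counts) for name, counts in samples.items()}
distances = {
    (s, t): rms_distance(profiles[s], profiles[t])
    for s, t in combinations(profiles, 2)
}

for (s, t), d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{s:>9} vs {t:<9}: {d:.3f}")
```

If replicates of the same group sit much closer to each other than to the other group regardless of depth, as they do in this toy data, the library size difference is not driving the sample structure.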
