3.3 years ago by

Cambridge, United Kingdom

The fitted values account for library sizes, but they are not "normalized" with respect to library size. Samples with larger library sizes will have larger fitted values, which is the opposite of what one would normally expect from normalization. Similarly, the effect of any nuisance covariates in the GLM will still be included in the fitted values.

In addition, the fitted values won't contain observation-specific errors. This means that they're proportional to the mean of counts rather than the counts themselves. This may not be desirable for downstream applications - groups with more samples will have more precise estimates of the mean, whereas groups with fewer samples will have more variable fitted values. Treating the fitted values as "counts" with similar mean-variance relationships would be inappropriate.

If you want corrected and normalized expression values for each sample, I would suggest using the `cpm`

method. You can then use `removeBatchEffect`

to get rid of any batch effects or problematic covariates. If you want corrected and normalized (log-)expression values for each group, I would use the GLM coefficients, though their interpretation would depend on how you've parametrized your model.