edgeR GLM fitted values
Mattia ▴ 10
@mattia-9769
Milano

Hi,

I'm wondering whether the "fitted values" from an edgeR GLM could be used as normalized pseudo-counts, i.e. as the expression matrix for a downstream classification step. I fitted a GLM because I needed to correct the data for several covariates.

Thanks,

Mattia.

edgeR GLM classification rnaseq • 2.2k views
Aaron Lun ★ 28k
@alun
The city by the bay

The fitted values account for library sizes, but they are not "normalized" with respect to library size. Samples with larger library sizes will have larger fitted values, which is the opposite of what one would normally expect from normalization. Similarly, the effect of any nuisance covariates in the GLM will still be included in the fitted values.
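
To make the library size point concrete, here is a minimal numeric sketch in plain Python (not edgeR code). It assumes a log-link model with the log library size as an offset, so that the fitted mean is the library size times exp(coefficient). The abundance and library sizes are made-up values for illustration.

```python
import math

# Hypothetical values for one gene with the same relative abundance in both samples.
abundance = 2e-5                      # expected fraction of reads from this gene
lib_sizes = [1_000_000, 4_000_000]    # two samples from the same group

# With a log link and log(library size) as offset:
#   mu = exp(log(N) + beta) = N * exp(beta)
beta = math.log(abundance)            # intercept-only coefficient, natural-log scale
fitted = [math.exp(math.log(n) + beta) for n in lib_sizes]

# Dividing by the library size gives a value that is comparable across samples.
fitted_cpm = [mu / n * 1e6 for mu, n in zip(fitted, lib_sizes)]

print(fitted)      # roughly 20 and 80: larger library, larger fitted value
print(fitted_cpm)  # roughly 20 and 20 once library size is divided out
```

The two samples have identical underlying expression, yet the fitted value is four times larger in the deeper library, which is why fitted values cannot stand in for normalized expression.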

In addition, the fitted values won't contain observation-specific errors. This means that they're proportional to the mean of counts rather than the counts themselves. This may not be desirable for downstream applications - groups with more samples will have more precise estimates of the mean, whereas groups with fewer samples will have more variable fitted values. Treating the fitted values as "counts" with similar mean-variance relationships would be inappropriate.
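
A small sketch of the missing observation-level variation, again in plain Python with made-up counts. It assumes a one-way layout with equal library sizes within each group, in which case the fitted value for every sample in a group is simply the group mean of the counts.

```python
from statistics import mean, pvariance

# Hypothetical counts for one gene; equal library sizes assumed within each group.
counts = {"A": [12, 25, 23], "B": [40, 70]}

# Under that assumption each sample's fitted value is its group mean.
fitted = {g: [mean(y)] * len(y) for g, y in counts.items()}

# Within-group spread of the observed counts versus the fitted values.
var_counts = {g: pvariance(y) for g, y in counts.items()}
var_fitted = {g: pvariance(mu) for g, mu in fitted.items()}

print(fitted)
print(var_counts)
print(var_fitted)  # zero within every group
```

All within-group variability disappears from the fitted values, so a classifier trained on them would see artificially tight groups. The mean for group B also rests on only two samples, so it is a noisier estimate than the one for group A.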

If you want corrected and normalized expression values for each sample, I would suggest using the cpm method. You can then use removeBatchEffect to get rid of any batch effects or problematic covariates. If you want corrected and normalized (log-)expression values for each group, I would use the GLM coefficients, though their interpretation would depend on how you've parametrized your model.
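
The per-sample route can be sketched in plain Python as follows. This is a simplified stand-in and not the edgeR or limma implementation: `log_cpm` and `remove_batch_means` are hypothetical helpers, the prior count handling is cruder than what `cpm` does, and the batch correction just recentres batch means without protecting any group differences (in limma you would pass a design matrix to `removeBatchEffect` for that). The counts, library sizes and batch labels are made up.

```python
import math
from statistics import mean

def log_cpm(count, lib_size, prior=0.5):
    """Simplified log2 counts-per-million; the prior avoids taking log of zero."""
    return math.log2((count + prior) / (lib_size + 2 * prior) * 1e6)

def remove_batch_means(values, batches):
    """Subtract each batch's deviation from the average of the batch means."""
    levels = sorted(set(batches))
    batch_mean = {
        b: mean([v for v, bb in zip(values, batches) if bb == b]) for b in levels
    }
    centre = mean(list(batch_mean.values()))
    return [v - (batch_mean[b] - centre) for v, b in zip(values, batches)]

# Hypothetical data for one gene across four samples in two batches.
counts = [50, 60, 210, 230]
lib_sizes = [1_000_000, 1_200_000, 2_000_000, 2_200_000]
batches = ["b1", "b1", "b2", "b2"]

expr = [log_cpm(c, n) for c, n in zip(counts, lib_sizes)]
corrected = remove_batch_means(expr, batches)

print(expr)
print(corrected)  # batch means now coincide; within-batch differences are kept
```

Unlike fitted values, the corrected log-CPM values retain the sample-to-sample variation within each batch, which is what a downstream classifier needs to see.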
