Question

Variance explained (coefficient of determination) in glmFit / glmLRT

0

Entering edit mode

Gemma • 0

@2509afca

Last seen 23 months ago

Austria

I am trying to run a model where I am predicting gene expression as a function of some fixed factors, using glmFit to fit the model and glmLRT to get the significance of each fixed effect. I would like to get the amount of variation explained by each fixed effect (i.e. coefficient of determination or similar). Is there a way to do that with an object of class "DGEGLM"?

More concretely, I would like to get the variance in expression for each gene (20 of them) explained by AG, PO and MG (the three fixed effects I am including in the model).

Many thanks in advance!

Here is a link to the data: https://drive.google.com/drive/folders/1Sgc3DTmr3SxNycyFFuO5PCrM8lqhnPaU?usp=sharing

And here the code:


c_AxB_A_r1 = read.csv("83Fx332M_fh_R1_83.csv")
c_AxB_B_r1 = read.csv("83Fx332M_fh_R1_332.csv")
c_AxB_A_r2 = read.csv("83Fx332M_fh_R2_83.csv")
c_AxB_B_r2 = read.csv("83Fx332M_fh_R2_332.csv")

c_BxA_A_r1 = read.csv("332Fx83M_fh_R1_83.csv")
c_BxA_B_r1 = read.csv("332Fx83M_fh_R1_332.csv")
c_BxA_A_r2 = read.csv("332Fx83M_fh_R2_83.csv")
c_BxA_B_r2 = read.csv("332Fx83M_fh_R2_332.csv")

genenames=c_AxB_A_r1$gene

y <- DGEList(counts=cbind(c_AxB_A_r1$counts, c_AxB_A_r2$counts, c_AxB_B_r1$counts, c_AxB_B_r2$counts, 
                          c_BxA_A_r1$counts, c_BxA_A_r2$counts, c_BxA_B_r1$counts, c_BxA_B_r2$counts), genes=genenames)

isexpr <- rowSums(cpm(y)>1) >= 4
y <- y[isexpr, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)

AG <- factor(c(0, 0, 1, 1, 0, 0, 1, 1))
PO <- factor(c(0, 0, 1, 1, 1, 1, 0, 0))
MG <- factor(c(0, 0, 0, 0, 1, 1, 1, 1))

design <- model.matrix(~AG+PO+MG)
rownames(design) <- colnames(y)

y <- estimateDisp(y, design, robust=TRUE)
y$common.dispersion

fit <- glmFit(y, design)

model_AG <- glmLRT(fit, coef=2)
results_AG <- model_AG$table
FDR <- p.adjust(results_AG$PValue, method="fdr")
results_AG <- cbind(results_AG, FDR, y$genes$genes)
colnames(results_AG)[6] <- "gene"

model_PO <- glmLRT(fit, coef=3)
results_PO <- model_PO$table
FDR <- p.adjust(results_PO$PValue, method="fdr")
results_PO <- cbind(results_PO, FDR, y$genes$genes)
colnames(results_PO)[6] <- "gene"

model_MG <- glmLRT(fit, coef=4)
results_MG <- model_MG$table
FDR <- p.adjust(results_MG$PValue, method="fdr")
results_MG <- cbind(results_MG, FDR, y$genes$genes)
colnames(results_MG)[6] <- "gene"

edgeR • 803 views

ADD COMMENT • link 24 months ago Gemma • 0

score 0 · Answer 1 · 2022-05-04

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

The concept of "coefficient of determination" is only defined for overall regressions, not for individual predictor variables. Assigning how much variability is explained by each individual covariate is not generally possible for any linear model or generalized linear model because the covariates overlap in terms of the variability that they predict. This is not a limitation of edgeR, but a universal property of linear models.

edgeR can return any quantity that would exist for an ordinary univariate glm. For example, one can easily perform a sequential analysis of deviance for each gene, but the proportion of the total deviance attributable to each predictor depends on the order that they are added to the model, i.e., which predictor is adjusted for which.

ADD COMMENT • link 24 months ago Gordon Smyth 50k

0

Entering edit mode

Many thanks for your answer, that makes sense.

I imagine one could then compute a partial or semipartial measures of variance contributed by each fixed factor when considering the other predictors (or variance contributed irrespective of the other predictors - which I guess would be similar to just running the models with only one predictor at a time).

I for example found this package (https://cran.r-project.org/web/packages/partR2/vignettes/Using_partR2.html) that can do such analysis on ordinary glms.

Could you please suggest a way of doing this using edgeR?

Many thanks again!

ADD REPLY • link 24 months ago Gemma • 0