Question

Advice needed: Using limma for differential pathway analysis with continuous disease progression scores

Tadeoye ▴ 20

@98d490f8

United States

Urgent advice needed, please!

I'm analyzing pathway activities derived using AUCell from single-cell RNA sequencing data. My specific research question is to identify which pathways vary significantly along disease progression. The disease progression scores are continuous values from 0 to 1, obtained per subject/patient.

I'm unsure how to construct an appropriate design matrix that includes covariates like age and sex, and then how to use limma to test associations between pathway activities and the continuous disease progression metric. There are about 84 patients altogether, but the actual cell × pathway data matrix contains over 100k cells (representing pseudo-replicates at the patient level) for each pathway.

The current model I have looks something like this.

design <- model.matrix(~ 0 + CPS + age + sex, data=data)

fit <- lmFit(activity_scores_matrix, design)

fit <- eBayes(fit)

result <- topTable(fit, adjust.method = "BH",
                   number = Inf, confint = TRUE) %>%
  arrange(P.Value)

Is this the appropriate way to model continuous relationships in limma? I'm particularly uncertain about using eBayes() with a continuous predictor, since I understood limma to be designed originally for categorical comparisons. Should I include patient IDs as random effects using duplicateCorrelation?
Would alternative approaches be more suitable for detecting pathway activity changes along a continuous disease progression scale? I remember reading something about splines a while back, but I can't quite put my hands on it.

Any guidance on best practices for using limma with continuous variables would be greatly appreciated. I want to ensure I'm using the most appropriate statistical framework for this analysis.

Thank you for your help!

limma DifferentialExpression Designmatrix • 334 views

score 0 · Answer 1 · 2024-12-10

Gordon Smyth 52k

@gordon-smyth

WEHI, Melbourne, Australia

limma has handled any combination of continuous and categorical variables from the very beginning. You simply add continuous variables to the model and test them exactly as you would a categorical variable with 2 levels.

Please do not remove the intercept term by typing 0+ in the model formula. That is almost never appropriate with continuous variables.
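
To illustrate the two points above, here is a minimal sketch on simulated data. It keeps the intercept in the design and tests the continuous variable by naming its coefficient in topTable(). The data frame `pheno`, the matrix `activity`, and all simulated values are hypothetical stand-ins, and the sketch assumes a matrix with one column per row of the design. It does not address how cell-level scores should be related to patients.

```r
library(limma)

set.seed(1)

# Hypothetical patient-level covariates (simulated, not from the original post)
n_patients <- 84
pheno <- data.frame(
  CPS = runif(n_patients),                                   # continuous score in [0, 1]
  age = round(rnorm(n_patients, mean = 60, sd = 10)),
  sex = factor(sample(c("F", "M"), n_patients, replace = TRUE))
)

# Hypothetical pathway-by-sample activity matrix (rows = pathways, columns = samples)
n_pathways <- 50
activity <- matrix(
  rnorm(n_pathways * n_patients, sd = 0.1),
  nrow = n_pathways,
  dimnames = list(paste0("pathway", seq_len(n_pathways)), NULL)
)
# Make one pathway truly depend on CPS so there is something to detect
activity[1, ] <- activity[1, ] + 0.5 * pheno$CPS

# Keep the intercept: no "0 +" in the formula
design <- model.matrix(~ CPS + age + sex, data = pheno)

fit <- lmFit(activity, design)
fit <- eBayes(fit)

# Test the CPS slope only, adjusting for age and sex
res <- topTable(fit, coef = "CPS", number = Inf,
                adjust.method = "BH", confint = TRUE, sort.by = "P")
head(res)
```

With a single coefficient selected, the logFC column is the estimated change in activity per unit increase in CPS, and CI.L and CI.R give its confidence limits.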