
Hi all,

Could anyone please help me design a limma model for testing the association between a categorical variable and EPIC array methylation data?

I am using the model below but keep getting an error; `status` is the categorical variable here:

```
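library(limma)

# Design matrix: status is the categorical variable of interest
var1 <- model.matrix(~ status + as.factor(sex) + Age + CD8T + CD4T + NK + Bcell + Mono + smoking + PC1 + PC2 + PC3, data = targets2)
fit1 <- lmFit(mval, var1)
# Pass fit1 (not fit) to eBayes
fit1 <- eBayes(fit1, trend = TRUE, robust = TRUE)
# coef = 2 is the first status column; if status has more than two levels, pass all status columns to coef
probe <- topTable(fit1, adjust.method = "BH", coef = 2, number = Inf)
sig.probe <- probe[which(probe$adj.P.Val <= 0.05), ]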
```

Many thanks

Thanks. This model previously worked well when I analysed the association between continuous variables and DNA methylation. However, when I try to assess the relationship with categorical variables, the script appears to hang indefinitely on the high-performance computing (HPC) system. In contrast, the analysis with continuous variables took only about two hours to complete.

But I will try again!

How many levels does `status` have? Please type `table(status)` and show the output.

What you are describing is not a limma error, but an issue with the size of the dataset and your computational resources, for example running out of memory.
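A quick way to judge whether memory is the likely bottleneck is to estimate the size of the M-value matrix before fitting. The sketch below is only illustrative: the probe count is an approximate figure for the EPIC array and the sample size is a hypothetical placeholder, since neither was stated in this thread.

```r
# Rough memory footprint of a numeric matrix stored as doubles (8 bytes per value).
# n_probes and n_samples are illustrative assumptions, not values from this thread.
n_probes  <- 865000   # approximate number of probes on the EPIC array
n_samples <- 2000     # hypothetical sample size

bytes <- n_probes * n_samples * 8
gib   <- bytes / 1024^3
cat(sprintf("M-value matrix: about %.1f GiB\n", gib))

# With real data already loaded, the same figure comes from:
# print(object.size(mval), units = "GB")
```

Peak usage during fitting is typically higher than the input matrix alone, so it is worth requesting a comfortable margin above this estimate on the HPC system.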

Computation time in limma is determined by the size of the dataset (which you haven't described) and by the number of columns in the design matrix. For a given number of design matrix columns, computation time is unaffected by whether the original variables were continuous or categorical. The problem has nothing to do with categorical vs continuous variables but simply with the size of the fitted model.
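To see how many columns each kind of variable contributes, a small base R sketch may help. The data frame and its values are made up for illustration; only `model.matrix()` is used, so limma is not needed to run it.

```r
# Toy data: all variable names and values here are made up for illustration.
set.seed(1)
n <- 12
toy <- data.frame(
  age    = rnorm(n, mean = 50, sd = 10),                           # continuous
  status = factor(rep(c("control", "mild", "severe"), each = 4)),  # 3 levels
  sex    = factor(rep(c("F", "M"), times = 6))                     # 2 levels
)

design_cont <- model.matrix(~ age, data = toy)
design_cat  <- model.matrix(~ status + sex + age, data = toy)

ncol(design_cont)     # intercept + age
ncol(design_cat)      # intercept + 2 status columns + 1 sex column + age
colnames(design_cat)  # shows which columns belong to status
```

A factor with k levels adds k - 1 columns, so a many-level `status` enlarges the design matrix, but the fit is otherwise no different from one with the same number of continuous covariates.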

I note that you have posted several previous questions on this forum where you reported successfully fitting limma models to methylation data including categorical variables, so you must already know from your own experience that there's no particular problem with categorical variables. Your current model has several categorical variables, not just `status` but also `sex` and probably `smoking`.

Thanks, Gordon Smyth, it is sorted now. It was just a large sample size. The above commands work perfectly for association analysis with both continuous and categorical variables.