Based on a recent previous post regarding putative batch effect consideration and possible correction for downstream analysis (**Putative batch effect assessment and correction for downstream DE analysis with microarray dataset**), i show a specific part of my code regarding a statistical downstream analysis with limma in R, for an affymetrix microarray (after normalization and filtering, without batch effect correction):

**group <- eset.rma$characteristics_ch1.2.group** # condition/variable of interest

**batch <- eset.rma$characteristics_ch1.3.batch** # the study that each sample originates

`levels(group)`

`[1] "Normal" "SLE"`

`table(group)`

`group`

`Normal SLE `

` 30 150`

`table(batch)`

`batch`

`Additional HCs ILLUMINATE-1 ILLUMINATE-2 `

` 30 74 76`

`table(comb)`

`comb`

`Normal_Additional HCs SLE_ILLUMINATE-1 SLE_ILLUMINATE-2 `

` 30 74 76`

`design <- model.matrix(~group + batch)`

`fit <- lmFit(eset.filtered,design)`

`Coefficients not estimable: `

**batchILLUMINATE-2**

`Warning message:`

`Partial NA coefficients for 28982 probe(s) `

`head(design)`

` (Intercept) groupSLE batchILLUMINATE-1 batchILLUMINATE-2`

`1 1 0 0 0`

`2 1 0 0 0`

`3 1 0 0 0`

`4 1 0 0 0`

`5 1 0 0 0`

`6 1 0 0 0`

`fit2 <- eBayes(fit,robust=TRUE,trend=TRUE)`

`Warning messages:`

`1: In out$var.prior[is.na(out$var.prior)] <- 1/out$s2.prior :`

` number of items to replace is not a multiple of replacement length`

`2: In ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, :`

` Estimation of var.prior failed - set to default value`

`head(fit2$coefficients)`

` (Intercept) groupSLE batchILLUMINATE-1`

`TC01000001.hg.1 8.544837 -0.74892092 0.01868471`

`TC01000004.hg.1 5.747258 0.38620409 -0.29748385`

`TC01000005.hg.1 11.016003 -0.19336129 -0.19723122`

`TC01000006.hg.1 12.578547 -0.05337643 -0.04274937`

`TC01000007.hg.1 13.000369 -0.15541910 -0.09165933`

`TC01000008.hg.1 8.235910 0.04377552 -0.27044804`

` batchILLUMINATE-2`

`TC01000001.hg.1 NA`

`TC01000004.hg.1 NA`

`TC01000005.hg.1 NA`

`TC01000006.hg.1 NA`

`TC01000007.hg.1 NA`

`TC01000008.hg.1 NA`

Overall, i can understand from the above warnings, that due to the possible confounding of my variable of interest –**group with the batch variable** ,there are possibly no available degrees of freedom to compute this specific coef. However, regarding my main goal, that is the **groupSLE coef/comparison**, can i "trust" my above results, when i include the batch variable in the design matrix, **even the relative coef batchILLUMINATE-2** is not estimable ?

(*My final notion for including the batch, is despite the fact that from the plots in my previous post mentioned there is no strong signal for batch effect[-link above], however, i would like to take into account the information about the origin of study in each sample-in order, to perhaps avoid possible cased of DE genes that would be due to batch differences in expression).

Any help/suggestion would be much appreciated !!!