Putative batch effect assessment and correction for downstream DE analysis with microarray dataset
svlachavas ▴ 780
Germany/Heidelberg/German Cancer Resear…

Dear Community,

I'm currently analyzing a dataset of human Affymetrix HTA 2.0 microarrays for a two-group statistical comparison (healthy subjects versus samples from patients with a chronic autoimmune disease).

After import, pre-processing and normalization, I created some EDA plots to assess any putative batch effects, because the healthy controls and the disease samples come from 3 different studies (all control samples belong to a single study/batch, and the disease samples are split across the other two). The links for the MDS plot and the hierarchical clustering (hc) dendrogram are the following:



(* for simplicity, the different colors in both plots represent the different origin/study, whereas the main condition/label is the Normal & SLE phenotype)
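For anyone wanting to reproduce this kind of check, here is a minimal base-R sketch of an MDS plot and a hierarchical clustering dendrogram coloured or labelled by study. It runs on simulated data: the matrix `expr` is only a hypothetical stand-in for `exprs(eset.rma)`, and the sample sizes, colours and the "average" linkage are assumptions, not what was used for the plots above.

```r
# Simulated stand-in for exprs(eset.rma): 200 probesets x 18 samples (hypothetical data)
set.seed(1)
study <- factor(rep(c("Additional HCs", "ILLUMINATE-1", "ILLUMINATE-2"), each = 6))
expr <- matrix(rnorm(200 * 18), nrow = 200,
               dimnames = list(paste0("probe", 1:200), paste0("s", 1:18)))

# Sample-to-sample Euclidean distances (samples are columns, hence the transpose)
d <- dist(t(expr))

# Classical MDS in two dimensions, one point per sample, coloured by study
mds <- cmdscale(d, k = 2)
study_cols <- c("black", "red", "blue")
plot(mds, col = study_cols[as.integer(study)], pch = 19,
     xlab = "MDS 1", ylab = "MDS 2")
legend("topright", legend = levels(study), col = study_cols, pch = 19)

# Hierarchical clustering of the same distances, leaves labelled by study
hc <- hclust(d, method = "average")
plot(hc, labels = as.character(study), cex = 0.7)
```

With real data, samples grouping by study rather than by phenotype in either plot would be the pattern to look for.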

So, from an initial inspection of the above 2 plots, there does not seem to be any severe batch effect related to the origin/study (Additional HCs = control samples; SLE = ILLUMINATE-1 & ILLUMINATE-2) that would call for an aggressive correction. However, to be safe for any downstream statistical comparison with limma, should I simply include the batch information in my linear model, in order to take it into account?

Or, due to the following:


Additional HCs   ILLUMINATE-1   ILLUMINATE-2
            30             74             76

group <- pData(eset.rma)$characteristics_ch1.2.group # main variable for downstream DE comparison

Normal    SLE
    30    150

comb <- paste0(pData(eset.rma)$characteristics_ch1.2.group,


Normal_Additional HCs      SLE_ILLUMINATE-1      SLE_ILLUMINATE-2 
                   30                    74                    76 

because the "batches" differ in size, is it generally inadvisable to include a batch adjustment in the design matrix at all?
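To make the layout above concrete, here is a small base-R sketch that rebuilds the two factors from the counts shown and checks whether a design with both group and study is estimable. The factor vectors are reconstructed from the tables only (the real ones would come from `pData(eset.rma)`), so treat the object names as assumptions.

```r
# Rebuild group and study from the counts in the tables above (30 / 74 / 76)
n <- c(30, 74, 76)
group <- factor(rep(c("Normal", "SLE", "SLE"), times = n))
study <- factor(rep(c("Additional HCs", "ILLUMINATE-1", "ILLUMINATE-2"), times = n))

# Cross-tabulation: every study contains exactly one phenotype
tab <- table(group, study)
print(tab)

# Additive design with both factors: intercept + group + study
design <- model.matrix(~ group + study)

# Compare the number of columns with the numerical rank of the matrix
design_rank <- qr(design)$rank
cat("columns:", ncol(design), " rank:", design_rank, "\n")

# Fewer independent columns than coefficients means group and study
# cannot both be estimated from this layout
is_full_rank <- design_rank == ncol(design)
```

Here the SLE indicator equals the sum of the two ILLUMINATE indicators, so the matrix has 4 columns but rank 3. The unequal batch sizes are not the issue in this check; the nesting of study within phenotype is.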

Or overall, despite not seeing a strong batch effect in the initial plots above, is there possible confounding of my batch levels with my condition of interest, so that some batch effect correction, such as ComBat, should be applied?
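One possible way to frame the comparison, sketched here under assumptions rather than offered as the recommended analysis, is a group-means design on the combined factor with hand-written contrasts. The level names (`Normal_HC`, `SLE_ILL1`, `SLE_ILL2`) are hypothetical shortened versions of the `comb` levels, `expr` is simulated, and the limma calls only run if limma is installed.

```r
# Combined phenotype/study factor with syntactically valid, hypothetical level names
comb <- factor(rep(c("Normal_HC", "SLE_ILL1", "SLE_ILL2"), times = c(30, 74, 76)))

# Group-means parameterization: one coefficient per combined level
design <- model.matrix(~ 0 + comb)
colnames(design) <- levels(comb)

# Contrasts written by hand:
#  - SLE_vs_Normal: average of the two SLE studies minus the controls
#  - ILL2_vs_ILL1: difference between the two SLE studies (study effect within SLE)
contr <- cbind(SLE_vs_Normal = c(-1, 0.5, 0.5),
               ILL2_vs_ILL1  = c(0, -1, 1))
rownames(contr) <- colnames(design)

# Simulated expression matrix standing in for exprs(eset.rma) (hypothetical data)
set.seed(2)
expr <- matrix(rnorm(100 * 180), nrow = 100,
               dimnames = list(paste0("probe", 1:100), NULL))

# Fit with limma only if the package is available
if (requireNamespace("limma", quietly = TRUE)) {
  fit  <- limma::lmFit(expr, design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contr))
  print(limma::topTable(fit2, coef = "SLE_vs_Normal", number = 5))
}
```

The `ILL2_vs_ILL1` contrast gives a rough idea of how large study-to-study differences are among the SLE samples. The `SLE_vs_Normal` contrast still cannot separate disease from study for the controls, because all controls come from a single study.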

Thank you in advance,


hta2.0 limma batch effect affymetrix microarrays ComBat • 832 views
