Search
Question: Plotting Gene Expression Data From Multiple Batches
1
12 months ago by
thyagoleal20
thyagoleal20 wrote:

I'm currently analyzing 4 microarray experiments in an integrated manner. For this, I'm using limma accounting for the batches through the design matrix:

design <- model.matrix(~0+Class+Batch, data = phenot)

Since removing the batch effects, for example, with Combat is not advisable before DEG analysis through limma. Can I correct the batch effects and then plot the values? For example, I've found 10 TOP DEG, so I want to do a boxplot with those 10 genes from the combined data sets. The problem is, doing this using the combined dataset without accounting for the batch will introduce cluster in the plot. My idea is to remove the batch effects and the plot these values. Is this feasible? I'm changing the values, so I'm not very comfortable in doing this.

Thanks.

modified 12 months ago by Jenny Drnevich1.9k • written 12 months ago by thyagoleal20
1
12 months ago by
Jenny Drnevich1.9k
United States
Jenny Drnevich1.9k wrote:

Use the removeBatchEffect() function to get data values suitable for visualization without the batch effects. Say you do your modeling like this where Data is an EList :

fit <- lmFit(Data, design)

To remove your batch effects, do:

design2 <-  model.matrix(~0+Class, data = phenot)

no.batch <- removeBatchEffect(Data, design = design, batch = Batch)

You can then use no.batch in any plotting, heatmap, clustering, etc.

Best,
Jenny