Although this kind of question has probably been asked before, I'm posting here because I couldn't find an exact answer. We're facing the same kind of situation as the one discussed in the vignette.

One of our groups shows high variability, and we eventually learned that this group was prepared in different batches. Say we have multiple groups (A, B, and C) with 2 conditions (X, Y) and 3 replicates each. We'd like to list DEGs between conditions X and Y within each group, and plan to do some post hoc analysis. Groups A and B came from the same batch and cluster reasonably. However, group C shows high variability across conditions, and we learned that it was prepared differently, say in batches 2 and 3.


A - 1 - X - batch1

A - 2 - X - batch1


B - 2 - Y - batch1

B - 3 - Y - batch1

C - 1 - X - batch2

C - 2 - X - batch3

C - 3 - X - batch3

C - 1 - Y - batch3

C - 2 - Y - batch2

C - 3 - Y - batch3

We excluded C from the analysis of A and B, as explained in the vignette. The results look reasonable.

When analyzing C, what do you think is the best possible approach?

  1. Analyze group C on its own (e.g. ~ batch + condition), without A and B.
  2. Analyze A, B, and C all together (which means giving up on modeling batch...), and then extract DEGs for group C only.

The condition effect across groups is not of current interest, so analyzing C separately could be better (though not ideal).

Please advise, experts!

If group C has batches with considerable variability, I would recommend what you've done for A and B, and then option (1) for group C.
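One way to see why option (1) works while batch cannot simply be added to a combined model is to check the rank of the design matrices. The following is only a minimal sketch in plain Python, not DESeq2 code: the group C rows are copied from the sample list in the question, the A and B rows assume 3 replicates per condition all in batch 1 as described, and the helper names (`matrix_rank`, `design_group_c`, `design_combined`) are made up for illustration. The combined design here is assumed to be ~ group + batch + condition.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][col] != 0:
                f = m[i][col] / m[rank][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

# (group, condition, batch); group C rows as listed in the question.
group_c = [("C", "X", 2), ("C", "X", 3), ("C", "X", 3),
           ("C", "Y", 3), ("C", "Y", 2), ("C", "Y", 3)]
# Assumption: A and B have 3 replicates per condition, all in batch 1.
groups_ab = [(g, c, 1) for g in "AB" for c in "XY" for _ in range(3)]

def design_group_c(samples):
    # ~ batch + condition within C: intercept, batch3, conditionY
    return [[1, int(b == 3), int(c == "Y")] for _, c, b in samples]

def design_combined(samples):
    # ~ group + batch + condition: intercept, groupB, groupC, batch2, batch3, conditionY
    return [[1, int(g == "B"), int(g == "C"), int(b == 2), int(b == 3), int(c == "Y")]
            for g, c, b in samples]

rank_c = matrix_rank(design_group_c(group_c))
rank_all = matrix_rank(design_combined(groups_ab + group_c))

print("group C only:", rank_c, "of 3 columns")
print("A, B, C combined:", rank_all, "of 6 columns")
```

Under these assumptions the group C design is full rank (3 of 3 columns), so the condition effect can be estimated while adjusting for batch. The combined design is rank deficient (5 of 6 columns) because the groupC indicator equals batch2 + batch3, which is why batch would have to be dropped in option (2).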


