Question

Addressing the batch effect in multifactor experiment

0

Entering edit mode

anikng ▴ 10

@anikng-22672

Last seen 3.3 years ago

I have a complex RNASeq data with 3 factors, genotype time and treatment type. Top part of the target file looks like,

Sample Geno Time Treatment
Sampl1  KK  2hr Control
Sampl2  CR  2hr Control
Sampl3  KK  2hr Control
Sampl4  CR  2hr Control
Sampl5  KK  2hr Stress1
Sampl6  CR  2hr Stress1
Sampl7  KK  4hr Stress1
Sampl8  CR  4hr Stress1
Sampl9  KK  4hr Stress2
Sampl10 CR  4hr Stress2
Sampl11 KK  4hr Stress2
Sampl12 CR  4hr Stress2
 ....

I analyzed data with edgeR.Grouped factors and estimated DEGs using makeContrasts between groups,

group<-factor(paste(targets$Geno,targets$Time,targets$Treatment,sep="."))
cbind(targets,Group=group)
y<-DGEList(counts = raw_counts, group = group)

 keep <-rowSums(cpm(y)>=2) >=2
 y <- y[keep, , keep.lib.sizes=FALSE]
 y<-calcNormFactors(y)
 design<-model.matrix(~0+group)
 colnames(design)<-levels(group)
 y<-estimateDisp(y,design)
 fit<-glmQLFit(y,design)

 Normalized_counts<-cpm(y)

Now I added another batch of data for 6hr and I want to include the batch effect in the analysis. After reading similar posts, I think one possible way would be to add a new 'batch' column to target file and address that in the design matrix. Like,

  Sample Geno Time Treatment Batch
  Sampl1    KK  2hr Control   B1
  Sampl2    CR  2hr Control   B1
  .
  Sampl60   CR  2hr Control   B2

And include the batch in the model and proceed as shown above,

 group<-factor(paste(targets$Geno,targets$Time,targets$Treatment, targets$Batch,sep="."))
 design<-model.matrix(~0+group+Batch)

Is that the correct method to adopt? Please note that I will make a pairwise comparison for DE analysis all those comparisons are within a time point. ie, I don't have plan to compare for ex, 2hr control vs 6hr stress.

edgeR BatchEffect RNASeq • 1.0k views

ADD COMMENT • link updated 3.4 years ago by SamGG ▴ 350 • written 3.4 years ago by anikng ▴ 10

score 2 · Accepted Answer · 2020-11-27

2

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

Yes, adding +Batch to the model is the correct general method to adjust for a batch effect.

ADD COMMENT • link 3.4 years ago Gordon Smyth 50k

0

Entering edit mode

I understand the design formula, but I don't understand why the group variable should include targets$Batch. If someone could explain me, I will appreciate.

ADD REPLY • link 3.4 years ago SamGG ▴ 350