Question

DESeq2: Paired multifactor test with batch effect - one or more variables in design formula are linear combinations of others

erin.gill81 ▴ 60

@eringill81-6831

Last seen 5.3 years ago

Canada

Hi, I'm analyzing an RNA-Seq data set and I would like to perform a paired multifactor test that incorporates batch effect. When I use the design formula

~ response + response:PID + response:timepoint

and then manually remove the all-zero columns from the model matrix, DESeq2 runs the analysis as expected. However, when I change the design to

~ response + response:PID + response:timepoint + batch

I get the familiar error message:

Error in checkFullRank(full) : 
 the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

This is what my colData(dds) looks like:

sample       PID  timepoint  batch  response
01-0603_on   P01  fiftysix   one    PD
01-0603_pre  P01  zero       one    PD
03-0601_on   P01  fiftysix   one    PR
03-0601_pre  P01  zero       one    PR
03-0606_on   P02  fiftysix   one    PR
03-0606_pre  P02  zero       one    PR
04-0605_on   P03  fiftysix   one    PR
04-0605_pre  P03  zero       one    PR
04-0632_on   P02  fiftysix   three  PD
04-0632_pre  P02  zero       three  PD
05-0614_on   P04  fiftysix   three  PR
05-0614_pre  P04  zero       three  PR
05-0621_on   P03  fiftysix   three  PD
05-0621_pre  P03  zero       three  PD
05-0624_on   P05  fiftysix   three  PR
05-0624_pre  P05  zero       three  PR
08-0631_on   P04  fiftysix   three  PD
08-0631_pre  P04  zero       three  PD

I realize that my samples aren't well balanced between batches (only one PD response in batch one), but was under the impression that I could still incorporate batch in the design formula in this case. Could you please tell me what I'm missing here and if there is a way to incorporate batch into my analysis?

Thanks!

deseq2 • 645 views

updated 5.3 years ago by Michael Love 43k • written 5.3 years ago by erin.gill81 ▴ 60

score 0 · Answer 1 · 2019-08-21

Michael Love 43k

@mikelove

Last seen 15 hours ago

United States

I can't easily see what's going on in the colData, but if you are comparing patients to their own baseline and the patients are nested within batch, you don't need to add batch. You have already controlled for batch by comparing each patient to their own baseline. Because batch is constant within each patient, the batch column is a linear combination of the patient terms, which is why the model matrix is no longer full rank. Likewise, when patients are compared to their own baseline, one can't add sex as a main effect, because it is already controlled for at the patient level, and patients are obviously nested within sex.
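To make the nesting concrete, here is a minimal base-R sketch that rebuilds the colData from the question and checks the rank of both model matrices. The names `coldata`, `drop_zero_cols`, `mm_paired`, `mm_batch`, and `nest` are made up for this illustration, and `drop_zero_cols` is only my guess at the manual clean-up step described in the question, not necessarily how it was actually done.

```r
# colData transcribed from the question: 9 patients, one "on" and one "pre" sample each
coldata <- data.frame(
  PID = factor(rep(c("P01", "P01", "P02", "P03", "P02",
                     "P04", "P03", "P05", "P04"), each = 2)),
  timepoint = factor(rep(c("fiftysix", "zero"), times = 9)),
  batch = factor(rep(c("one", "one", "one", "one", "three",
                       "three", "three", "three", "three"), each = 2)),
  response = factor(rep(c("PD", "PR", "PR", "PR", "PD",
                          "PR", "PD", "PR", "PD"), each = 2))
)

# Hypothetical helper: drop columns that are entirely zero
# (e.g. the PD:P05 column, since no PD patient is labelled P05)
drop_zero_cols <- function(m) m[, colSums(m != 0) > 0, drop = FALSE]

mm_paired <- drop_zero_cols(
  model.matrix(~ response + response:PID + response:timepoint, coldata))
mm_batch <- drop_zero_cols(
  model.matrix(~ response + response:PID + response:timepoint + batch, coldata))

# A matrix is full rank when its rank equals its number of columns
qr(mm_paired)$rank == ncol(mm_paired)  # TRUE: the paired design can be fit
qr(mm_batch)$rank == ncol(mm_batch)    # FALSE: batch adds a column but no new information

# Each patient (a response x PID cell) appears in exactly one batch,
# so batch is fully determined by the patient terms
nest <- with(coldata, table(interaction(response, PID, drop = TRUE), batch))
nest
```

Every row of `nest` has a single non-zero entry, which is what "patients nested within batch" means here: the within-patient comparison to baseline already absorbs any batch difference.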