DESeq2: Error when correcting batch effect
Rimma • 0
@rimma-21441

Hello, I'm struggling with batch correction for RNA-seq data in DESeq2. For example, my colData looks like this (10 samples: 6 controls and 4 treated, spread over 2 batches):

samples   condition   batch
100       PH7         1
101       PH7         1
103       PH7         1
63        PH7         1
64        ctr         1
74        ctr         1
75        ctr         1
76        ctr         2
88        ctr         2
99        ctr         2

As far as I understand from this post, my problem is that some conditions belong to only one batch; for example, all "PH7" samples are in batch 1. I tried what was suggested in that post:

mm = model.matrix(~ batch + condition, colData(dds))

I then looked for columns that are all zeros, but I don't have any: every column has a 1 in at least one row.

Is there a way to run this analysis?

@mikelove

You can just use ~batch + condition here. What is the error?


I tried that, and it gives this error:

  Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

I don't get that error when I run this design and this column data. Maybe check your code?

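library(DESeq2)
# simulate 10 samples with the same batch/condition layout as the posted colData
dds <- makeExampleDESeqDataSet(m=10)
dds$batch <- factor(rep(1:2,c(7,3)))      # 7 samples in batch 1, 3 in batch 2
dds$condition <- factor(rep(2:1,c(4,6)))  # 4 treated (2), then 6 controls (1)
design(dds) <- ~ batch + condition
dds <- DESeq(dds)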
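
One way to check a design before calling DESeq() is to build the model matrix in base R and compare its rank with its number of columns. The following is only a minimal sketch for the simplified 10-sample layout; the `coldata` data frame and the `is_full_rank` variable are hypothetical stand-ins, not part of DESeq2.

```r
# Hypothetical stand-in for colData(dds): the simplified 10-sample layout
coldata <- data.frame(
  batch     = factor(rep(1:2, c(7, 3))),   # 7 samples in batch 1, 3 in batch 2
  condition = factor(rep(2:1, c(4, 6)))    # 4 treated (2), then 6 controls (1)
)

# Same design formula as above, expanded into a numeric matrix
mm <- model.matrix(~ batch + condition, coldata)

# The design is full rank when the rank equals the number of coefficient columns
is_full_rank <- qr(mm)$rank == ncol(mm)
is_full_rank
```

If `is_full_rank` is TRUE for a given colData, the "not full rank" error should not be triggered by that design.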

Thank you for the reply, Michael!

I simplified the colData a bit for the post. Does it change anything if my actual colData looks like this? The main difference I see is that the third batch contains only a condition that does not appear in any other batch:

samples   condition   batch
100       PH7         1
101       PH7         1
103       PH7         1
63        PH7         1
64        ctr         1
74        ctr         1
75        ctr         1
76        ctr         2
88        ctr         2
99        ctr         2
11        hbls        3
12        hbls        3
13        hbls        3

Otherwise, my code looks fine to me, but I will recheck it.


Yes, it makes a difference. This is why it's good to describe your actual data, so we don't go back and forth while talking about different datasets.

In your actual dataset, you can't control for batch effects because batch 3 is completely confounded with the hbls condition: every batch 3 sample is hbls, and every hbls sample is in batch 3. This means that your results cannot be trusted entirely, regardless of what statistical method you use, because you can't tell batch 3 apart from that condition.
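
The confounding can be seen directly in the model matrix. This is a minimal base R sketch assuming the 13-sample layout posted above; `coldata3` and `mm3` are hypothetical names used only for illustration. It also shows why searching for all-zero columns finds nothing: the problem here is two identical columns, not an empty one.

```r
# Hypothetical stand-in for the actual 13-sample colData
coldata3 <- data.frame(
  condition = factor(c(rep("PH7", 4), rep("ctr", 6), rep("hbls", 3)),
                     levels = c("ctr", "PH7", "hbls")),
  batch     = factor(c(rep(1, 7), rep(2, 3), rep(3, 3)))
)

# Cross-tabulation: hbls occurs only in batch 3, and batch 3 holds only hbls
table(coldata3$batch, coldata3$condition)

mm3 <- model.matrix(~ batch + condition, coldata3)

ncol(mm3)       # 5 coefficient columns
qr(mm3)$rank    # 4, so the matrix is not full rank

# No column is all zeros ...
any(colSums(mm3 != 0) == 0)                      # FALSE
# ... but the batch 3 and hbls indicator columns are identical
all(mm3[, "batch3"] == mm3[, "conditionhbls"])   # TRUE
```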

While this doesn't solve that particular problem, my preferred approach to deal with the two batches within control at this point would be to use SVA to capture heterogeneity that is orthogonal to the condition. We have example code in the workflow on how to do this.
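
A rough sketch of what the SVA approach might look like is given below. It assumes the `sva` package is installed, uses simulated counts from makeExampleDESeqDataSet() in place of the real data, and picks two surrogate variables arbitrarily; the filtering threshold and the names `dat`, `mod`, `mod0`, `svseq` and `ddssva` are illustrative choices, so treat this as a starting point rather than the definitive recipe.

```r
library(DESeq2)
library(sva)

# Simulated stand-in for the real dataset (assumption: 13 samples, 3 conditions)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 13)
dds$condition <- factor(c(rep("PH7", 4), rep("ctr", 6), rep("hbls", 3)),
                        levels = c("ctr", "PH7", "hbls"))

# Normalized counts, keeping genes with some minimal expression
dds <- estimateSizeFactors(dds)
dat <- counts(dds, normalized = TRUE)
dat <- dat[rowMeans(dat) > 1, ]

# Full model contains the condition; the null model is intercept only
mod  <- model.matrix(~ condition, colData(dds))
mod0 <- model.matrix(~ 1, colData(dds))

# Estimate two surrogate variables (n.sv = 2 is an arbitrary choice here)
svseq <- svaseq(dat, mod, mod0, n.sv = 2)

# Add the surrogate variables to the design in place of the batch term
ddssva <- dds
ddssva$SV1 <- svseq$sv[, 1]
ddssva$SV2 <- svseq$sv[, 2]
design(ddssva) <- ~ SV1 + SV2 + condition
```

After this, DESeq(ddssva) would fit the model with the surrogate variables as covariates. Note that this does not remove the confounding between batch 3 and hbls.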


Sorry about that.

Yes, I understand the problem now.

Thank you for the clarification :)
