DiffBind: blocking factors in a contrast that does not include all samples (bug?)
Jon Manning ▴ 90
@jon-manning-5708

Hi all,

I have a fairly complex ChIP-seq experiment with many treated/untreated groups, 2 replicates each. All the 'replicate 1' samples were generated together, separately from the 'replicate 2' samples, and I can see that this batch effect is quite pronounced in the final matrices, to the extent that it's stronger than my treatment effects. I'd like to use DiffBind's blocking function to account for it.

The contrasts I wish to conduct each involve only a subset of the data, which works fine without blocking. Say 'diffbind_count' is a DBA object on which I've run the counting step; the regular contrast works fine:

> unname(diffbind_count$masks[['group2']])
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE
> unname(diffbind_count$masks[['group1']])
 [1]  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE

> cont_db <- dba.contrast(diffbind_count, diffbind_count$masks$group2, diffbind_count$masks$group1, 'group2', 'group1')

But if I then try to add a blocking factor, I get a warning:

> batches
[[1]]
 [1]  1  2  3  4  9 10 11 16 17 18 19 24 25 26 27

[[2]]
 [1]  5  6  7  8 12 13 14 15 20 21 22 23 28 29 30
> cont_db <- dba.contrast(diffbind_count, diffbind_count$masks$group2, diffbind_count$masks$group1, 'group2', 'group1', block = batches)

Warning message:
In unique + (contrast$group1 & att$samples) :
  longer object length is not a multiple of shorter object length

I did a bit of digging, and it seems that, in the pv.checkBlock() function, the initial setting of 'unique' in the indicated code segment creates a vector of the same length as the number of samples in the contrast:

unique <- rep(0,sum(contrast$group1)+sum(contrast$group2))

But the subsequent steps that loop over the blocking list use 'group1' and 'group2' from the contrast, which have the same length as the total number of samples:

for(att in contrast$blocklist) {
  unique <- unique + (contrast$group1 & att$samples)
  unique <- unique + (contrast$group2 & att$samples)
}

... which produces the above warning when it tries to add a vector of length 4 to a much longer one (length 30, one entry per sample).
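To illustrate what I think is going on, here is a minimal base-R sketch that does not touch DiffBind at all. The masks and batch indices are reconstructed from the output above; the variable names (group1, group2, batch_masks, short, counts) are made up for the example, and the second half is only my guess at what a length-consistent check could look like, not DiffBind's actual code.

```r
# Sizes and masks reconstructed from the post: 30 samples, 2 per group.
n_samples <- 30
idx <- seq_len(n_samples)
group1 <- idx %in% c(1, 5)
group2 <- idx %in% c(2, 6)

batches <- list(
  c(1, 2, 3, 4, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27),
  c(5, 6, 7, 8, 12, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30)
)
# One logical mask per batch, each of length n_samples.
batch_masks <- lapply(batches, function(b) idx %in% b)

# Accumulator sized to the contrast (length 4), as in the quoted snippet.
short <- rep(0, sum(group1) + sum(group2))

# Adding a length-30 vector to a length-4 vector recycles the shorter one;
# 30 is not a multiple of 4, so R emits a warning, which is captured here.
caught <- NULL
res <- withCallingHandlers(
  short + (group1 & batch_masks[[1]]),
  warning = function(w) {
    caught <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)
print(caught)

# Hypothetical length-consistent variant: size the accumulator to all
# samples, then inspect only the samples that belong to the contrast.
counts <- rep(0, n_samples)
for (m in batch_masks) {
  counts <- counts + (group1 & m) + (group2 & m)
}
in_contrast <- group1 | group2
print(counts[in_contrast])
```

In this sketch every contrast sample ends up counted exactly once across the two batches, which is the kind of consistency check I would have expected the loop to perform.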

Is this a bug, or have I specified something incorrectly?

Is the workaround to subset the DBA object to just the samples involved in the contrast before calling dba.contrast()? For example:

cont_db <- dba(diffbind_count, mask = diffbind_count$masks$group2 | diffbind_count$masks$group1)
cont_db$samples <- cont_db$samples[rownames(cont_db$samples) %in% colnames(cont_db$vectors),]

This seems to work, but I'm not sure whether it has any ramifications, e.g. the batch covariate no longer being fitted across the entire dataset.
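To make that concern concrete, here is a small base-R sketch (again not DiffBind code; the group and batch assignments are reconstructed from the masks and the 'batches' list above, and the variable names are invented) that tabulates group against batch for just the four samples left after subsetting.

```r
n_samples <- 30
batches <- list(
  c(1, 2, 3, 4, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27),
  c(5, 6, 7, 8, 12, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30)
)

# Turn the list of sample indices into one batch label per sample.
batch <- integer(n_samples)
for (i in seq_along(batches)) {
  batch[batches[[i]]] <- i
}

# Group labels for the contrast samples only; all others stay NA.
group <- rep(NA_character_, n_samples)
group[c(1, 5)] <- "group1"
group[c(2, 6)] <- "group2"
in_contrast <- !is.na(group)

# Cross-tabulate group and batch within the subset.
design_tab <- table(group = group[in_contrast], batch = batch[in_contrast])
print(design_tab)
```

Each group has one sample in each batch, so within the subset the blocking factor is not confounded with the contrast. Presumably, though, if the model is fitted on the subset alone, any batch term would rest on these four samples rather than on all 30.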

Thanks for any pointers,

Jon

 
