Question: Diffbind: blocking factors in a contrast not including all samples (bug?)
3.3 years ago by Jon Manning

Hi all,

I have a fairly complex ChIP-seq experiment with many treated/untreated groups, each with 2 replicates. All the 'replicate 1' samples were generated together, separately from the 'replicate 2' samples, and this batch effect is pronounced in the final matrices, to the extent that it is stronger than my treatment effects. I'd like to use DiffBind's blocking function to account for it.

The contrasts I wish to run each involve only a subset of the data, which works fine without blocking. Say 'diffbind_count' is a DBA object on which I've run the counting step; the regular contrast then works:

> unname(diffbind_count$masks[['group2']])
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE
> unname(diffbind_count$masks[['group1']])
 [1]  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE

> cont_db <- dba.contrast(diffbind_count, diffbind_count$masks$group2, diffbind_count$masks$group1, 'group2', 'group1')

But if I then try to add a blocking factor, I get a warning:

> batches
[[1]]
 [1]  1  2  3  4  9 10 11 16 17 18 19 24 25 26 27

[[2]]
 [1]  5  6  7  8 12 13 14 15 20 21 22 23 28 29 30
> cont_db <- dba.contrast(diffbind_count, diffbind_count$masks$group2, diffbind_count$masks$group1, 'group2', 'group1', block = batches)

Warning message:
In unique + (contrast$group1 & att$samples) :
  longer object length is not a multiple of shorter object length
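As a sanity check on the design itself, the following base R sketch cross-tabulates group against batch for the four samples in the contrast. The sample indices are read off the masks and the 'batches' list shown above; the variable names (batch, in_contrast, design_table) are introduced here for illustration only and are not part of DiffBind.

```r
# Rebuild the masks and batch list from the console output above (30 samples).
n_samples <- 30
group1 <- seq_len(n_samples) %in% c(1, 5)
group2 <- seq_len(n_samples) %in% c(2, 6)
batches <- list(
  c(1, 2, 3, 4, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27),
  c(5, 6, 7, 8, 12, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30)
)

# Turn the list of index vectors into one batch label per sample.
batch <- integer(n_samples)
for (b in seq_along(batches)) batch[batches[[b]]] <- b

# Cross-tabulate group against batch, restricted to the contrast samples.
in_contrast <- group1 | group2
group <- ifelse(group1, "group1", "group2")[in_contrast]
design_table <- table(group = group, batch = batch[in_contrast])
print(design_table)
```

If every cell of the table is non-zero, each group has a sample in each batch, so the blocking factor is at least estimable within the contrast.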

I did a bit of digging, and it seems that, in the pv.checkBlock() function, the initial setting of 'unique' in the indicated code segment creates a vector of the same length as the number of samples in the contrast:

unique <- rep(0, sum(contrast$group1) + sum(contrast$group2))

But then subsequent calls going through the blocking list use 'group1' from the contrast etc, which are the same length as the total number of samples:

for (att in contrast$blocklist) {
  unique <- unique + (contrast$group1 & att$samples)
  unique <- unique + (contrast$group2 & att$samples)
}

... which produces the above warning when it tries to add a vector of length 4 to one of length 30.
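The length mismatch can be reproduced in base R without DiffBind. This is a minimal sketch, assuming that att$samples is a logical mask over all 30 samples (I have not confirmed its actual form); block_mask, counter and counter_full are names made up for the illustration.

```r
n_samples <- 30
group1 <- seq_len(n_samples) %in% c(1, 5)
group2 <- seq_len(n_samples) %in% c(2, 6)
# Assumed stand-in for att$samples: a logical mask for the first batch.
block_mask <- seq_len(n_samples) %in%
  c(1, 2, 3, 4, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27)

# Counter sized by the number of contrast samples (length 4), as in the snippet above.
counter <- rep(0, sum(group1) + sum(group2))

# Adding a length-30 vector to a length-4 vector recycles the shorter one;
# since 30 is not a multiple of 4, R raises a warning, captured here as text.
msg <- tryCatch(
  {
    counter + (group1 & block_mask)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
print(msg)

# For comparison only: a counter sized by the total number of samples adds cleanly.
counter_full <- rep(0, n_samples)
counter_full <- counter_full + (group1 & block_mask) + (group2 & block_mask)
print(which(counter_full > 0))
```

The full-length counter is shown only to make the mismatch visible; it is not a proposed patch to pv.checkBlock().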

Is this a bug, or have I specified something incorrectly?

Is the workaround to subset the DBA object to just the samples involved in the contrast before calling dba.contrast()?

cont_db <- dba(diffbind_count, mask = diffbind_count$masks$group2 | diffbind_count$masks$group1)
cont_db$samples <- cont_db$samples[rownames(cont_db$samples) %in% colnames(cont_db$vectors),]

This seems to work, but I'm not sure whether it has any ramifications, e.g. the batch covariate no longer being fitted across the entire dataset.
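One thing to watch with the subsetting route: the indices in 'batches' refer to positions in the full 30-sample object, so they would presumably need remapping to positions in the 4-sample subset before being passed as a blocking list. A base R sketch of that remapping follows; 'keep' and 'batches_sub' are hypothetical names, and whether the subsetted object preserves the original sample order is an assumption.

```r
n_samples <- 30
group1 <- seq_len(n_samples) %in% c(1, 5)
group2 <- seq_len(n_samples) %in% c(2, 6)
batches <- list(
  c(1, 2, 3, 4, 9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27),
  c(5, 6, 7, 8, 12, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30)
)

# Original positions of the samples that survive the subset
# (assumes the subset keeps samples in their original order).
keep <- which(group1 | group2)

# For each batch, keep only the surviving samples and translate their
# original positions into positions within the subset.
batches_sub <- lapply(batches, function(idx) match(intersect(idx, keep), keep))
print(batches_sub)
```

The remapped list could then be checked against the subsetted sample sheet before using it for blocking.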

Thanks for any pointers,

Jon

 
