Giuseppe Gallone
Hi
I have a group of samples for which I'd like to ascertain if
differential binding is detectable based on a "condition" binary
variable (stored in DBA_CONDITION).
However, these samples have been processed in 4 batches (each batch has
at least 3 samples). I would like to run a multifactorial analysis to
regress out the batch effect first, and then analyse any remaining
variance across the DBA_CONDITION contrast of interest.
I understand it is possible to run such an analysis using blocking
factors in dba.contrast. Let's say I store the 4 batch labels in
DBA_TISSUE. The following:
data = dba.contrast(data, categories=DBA_CONDITION, block=DBA_TISSUE)
returns the following warning messages:
Warning messages:
1: Blocking factor invalid for all contrasts:
2: No blocking values are present in both groups
and data will not contain blocking factor information.
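One reading of the second warning is that no batch label has samples in both condition groups. A quick way to check this is to cross-tabulate batch against condition. The sketch below uses base R only and a made-up sample sheet (the `samples` data frame and its labels are hypothetical, not the real data):

```r
# Hypothetical sample sheet: 12 samples, 4 batches, binary condition.
samples <- data.frame(
  batch     = rep(c("BATCH_1", "BATCH_2", "BATCH_3", "BATCH_4"), each = 3),
  condition = rep(c("A", "B"), times = 6)
)

# Rows are batches, columns are condition groups.
tab <- table(samples$batch, samples$condition)
print(tab)

# Batches with at least one sample in both condition groups.
shared <- rownames(tab)[rowSums(tab > 0) == 2]
print(shared)
```

If `shared` comes back empty for the real sample sheet, batch and condition are fully confounded and no blocking approach can separate them.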
Am I wrong in thinking that multiple contrasts can be used for the
"block" argument? If I use only one contrast via mask (for example
BATCH_1 VS !BATCH_1) this works correctly:
data = dba.contrast(data, categories=DBA_CONDITION,
block=data$masks$BATCH_1)
however, it will only block variance due to this particular contrast,
not all of them.
One workaround, I suppose, is to run a differential analysis on all the
contrasts one wishes to block, and identify the one that produces the
highest number of variant sites:
data = dba.contrast(data, categories=DBA_TISSUE)
data = dba.analyze(data)
...
# pick the contrast with the highest number of variant sites, e.g. BATCH_4, then do:
data = dba.contrast(data, categories=DBA_CONDITION,
block=data$masks$BATCH_4)
However, I was still wondering if there is a way to model all the
variance due to the batch effects at once and then look at the residual
variance for the real analysis.
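To make "all the batch effects at once" concrete, the kind of model meant here is an additive design with one term per batch level plus the condition term. The sketch below builds such a design matrix in base R; it is not DiffBind-specific, the labels are made up, and whether dba.contrast can be given an equivalent design is exactly the open question:

```r
# Hypothetical labels: 4 batches of 3 samples, binary condition.
batch     <- factor(rep(paste0("BATCH_", 1:4), each = 3))
condition <- factor(rep(c("A", "B"), times = 6))

# Additive model: batch is absorbed by its own coefficients,
# and the last column tests condition B against condition A.
design <- model.matrix(~ batch + condition)
print(colnames(design))

# The design is only estimable if it has full column rank,
# i.e. condition is not confounded with batch.
print(qr(design)$rank == ncol(design))
```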
Thanks!
Giuseppe