Without seeing your sample sheet or script, I can't give precise answers. I'll assume that you have both Conditions (WT and Mutant) for a single Tissue in each of two separate DBA objects, T1 and T2. (You can do this with both Tissues in a single object, but the model is a bit more complex.)
Addressing your third question first:
Sometimes 'gain' is gain compared to WT, and sometimes based on the
mutant. So I don't know how the program chose which one is a control
and a comparison.
How are you setting up the contrast(s)? You can control which sample group is the reference group using dba.contrast(). You can use the reorderMeta parameter to establish the reference value:
T1 <- dba.contrast(T1, reorderMeta=list(Condition="WT"))
T1 <- dba.contrast(T1)
Then intervals with stronger binding in the Mutant will be Gain sites, and those with stronger binding in the WT will be Loss sites. Alternatively, you can set up the contrast explicitly:
T1 <- dba.contrast(T1, contrast=c("Condition", "Mutant", "WT"))
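To make the sign convention concrete, here is a minimal base R sketch. It assumes the fold change behaves like a plain log2 ratio with Mutant as the numerator and WT as the reference; DiffBind's reported Fold is estimated by the underlying analysis method and will not equal this simple ratio. The counts data frame and its values are hypothetical, made up purely for illustration.

```r
# Toy per-site binding values (hypothetical numbers, not real data)
counts <- data.frame(
  site   = c("site1", "site2", "site3"),
  WT     = c(100, 400, 50),
  Mutant = c(400, 100, 50)
)

# Mutant is the numerator; WT is the reference (denominator)
counts$Fold <- log2(counts$Mutant / counts$WT)

# Positive fold: stronger in Mutant (Gain); negative: stronger in WT (Loss)
counts$Direction <- ifelse(counts$Fold > 0, "Gain",
                    ifelse(counts$Fold < 0, "Loss", "Unchanged"))

print(counts)
```

If the reference group were swapped, every Fold value would change sign and the Gain and Loss labels would trade places, which is why it matters which group the contrast treats as the reference.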
Addressing your other questions:
You can control exactly which sites and samples are included, and in what order, by passing them explicitly to dba.plotProfile().
The samples parameter takes a specification of which samples to include and how to group them. The numbers are the sample numbers shown when you print out the DBA object. For example, using the sample data:
> data(tamoxifen_analysis)
> tamoxifen
11 Samples, 2845 sites in matrix:
ID Tissue Factor Condition Treatment Replicate Reads FRiP
1 BT4741 BT474 ER Resistant Full-Media 1 652697 0.16
2 BT4742 BT474 ER Resistant Full-Media 2 663370 0.15
3 MCF71 MCF7 ER Responsive Full-Media 1 346429 0.31
4 MCF72 MCF7 ER Responsive Full-Media 2 368052 0.19
5 MCF73 MCF7 ER Responsive Full-Media 3 466273 0.25
6 T47D1 T47D ER Responsive Full-Media 1 399879 0.11
7 T47D2 T47D ER Responsive Full-Media 2 1475415 0.06
8 MCF7r1 MCF7 ER Resistant Full-Media 1 616630 0.22
9 MCF7r2 MCF7 ER Resistant Full-Media 2 593224 0.14
10 ZR751 ZR75 ER Responsive Full-Media 1 706836 0.33
11 ZR752 ZR75 ER Responsive Full-Media 2 2575408 0.22
Design: [~Tissue + Condition] | 1 Contrast:
Factor Group Samples Group2 Samples2 DB.DESeq2
The Responsive MCF7 samples are 3:5. It is often easier to use the built-in sample masks. For example, these same samples could be referenced as tamoxifen$masks$MCF7 & tamoxifen$masks$Responsive. The samples will be plotted in the order you specify them (possibly merged). Note that the labels you use in the specification don't really matter (right now), except that they need to be unique.
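As a rough illustration of why the mask expression and 3:5 pick out the same samples, the sketch below rebuilds the Tissue and Condition columns from the printout above as a plain data frame and combines two logical vectors. The samples data frame and the mcf7 and responsive vectors are stand-ins written for this example; they only mimic what tamoxifen$masks holds and are not the DiffBind object itself.

```r
# Stand-in for the sample sheet printed above (ID, Tissue, Condition only)
samples <- data.frame(
  ID = c("BT4741", "BT4742", "MCF71", "MCF72", "MCF73", "T47D1",
         "T47D2", "MCF7r1", "MCF7r2", "ZR751", "ZR752"),
  Tissue = c("BT474", "BT474", "MCF7", "MCF7", "MCF7", "T47D",
             "T47D", "MCF7", "MCF7", "ZR75", "ZR75"),
  Condition = c("Resistant", "Resistant", "Responsive", "Responsive",
                "Responsive", "Responsive", "Responsive", "Resistant",
                "Resistant", "Responsive", "Responsive")
)

# Logical vectors playing the role of the MCF7 and Responsive masks
mcf7       <- samples$Tissue == "MCF7"
responsive <- samples$Condition == "Responsive"

# Element-wise AND keeps only samples that satisfy both conditions
selected <- which(mcf7 & responsive)
print(selected)
```

The same pattern extends to other combinations, for example using | to take the union of two groups.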
The sites parameter takes a list of groups of sites you want to include. You can specify these using GRanges objects, or a GRangesList object if you want multiple sets of sites. For example, the dba.report() function returns a GRanges object, so you can use a report to pick out the sites you want in each group. Then the groups are plotted in the order they appear in the GRangesList, top to bottom. If you name each GRanges object in the GRangesList, the name will be used as a label.
To put this all together, consider an example using the sample data:
tamoxifen$config$RunParallel <- TRUE
report <- dba.report(tamoxifen)
gain <- report[report$Fold > 0,]
loss <- report[report$Fold < 0,]
profile <- dba.plotProfile(tamoxifen,
                           samples=list(group1=tamoxifen$masks$Resistant,
                                        group2=tamoxifen$masks$Responsive),
                           sites=GRangesList(Gain=gain, Loss=loss),
                           merge=c(DBA_TISSUE, DBA_REPLICATE))
dba.plotProfile(profile)
Thank you! This definitely solves my problems.