DiffBind: reverse absolute fold change in dba.report ie. abs(rep$Fold) 1 0 Entering edit mode @lindsaynhayes-13285 Last seen 9 weeks ago United States I generated a DBA object with multiple distinct contrasts, all with differentially accessible peaks. I want to generate a data frame with all these peaks and fold changes to plot a fold change x fold change plot for 2 of the contrasts. dat <- dba.contrast(dat, group1=dat$masks$CON.treated, group2=dat$masks$CON.saline, name1="CON.TRT", name2="CON.SAL") dat <- dba.contrast(dat, group1=dat$masks$EXP.treated, group2=dat$masks$EXP.saline, name1="EXP.TRT", name2="EXP.SAL") dat <- dba.analyze(dat, bSubControl = FALSE) dat dbSites <- dba.report(dat, bDB=TRUE)  In the dbSites$binding I get the consensus peaks of both contrasts with the fold change. (also in dbSites$peaks) CHR START END CON.TRT_vs_CON.SAL EXP.TRT_vs_EXP.SAL 1 7395534 7395831 0.82 0.87 1 9547934 9548310 0.94 1.48 1 10105145 10105502 0.87 0.00  However, these are the absolute value of the fold change abs(rep$Fold)

 head(dbSites$peaks[[1]]) Chr Start End abs(rep$Fold)
22 chr1  7395534  7395831          0.82
26 chr1  9547934  9548310          0.94
54 chr1 10105145 10105502          0.87
63 chr1 13153172 13153547          0.63


How can I get the Fold change with sign for direction? I ultimately would like to plot(dbSites$peaks[[1]]$abs(rep$Fold), dbSites$peaks[[2]]$abs(rep$Fold) but with the correct directions.

diffbind atac • 227 views
1
Entering edit mode
Rory Stark ★ 4.1k
@rory-stark-5741
Last seen 17 days ago
CRUK, Cambridge, UK

The report-based DBA object you are getting from dba.report() when you set bDB=TRUE is not really meant to be used in this way. There is no need to access internal, undocumented aspects of the DBA object when the documented interface will provided the information you need.

The preferred way would be to call dba.report() without setting bDB. This will return a GRanges object (or a data.frame if you set DataType=DBA_DATA_FRAME), which includes all the statistics and the 'Fold' with the sign. Of course, you then have to figure out the "consensus" peaks yourself.

Here's some code for doing all this:

# Retrieve stats for all the sites for both contrasts
allSites <- lapply(1:2,function(contrast){dba.report(dat,contrast, th=1)})

# Filter to include only DB sites and get "consensus"
dbSites <- lapply(allSites,function(x){x[x$FDR<0.05,]}) consensus <- union(dbSites[[1]], dbSites[[2]]) # Get list of consensus sites with statistics consSites <- lapply(allSites,function(x){sort(x[x %in% consensus,])}) # Plot Fold changes plot(consSites[[1]]$Fold,consSites[[2]]$Fold)  ADD COMMENT 0 Entering edit mode Thanks for your fast reply, Rory! The ordering is not accurate even with the sort function so the plot makes an X shape which is not right. I think it is sorting the GRanges on the seqnames but not consistently between the 2 contrasts. (Contrast 1 starts with chr4 and ends with chr10, Contrast 2 starts with chr1 and ends with chrX). So the peak information on each row isn't the same between the 2 contrasts in the consSites. I found a work around. It is not as elegant as your 5 lines, but I think this works. I forgot about th=1 that made things easier and converted to dataframes with DataType = DBA_DATA_FRAME, and luckily the row.names are consistent across both contrasts. Yeah! # without bDB=T the dba.plotVenn won't work below dbSites <- dba.report(dat, bDB = TRUE) # get the dba.report for each contrast df.con <- dba.report(dat, contrast=1, th=1, DataType = DBA_DATA_FRAME) df.mut <- dba.report(dat, contrast=2, th=1, DataType = DBA_DATA_FRAME) # merge by row.names because luckily that is consistent between both dba.reports!!! data <- merge(df.con, df.mut, by="row.names", all.x=TRUE, all.y=TRUE) data$Row.names <- as.numeric(data$Row.names) data <- data[order(data$Row.names),]

# double check the order is correct and clean up
all.equal(data$start.x, data$start.y)
all.equal(data$end.x, data$end.y)
all.equal(data$width.x, data$width.y)
all.equal(data$strand.x, data$strand.y)
data <- data[,c(-11:-13)]

# get out the sig peaks
for (i in c(1:nrow(data))){
if (data$FDR.x[i]<0.05 & data$FDR.y[i]<0.05){
data$GROUP[i]="both" } else if (data$FDR.x[i]<0.05 & data$FDR.y[i]>0.05){ data$GROUP[i]="con"}
else if (data$FDR.x[i]>0.05 & data$FDR.y[i]<0.05){
data$GROUP[i]="mut"} else { data$GROUP[i]="none"}
}
data$GROUP <- factor(data$GROUP)
table(data$GROUP) # compare with dba.plotVenn, was pretty close off by just a few. dba.plotVenn(dbSites, 1:2) ggplot(data[data$GROUP!="none",], aes(x=Fold.x, y=Fold.y, color=GROUP)) + geom_point() + theme_cowplot() + scale_color_manual(values = c("blue","black","red")) + geom_abline(slope = 1, intercept = 0) + xlim(-3,3.2) + ylim(-3,3.2)

0
Entering edit mode

Looks good! I tested my script on the sample data which only has one chromosome, so I missed this issue. I had considered merging by row.name, but I was trying to do it in fewer lines.

It looks like you have a good idea of what is going on now!