DiffBind handling of replicates and consensus
2
0
Entering edit mode
Jim • 0
@jim-22844
Last seen 4.8 years ago

I am currently working on an analysis of two factors with two replicates each. I have called peaks using three different peaks callers, and would like to first derive a consensus of the peak callers for each factor before generating a consensus across the factors. Is there any way i can achieve this without having to create multiple DBA's?

My current workflow is: - create DBA with ALL samples, replicates and peak calls (2 x 2 x 3 = 12) - generate consensus for each factor - export and save consensus peaks as bed file - create new DBA with ALL samples, but using consensus peaks for appropriate factor - count - analyse - plot data ...

diffbind • 1.7k views
ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 5 weeks ago
Cambridge, UK

If you gave all the same bam files for the same samples, you might try the following:

myDBA <- dba(samplesheet=...)
consDBA <- dba.peakset(myDBA,consensus=c(DBA_FACTOR,DBA_REPLICATE))
consDBA <- dba(consDBA,mask=consDBA$masks$Consensus)
consDBA <- dba.count(consDBA)

If that doesn't work, you can set the bam files directly before counting:

consDBA$class[10,1] <- "path_to_first_bam_file.bam"
consDBA$class[10,2] <- "path_to_second_bam_file.bam"
consDBA <- dba.count(consDBA)
ADD COMMENT
0
Entering edit mode

Hi Rory Your advice is very useful. But now I want to use the consensus bed file generated by WT group and KO group respectively for the next step of analysis. I will footprinting analysis using the rgt-hint. He asked me to generate a bed file with an integer score for the fifth column. I wonder whether diffbind can extract the two consensus bed files. And the fifth column is an integer. now i had run the code below

dbObj_consensus<-dba(DBA = dbObj_consensus, mask = dbObj_consensus$masks$Consensus,minOverlap = 1)
dbObj_consensus

i get the output

2 Samples, 66064 sites in matrix:
  ID Tissue Factor Replicate Intervals
1 KO Spleen     KO   1-2-3-4     63479
2 WT Spleen     WT   1-2-3-4     57950

I want to get KO's consensus bed which contain 63479 lines and WT's consensus bed which contain 57950 lines Thank you

ADD REPLY
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 5 weeks ago
Cambridge, UK

Try this:

consensusKO <- dba.peakset(dbObj_consensus, peaks=1, bRetrieve=TRUE)
consensusWT <- dba.peakset(dbObj_consensus, peaks=2, bRetrieve=TRUE)
ADD COMMENT
0
Entering edit mode

Thanks for your help. I'm sorry, I have another question: I'm using diffbind to analyze my ATACseq data. When I use a old version(R base is 3.6.2, diffbind is 3.5). my code is

m1 <- "KO"
m2 <- "WT"
dbObj <- dba.count(dbObj, bUseSummarizeOverlaps=TRUE, minOverlap = 0.5)
contr <- paste0(m1, "vs", m2)
dbObj <- dba.contrast(dbObj, group1 = dbObj$masks[[m1]], group2 = dbObj$masks[[m2]])
dbObj <- dba.analyze(dbObj, bFullLibrarySize = FALSE)
dba.show(dbObj, bContrasts=TRUE)

i can get 2500 differential binding site

but when i use R4.2.2 and diffbind is 3.8.0 because this version delete the parameter bFullLibrarySize, my code is :

m1 <- "KO"
m2 <- "WT"
bObj <- dba.count(dbObj, bUseSummarizeOverlaps=TRUE,minOverlap = 0.5,bParallel = 8)
dbObj <- dba.normalize(DBA = dbObj, library = "RiP")
contr <- paste0(m1, "vs", m2)
dbObj <- dba.contrast(dbObj, group1 = dbObj$masks[[m1]], group2 = dbObj$masks[[m2]])
dbObj <- dba.analyze(dbObj)
dba.show(dbObj, bContrasts=TRUE)

I just get 552 differential differential binding site.

I have seen you reply about bFullLibrarySize. like your reply 1 and your reply 2. you said that If bFullLibrarySize=FALSE, the number of reads that overlap consensus peaks is used for each sample (basically, the sum of all the counts). So I used dbObj <- dba.normalize(DBA = dbObj, library = "RiP") to make sure that the method of normalize is consistent.

I known you don't recommand using bFullLibrarySize = FALSE. But the reason I use it is that it filters out more significant peaks at FDR<0.05, which helps me not miss any information that might be useful.

My question is how can I reproduce the results obtained from the old version diffbind ? Thank you!

ADD REPLY
0
Entering edit mode

You may want to try different values for the normalize parameter:

dbObj <- dba.normalize(DBA = dbObj, library = "RiP", normalize=DBA_NORM_LIB)

vs.

dbObj <- dba.normalize(DBA = dbObj, library = "RiP", normalize=DBA_NORM_RLE)
ADD REPLY

Login before adding your answer.

Traffic: 523 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6