Hi I was wanting to know the best practices for normalizing my Cut&Run data (Genomic localization data similar to ChIP-Seq) after degradation of my protein of interest. I am currently using Diffbind in combination with DESeq2 method of dba.analzye() but diffbind provides so many different options for dba.normalize(), I am not sure what to use. I am between either DBA_NORM_RLE or DBA_NORM_LIB. Can someone please suggest what is best practice and if I am missing anything in my code? I get very different results when using Option 1 vs Option 2 in dba.normalize(). Thank you!
#Importing the the table for diffbind containing all the necessary files, names, and locations
CTCF_narrowpeaks_table <- data.frame(SampleID=c("minusdT-A","minusdT-B","plusdT24h-A","plusdT24h-B"),Condition=c("minusdT","minusdT","plusdT24h","plusdT24h"),Treatment =c("0","0","dT","dT"), Replicate = c("1","2","1","2"),
bamReads=c("/Volumes/BombXtreme/5176_CutRun/5176_bam/5176-MB-1.hg19.F4q10.sorted.bam","/Volumes/BombXtreme/5176_CutRun/5176_bam/5176-MB-2.hg19.F4q10.sorted.bam","/Volumes/BombXtreme/5176_CutRun/5176_bam/5176-MB-7.hg19.F4q10.sorted.bam","/Volumes/BombXtreme/5176_CutRun/5176_bam/5176-MB-8.hg19.F4q10.sorted.bam"),
Peaks=c("/Volumes/BombXtreme/5176_CutRun/5176_macs/5176_macs_filterblacklist_q0.01/5176_0hA.filtered_CTCF_q0.01__peaks.narrowPeak","/Volumes/BombXtreme/5176_CutRun/5176_macs/5176_macs_filterblacklist_q0.01/5176_0hB.filtered_CTCF_q0.01__peaks.narrowPeak","/Volumes/BombXtreme/5176_CutRun/5176_macs/5176_macs_filterblacklist_q0.01/5176_24hA.filtered_CTCF_q0.01__peaks.narrowPeak","/Volumes/BombXtreme/5176_CutRun/5176_macs/5176_macs_filterblacklist_q0.01/5176_24hB.filtered_CTCF_q0.01__peaks.narrowPeak"),PeakCaller="narrow")
#Creating diffbind matrix from the table
CTCF_narrowpeaks <- dba(sampleSheet = CTCF_narrowpeaks_table)
#Counting all reads within the bam files under each peak
CTCF_narrowpeakscount <- dba.count(CTCF_narrowpeaks, minOverlap = 0 , score = DBA_SCORE_READS)
#This is where I get confused
#normalize my data Option 1:
CTCF_narrowpeaksnormalized <- dba.normalize(CTCF_narrowpeakscount, method = DBA_DESEQ2, normalize = DBA_NORM_LIB)
#normalize my data Option 2:
CTCF_narrowpeaksnormalized <- dba.normalize(CTCF_narrowpeakscount, method = DBA_DESEQ2, normalize = DBA_NORM_RLE)
#print out the normalization factors
norm <- dba.normalize(CTCF_narrowpeaksnormalized, bRetrieve=TRUE)
norm
#contrast the read counts
CTCF_narrowpeakscontrast <- dba.contrast(CTCF_narrowpeaksnormalized, categories = DBA_CONDITION, minMembers = 2)
#analyze the data with DESEQ2
CTCF_narrowpeaksanalyzed <- dba.analyze(CTCF_narrowpeakscontrast, method=DBA_DESEQ2)
Thank you Rory Stark! That helped!
Is there a way to manually input the normalization factors and will those factors be similar to DESEQ2 where you divide? How do you make dba.analyze use those normalization factors?