Hi everyone,
I'm trying to run a differential binding analysis of ChIP-Seq peak data with DiffBind. I'm using the code explained in the post "A: question about DiffBind" because I don't have any replicates for my samples, and I know this is not the best way to do the analysis. When I run the script, I get several warning and error messages.
I have 3 ChIP-seq samples (TFa, TFb and KO), each with one MACS2 peak file in BED format, plus an Input sample.
I'm trying to remove all the peaks that overlap with the KO sample, and obtain the overlapping and non-overlapping peaksets between TFa and TFb.
Here is the output from the console:
> library(DiffBind)
>
> samples <- read.csv2("Samples.csv")
> read.csv2("Samples.csv")
SampleID Tissue Factor Condition Treatment Replicate bamReads ControlID bamControl Peaks
1 CELL1 CELL1_TFa TFa Responsive LIGAND 1 TFa.bam INPUT INPUT.bam MACS2_TFa.bed
2 CELL2 CELL2_TFb TFb Responsive LIGAND 1 TFb.bam INPUT INPUT.bam MACS2_TFb.bed
3 CELL3 CELL3_KO KO Resistant LIGAND 1 KO.bam INPUT INPUT.bam MACS2_KO.bed
PeakCaller
1 bed
2 bed
3 bed
>
> diff <- dba(sampleSheet=samples)
CELL1 CELL1_TFa TFa Responsive LIGAND 1 bed
CELL2 CELL2_TFb TFb Responsive LIGAND 1 bed
CELL3 CELL3_KO KO Resistant LIGAND 1 bed
>
> diff_count <- dba.count(diff,minOverlap=1)
>
> diff_analysis <- dba.contrast(diff_count, group1=1, name1="TFa", group2=2, name2="TFb")
> diff_analysis <- dba.contrast(diff_analysis, group1=1, name1="TFa", group2=3, name2="KO")
> diff_analysis <- dba.contrast(diff_analysis, group1=2, name1="TFb", group2=3, name2="KO")
>
> diff_analysis <- dba.analyze(diff_analysis,method=DBA_DESEQ2)
Warning messages:
1: Some groups have no replicates. Results may be unreliable.
2: In dba.analyze(diff_analysis, method = DBA_DESEQ2) :
No correlation heatmap plotted -- contrast 1 does not have enough differentially bound sites.
>
> rep1 <- dba.report(diff_analysis, contrast=1, th=1, bCount=TRUE)
Error in pv.DBAreport(pv = DBA, contrast = contrast, method = method, :
edgeR analysis has not been run for this contrast
> rep2 <- dba.report(diff_analysis, contrast=2, th=1, bCount=TRUE)
Error in pv.DBAreport(pv = DBA, contrast = contrast, method = method, :
edgeR analysis has not been run for this contrast
> rep3 <- dba.report(diff_analysis, contrast=3, th=1, bCount=TRUE)
Error in pv.DBAreport(pv = DBA, contrast = contrast, method = method, :
edgeR analysis has not been run for this contrast
I hope somebody can help me.
Thanks in advance.
Vladimir
Without replicates, it is unlikely that you will get FDR scores indicating high confidence that any sites are differentially bound. As Gord suggests, specifying the method in dba.report() and setting th=1 will let you see the fold changes, which you can use to rank the sites. You could also set fold=2, for example, to identify the sites with a large fold change. If you also set bCalled=TRUE, you can see if the site was called as a peak by MACS as well.
-R
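The suggestion above could be sketched roughly as follows. This assumes the analysis was run with DESeq2, as in the console output in the question, so the same method is named explicitly in dba.report(). The "edgeR analysis has not been run" errors suggest the report was otherwise looking for edgeR results. The helper name report_by_fold and its min_fold argument are hypothetical and not part of DiffBind; the sketch only wraps dba.report() and sorts the result by absolute fold change.

```r
# Hypothetical helper (not a DiffBind function): report all sites for one
# contrast using the DESeq2 results, then rank them by absolute fold change.
report_by_fold <- function(dba_obj, contrast, min_fold = 0) {
  sites <- DiffBind::dba.report(
    dba_obj,
    contrast = contrast,
    method   = DiffBind::DBA_DESEQ2, # must match the method given to dba.analyze()
    th       = 1,                    # keep every site, whatever its FDR
    fold     = min_fold,             # e.g. 2 to keep only large fold changes
    bCalled  = TRUE,                 # add columns showing where the peak caller called the site
    bCounts  = TRUE                  # add per-sample read counts
  )
  # dba.report() returns a GRanges object; Fold is one of its metadata columns
  sites[order(abs(sites$Fold), decreasing = TRUE)]
}

# Example call, assuming diff_analysis is the object from the question:
# rep1 <- report_by_fold(diff_analysis, contrast = 1, min_fold = 2)
```

Without replicates, the ranking by fold change is only a way to prioritise sites; the FDR values in the report should not be read as evidence of differential binding.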