Hi,
I'm analyzing my chipseq data and am encountering some issues. I'm new to chipseq so any help is appreciated. Here is my pipeline:
I have triplicates for my treatment and triplicates for control.
1) Align filtered & trimmed Illumina fastq (paired end) to ref genome using bowtie. 2) Convert SAM file to sorted BAM file using samtools. 3) Use MACS2 to call peaks. This was done on Galaxy and I used all standard parameters except for changing lower mfold to 1 because I wanted to make it less specific just to see more peaks. 4) The output showed 5 peaks, I converted the Peaks.txt file to csv and used that as input for next step. 5) I used the above .cvs file for DiffBind in R, code below:
peaks <- dba.analyze("tS_cM.csv")
The error I'm getting is:
Loading sample sheet... 1 NA raw 2 NA raw 3 NA raw 4 NA raw 5 NA raw 6 NA raw 7 NA raw Error: DBA object has no peaks.
I've also included the txt version of my .csv file.
# This file is generated by MACS version 2.2.9.1
# Command line: callpeak -t /data/dnb10/galaxy_db/files/7/5/9/dataset_759c3204-8cd2-4a01-9d1a-625a880c13b9.dat /data/dnb10/galaxy_db/files/f/8/3/dataset_f8366859-d501-4460-b617-6d716216d811.dat /data/dnb10/galaxy_db/files/8/f/a/dataset_8fa3c242-c607-44f0-8cad-6738b0e8f681.dat --name S1_sorted_bam -c /data/dnb10/galaxy_db/files/3/c/b/dataset_3cb5e1c5-e77d-439c-91a0-537f8f4ea857.dat /data/dnb10/galaxy_db/files/c/c/7/dataset_cc784e5b-2cc9-458b-8daf-e226ed2c59fa.dat --format BAMPE --gsize 3700000 --keep-dup 1 --d-min 20 --buffer-size 100000 --bdg --qvalue 0.05 --mfold 1 50 --bw 300
# ARGUMENTS LIST:
# name = S1_sorted_bam
# format = BAMPE
# ChIP-seq file = ['/data/dnb10/galaxy_db/files/7/5/9/dataset_759c3204-8cd2-4a01-9d1a-625a880c13b9.dat', '/data/dnb10/galaxy_db/files/f/8/3/dataset_f8366859-d501-4460-b617-6d716216d811.dat', '/data/dnb10/galaxy_db/files/8/f/a/dataset_8fa3c242-c607-44f0-8cad-6738b0e8f681.dat']
# control file = ['/data/dnb10/galaxy_db/files/3/c/b/dataset_3cb5e1c5-e77d-439c-91a0-537f8f4ea857.dat', '/data/dnb10/galaxy_db/files/c/c/7/dataset_cc784e5b-2cc9-458b-8daf-e226ed2c59fa.dat']
# effective genome size = 3.70e+06
# band width = 300
# model fold = [1, 50]
# qvalue cutoff = 5.00e-02
# The maximum gap between significant sites is assigned as the read length/tag size.
# The minimum length of peaks is assigned as the predicted fragment length "d".
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 1000 bps and 10000 bps
# Broad region calling is off
# Paired-End mode is on
# fragment size is determined as 210 bps
# total fragments in treatment: 613205
# fragments after filtering in treatment: 544611
# maximum duplicate fragments in treatment = 1
# Redundant rate in treatment: 0.11
# total fragments in control: 565624
# fragments after filtering in control: 489201
# maximum duplicate fragments in control = 1
# Redundant rate in control: 0.14
# d = 210
chr start end length abs_summit pileup -log10(pvalue) fold_enrichment -log10(qvalue) name
NC_014034.1 513165 513418 254 513402 62 6.11744 1.91471 2.57173 S1_sorted_bam_peak_1
NC_014034.1 3003464 3003706 243 3003490 67 5.96006 1.84696 2.52564 S1_sorted_bam_peak_2
NC_014034.1 3118257 3118497 241 3118334 75 3.97106 1.55513 1.32946 S1_sorted_bam_peak_3
NC_014034.1 3399189 3399404 216 3399353 70 8.24044 2.09495 2.6286 S1_sorted_bam_peak_4
NC_014034.1 3409689 3409913 225 3409745 68 5.66339 1.80183 2.36274 S1_sorted_bam_peak_5
NC_014034.1 3490720 3490940 221 3490816 67 6.15106 1.87075 2.58832 S1_sorted_bam_peak_6
NC_014034.1 3637800 3638077 278 3637887 65 4.68416 1.70072 1.77008 S1_sorted_bam_peak_7
Why isn't DiffBind identifying these peaks? Thanks in advance!