Apologies if the information is already written somewhere and I missed it.
I read the ChIPQC vignette, its manual, the ChIPQC Bioc2014 presentation, but could not find any more details about the "Reads" metrics and how the peaks numbers are calculated.
It says in the vignette (page 9) that Reads metrics corresponds to "Total number of reads in the bam file for each sample".
Here is my data obtained with the following commands:
> exp <- ChIPQC(metadata, annotation = "mm10")
Samples: 6 : samp1 samp2 ... samp5 samp6
Tissue Factor Condition Treatment Replicate Peaks
samp1 brain TF WT treat1 1 621
Reads Map% Filt% Dup% ReadL FragL RelCC SSD RiP%
samp1 2227533 100 12.9 0 51 207 1.59 0.842 0.02210
> samp1 <- QCsample(exp, 1)
Number of Mapped reads: 2227533
Number of Mapped reads passing MapQ filter: 1939371
Percentage Of Reads as Non-Duplicates (NRF): 100(0)
Percentage Of Reads in Blacklisted Regions: NA
Fragment Length Cross-Coverage: 0.000468866812610841
Relative Cross-Coverage: 1.58670988654781
Percentage Of Reads in GenomicFeature:
Percentage Of Reads in Peaks: 0.03
Number of Peaks: 8
The output from samtools flagstat is:
32041419 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
30769799 + 0 mapped (96.03%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
So my questions would be:
1 - where do the
"Number of Mapped reads: 2227533" given by ChIPQC come from and why do they differ from samtools total number of reads?
2 - where does "
Number of Peaks: 8" come from and why does it differ from the "Peaks = 621" given in exp summary (that's what I see as well in the bed file).
3- the RIP% thus corresponds to the number of reads in the peaks N=8, or N=621?
Thanks a lot in advance for you help,
ps: thanks a lot for making this package available, it is extremely helpful!
ps2: this ChIP does not seem to have worked anyway (very low SSD, very few specific peaks etc), hence the need of thorough QC checks.