Interpreting strand orientation plots in diffHic
Hi! I've got two sets of HiC libraries, each in triplicate. I think I've set the min.inward, min.outward and max.frag thresholds correctly, or at least correctly for the bulk of the datasets. I'm not sure whether I should set the thresholds for replicate 3 individually.

My question is: what is going on with these HiC libraries? Why do these peak heights look so different? Are any of the libraries "better" than the other ones? Are they usable?

Aaron Lun ★ 27k
Looks okay to me. You might want to trim your outward pairs at 100 kbp, just to get rid of the part where the red line is still above the blue line. I'm not sure what the enrichment of the same-strand (blue) pairs at a log-distance of 10-15 represents. My best guess is that it's a consequence of homologous pairing. I don't know whether that's biologically feasible, but it shouldn't really matter so long as it's not a technical artefact.

Anyway, I would proceed with the data analysis. The relative heights of the peaks aren't too problematic as long as you remove the inward and outward-facing reads. Yes, there will still be some systematic differences between libraries corresponding to the differences in the orientation profiles (probably resulting from differences in ligation/cross-linking efficiency), but as long as you normalize correctly later you should be fine.
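To make the trimming step concrete, here is a minimal Python sketch of the filtering logic. It is not diffHic code. It is only an illustration under stated assumptions. `ReadPair`, `orientation` and `keep_pair` are hypothetical names, each pair is assumed to be stored with the upstream read first, and the 1 kbp inward cutoff in the usage lines is made up (only the 100 kbp outward cutoff comes from the suggestion above).

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical container for one intra-chromosomal read pair.

    Assumes pos1 <= pos2, i.e. read 1 is the upstream read.
    """
    strand1: str  # '+' or '-' for the upstream read
    strand2: str  # '+' or '-' for the downstream read
    pos1: int
    pos2: int


def orientation(pair):
    """Classify a pair as 'same', 'inward' or 'outward'."""
    if pair.strand1 == pair.strand2:
        return "same"
    # Upstream read on '+' and downstream read on '-' point at each other.
    return "inward" if pair.strand1 == "+" else "outward"


def keep_pair(pair, min_inward, min_outward):
    """Drop inward/outward pairs closer together than their cutoff."""
    gap = pair.pos2 - pair.pos1
    kind = orientation(pair)
    if kind == "inward" and gap < min_inward:
        return False
    if kind == "outward" and gap < min_outward:
        return False
    return True  # same-strand pairs are never removed by this filter


# Usage: the inward cutoff (1000) is an arbitrary illustrative value.
pairs = [
    ReadPair("+", "-", 100, 600),       # inward, 500 bp apart
    ReadPair("-", "+", 100, 50_100),    # outward, 50 kbp apart
    ReadPair("-", "+", 100, 200_100),   # outward, 200 kbp apart
    ReadPair("+", "+", 100, 600),       # same strand
]
kept = [p for p in pairs if keep_pair(p, min_inward=1000, min_outward=100_000)]
```

Only the distant outward pair and the same-strand pair survive here, which is the behaviour the thresholds are meant to produce.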

Thanks!

OK, I've set the cutoff for outward pairs higher.

The three replicates are paired, in that Sample 1 "left" and Sample 2 "left" were generated at the same time, as were Samples 1 and 2 "middle" and Samples 1 and 2 "right". That's why I think there could be some wet-lab variability (i.e. minor batch effects) between the three replicates. That said, I would hope that our signal is robust enough to be stronger than these technical effects.

Apart from MA plots, which other visualisations or metrics can I use to check that the normalisation is working correctly?

I guess you could make some distance plots to check that the libraries show similar trends in count size with increasing distance between the interacting loci. This should already be the case if you've normalised based on trends in the average abundance with normOffsets, given that distance and abundance are monotonically related (abundance generally falls as distance increases).
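As a rough illustration of such a distance check, the Python sketch below bins interactions by log10 distance, averages the counts in each bin for one library, and then compares two libraries bin by bin. This is not part of diffHic or csaw. `distance_trend` and `trend_log_ratio` are hypothetical helpers, the input is assumed to be one distance and one (normalised) count per bin pair, and the numbers in the usage lines are invented.

```python
import math
from collections import defaultdict


def distance_trend(distances, counts, bin_width=0.5):
    """Mean count per log10-distance bin for a single library.

    Returns a dict mapping the lower edge of each log10 bin to the mean count.
    """
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for d, c in zip(distances, counts):
        if d <= 0:
            continue  # skip zero/undefined distances (e.g. inter-chromosomal)
        b = math.floor(math.log10(d) / bin_width)
        sums[b] += c
        sizes[b] += 1
    return {b * bin_width: sums[b] / sizes[b] for b in sorted(sums)}


def trend_log_ratio(trend_a, trend_b):
    """log2 ratio of mean counts for the distance bins shared by two libraries."""
    shared = sorted(set(trend_a) & set(trend_b))
    return {
        b: math.log2(trend_a[b] / trend_b[b])
        for b in shared
        if trend_a[b] > 0 and trend_b[b] > 0
    }


# Usage with invented numbers: two libraries measured on the same bin pairs.
distances = [1500, 2500, 150_000, 250_000]
lib1 = distance_trend(distances, [40, 24, 6, 2], bin_width=1.0)
lib2 = distance_trend(distances, [20, 12, 3, 1], bin_width=1.0)
ratios = trend_log_ratio(lib1, lib2)
```

If normalisation has worked, the log-ratios should sit near a constant across distance bins. A ratio that drifts with distance would suggest a remaining distance-dependent bias between the libraries.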