Question

diffHic: analyzing HiC experiments prepared with different enzymes

roladali ▴ 20

@roladali-9193

Last seen 8.9 years ago

I would like to compare several HiC experiments, some prepared with Hind and others with Dpn. At what point in the workflow is it best to merge all the experiments into a single object?

> param.hind <- pairParam(fragments=hind.fragments)

> param.Dpn <- pairParam(fragments=Dpn.fragments)

> data.Dpn <- squareCounts(input.Dpn, width=1e6, param=param.Dpn)

> data.hind <- squareCounts(input.hind, width=1e6, param=param.hind)

Is it possible/reasonable to merge data.Dpn and data.hind into a single data object and continue the workflow?

Thanks!

diffhic • 1.0k views

updated 9.1 years ago by Aaron Lun ★ 28k • written 9.1 years ago by roladali ▴ 20

score 0 · Answer 1 · 2016-06-06

It probably doesn't make sense to merge them. Different restriction enzymes cut at different sites and therefore introduce different biases. These are difficult to correct for: genomic biases aside, the fragment lengths also differ, which affects ligation efficiency. As such, it would be difficult to do any sensible direct comparison between libraries generated with different restriction enzymes. The safest way to proceed is to do all the quantitative analyses within each set of libraries generated with the same restriction enzyme (where the biases should cancel out), and then do a meta-analysis across the results from the different enzymes (e.g., check that the DIs detected with Hind are the same as those detected with Dpn).

Note that the coordinates will also be slightly off for the bins; they're rounded to the nearest restriction site, which will obviously be different for the two enzymes. If you really wanted to cbind the two InteractionSet objects to merge the results at the start, you'd need to round off the ends to the nearest megabase. Then you'd need to match up elements of data.hind to data.Dpn, which can get rather painful. However, for meta-analyses, all you need to do is to identify overlaps between the two sets of DIs, so the findOverlaps method will work regardless.
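As a rough illustration of the meta-analysis step, here is a minimal sketch that overlaps two sets of DIs whose bin edges differ slightly, as they would after being rounded to different restriction sites. The coordinates, the chromosome name "chrA", the helper make_di and the objects di.hind and di.dpn are all made up for the example; in practice they would be the significant bin pairs from each within-enzyme analysis.

```r
library(GenomicRanges)
library(InteractionSet)

# Hypothetical helper: build a GInteractions object from anchor coordinates.
make_di <- function(chr, start1, end1, start2, end2) {
  GInteractions(GRanges(chr, IRanges(start1, end1)),
                GRanges(chr, IRanges(start2, end2)))
}

# Made-up DIs from each analysis. The bins are nominally 1 Mbp wide, but
# their edges sit on (invented) restriction sites that differ by enzyme.
di.hind <- make_di("chrA",
                   start1 = c(3000412, 7999870), end1 = c(4000127, 9000391),
                   start2 = c(1000203, 2000051), end2 = c(1999998, 3000411))
di.dpn <- make_di("chrA",
                  start1 = c(3000020, 5000007), end1 = c(3999988, 6000019),
                  start2 = c(1000005, 2000013), end2 = c(2000012, 3000019))

# Two-dimensional overlap: both anchors of a Hind DI must overlap the
# anchors of a Dpn DI, so exact agreement of bin edges is not required.
ov <- findOverlaps(di.hind, di.dpn)
shared.hind <- di.hind[unique(queryHits(ov))]
shared.dpn <- di.dpn[unique(subjectHits(ov))]

# For comparison, rounding the bin starts to the nearest megabase gives the
# nominal bin index that an explicit merge would have to match on.
bin.hind <- round(start(anchors(di.hind, type = "first")) / 1e6)
bin.dpn <- round(start(anchors(di.dpn, type = "first")) / 1e6)
```

Here only the first Hind DI and the first Dpn DI refer to the same pair of nominal bins, so ov should contain that single hit. Neighbouring bins can overlap by a few hundred base pairs because their edges differ between enzymes, so it may be worth checking how sensitive the shared set is to such marginal overlaps.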