Search
Question: How to get specific fragment/read interactions from HiC (diffHiC) data?
0
8 months ago by
hwick120
hwick120 wrote:

Hello,

I have gone through diffHiC to get significantly interacting bins in my data set. Next, we want to be able to confirm some of these interactions with PCR. To do this properly, I need to be able to see what specific fragments or reads in our significant bins that are doing the actual interacting. Preferably, I'd like to get the pairs of actual RE sites that are ligated together, but the ligated fragments will do. Is there an easy way to extract this information from one of the data objects? I was going to check out connectCounts and feed it the fragments in the significant bins as regions, but would appreciate any input if anyone has done something similar.

Thank you,

~ Heather

modified 8 months ago by Aaron Lun20k • written 8 months ago by hwick120
0
8 months ago by
Aaron Lun20k
Cambridge, United Kingdom
Aaron Lun20k wrote:

The easy way is to use extractPatch with width=1, which should effectively pull out counts for interactions between pairs of specific restriction fragments.

The harder way, if you want the original read positions, is to use loadData for a specific chromosome pair, and then identify the rows with anchor indices that point to the desired restriction fragments in param$fragments. From there, you should be able to figure out which ends are ligating to each other. I must admit that I've never done this procedure, which is why it requires some manual work. I may add it if it seems useful, but most interactions span several kilobases at least (and usually several fragments as well) so base pair resolution was unnecessary in most cases. But I guess if you're spending good money ($100?) to design some primers, you might as well do it on the ligation products that you're most likely to observe...

Thank you for the speedy response! It seems that extractPatch is a function that is not in the version of diffHiC that I have installed (1.6.0); is github devtools the only method to download the updated diffHiC to get this functionality?

You should always be upgrading to the latest release version of packages - for diffHic, this is 1.8.something. biocLite() should automatically prompt you to upgrade if you're running the latest version of R.

Thank you, apparently I had some other technical issue that was preventing the update from going through, but I've gotten it to update now. Unfortunately I am getting an error in extractPatch and I can't find what's going wrong:


> hs.frag <- cutGenome(BSgenome.Hsapiens.UCSC.hg19, "AAGCTT", 4) #HindIII

> hs.param <- pairParam(hs.frag)

> diagnostics_Ctrl <- preparePairs("Ctrl_presplit_fixed_mate_marked_duplicates_name_sorted.bam", hs.param, file="Ctrl.h5", dedup=TRUE, minq=10)

> test<-extractPatch(diagnostics_Ctrl, hs.param, sig_bins_20kb_ordered_grange_a1[1], sig_bins_20kb_ordered_grange_a2[1], width=1)

Error in path.expand(y) : invalid 'path' argument


sig_bins_20kb_ordered_grange_a1 and sig_bins_20kb_ordered_grange_a2 are both grange objects containing the anchors 1 and 2, respectively, of significant interactions

I took a look at the code for extractPatch up on github but I can't seem to find what prompts this error.

The input to extractPatch should be the HDF5 file, i.e., "Ctrl.h5".

Success! Thank you!

One further question, now that I am looking at the results. Am I correct that extractPath only lists the pairs of RE fragments that are interacting? And not the number of times those fragments were found to interact? It seems that entries only occur once, with no "count" of the number of times that they were found to interact. Is that correct? Or is this just a coincidence that all my interactions only occur once?

There is a count for each interaction, representing the number of read pairs between the corresponding restriction fragments. This is accessible by running assay() on the output object; read ?extractPatch.