Extracting non-interacting Hi-C reads to create a SAM/BAM file
@sergioespeso-gil-6997
Last seen 19 months ago
New York

Hi!

I would like to extract the non-chimeric reads from Hi-C data and write them to a BAM file. I have been trying, but without much success. It looks like this might be feasible with diffHic (currently checking), or with HiC-Pro, which generates a SAM file containing the chimeric reads (in which case I would need to know how to subtract these from the original SAM/BAM file).

So, I would like to know if some of you have ideas or suggestions.

Thanks a lot!

S.

diffhic hicpro hic • 752 views
Aaron Lun ★ 27k
@alun
Last seen 5 hours ago
The city by the bay

I usually think of chimeric reads as those where one end of the read aligns to one place, and the other end of the same read aligns somewhere else, i.e., the ligation junction is in the middle of the read. If you are interested (or not) in those reads, the mapping script in diffHic will label their CIGAR strings with hard clips ("H") in the BAM file. You can then parse the BAM file (e.g., with pysam) and discard/select such reads.
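
As a rough sketch of that parsing step (not part of diffHic itself), something like the following could work with pysam. It assumes the BAM file came from the diffHic mapping script, so that a hard clip in the CIGAR really does mark a chimeric read; the function names and file paths are made up for illustration. Mates share a query name, so dropping by name removes the whole pair.

```python
HARD_CLIP = 5  # BAM CIGAR operation code for "H" (hard clip)


def has_hard_clip(cigartuples):
    """Return True if any CIGAR operation is a hard clip.

    `cigartuples` is a list of (operation, length) pairs, as returned by
    pysam's AlignedSegment.cigartuples; it is None for unmapped reads.
    """
    if not cigartuples:
        return False
    return any(op == HARD_CLIP for op, _ in cigartuples)


def write_non_chimeric(in_path, out_path):
    """Copy alignments whose read name never carries a hard clip.

    Hypothetical helper: assumes hard clips mark chimeric reads, as in
    BAM files produced by the diffHic mapping script.
    """
    import pysam  # imported here so has_hard_clip() works without pysam

    # Pass 1: collect names of reads with at least one hard-clipped alignment.
    with pysam.AlignmentFile(in_path, "rb") as bam_in:
        chimeric = {
            read.query_name
            for read in bam_in.fetch(until_eof=True)
            if has_hard_clip(read.cigartuples)
        }

    # Pass 2: write every alignment whose name is not in that set.
    kept = 0
    with pysam.AlignmentFile(in_path, "rb") as bam_in, \
            pysam.AlignmentFile(out_path, "wb", template=bam_in) as bam_out:
        for read in bam_in.fetch(until_eof=True):
            if read.query_name not in chimeric:
                bam_out.write(read)
                kept += 1
    return kept
```

To select the chimeric reads instead, flip the `not in` test in the second pass. The set of names is held in memory, which may matter for very large libraries.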

On the other hand, you may be referring to read pairs that correspond to a single DNA fragment that is not a ligation product, i.e., dangling ends or self-circles. diffHic can identify these read pairs, but only for the purpose of removing them prior to the differential analysis. No function exists to record the characteristics of the discarded read pairs, though you could hack something together with loadData and diffHic:::.getStats. There is also absolutely no support for creating a new BAM file; you will have to do that yourself with pysam.
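
If you do end up rolling your own, the classification itself is simple once each read has been assigned to a restriction fragment. The sketch below is a simplification and not diffHic's actual code: it assumes you have already computed a fragment index for each read (the hard part, not shown), uses the leftmost alignment coordinate as the read position, and treats inward-facing pairs on the same fragment as dangling ends and outward-facing pairs as self-circles. The function name and labels are made up.

```python
def classify_pair(frag1, pos1, reverse1, frag2, pos2, reverse2):
    """Label a read pair from fragment index, position and strand.

    frag*:    index of the restriction fragment each read maps to
              (assumed to be computed elsewhere)
    pos*:     leftmost alignment coordinate of each read
    reverse*: True if the read aligns to the reverse strand
    """
    if frag1 != frag2:
        # Reads on different fragments: a candidate ligation product.
        return "potential_ligation"

    # Order the two reads by position along the fragment.
    if pos1 <= pos2:
        left_reverse, right_reverse = reverse1, reverse2
    else:
        left_reverse, right_reverse = reverse2, reverse1

    if not left_reverse and right_reverse:
        return "dangling_end"   # inward-facing: -> <-
    if left_reverse and not right_reverse:
        return "self_circle"    # outward-facing: <- ->
    return "same_strand"        # neither inward nor outward


print(classify_pair(3, 100, False, 3, 300, True))
```

With pysam, `reverse1`/`reverse2` would come from `read.is_reverse` and the positions from `read.reference_start`; the label can then decide which pairs get written to the new BAM file.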

@sergioespeso-gil-6997
Last seen 19 months ago
New York

Thanks! That was really useful actually! :-)