I am using Rsamtools to scan a BAM file generated with Bowtie2 (local alignment mode), and I am interested in the insert sizes. Most of the time, everything works as expected. However, I noticed an issue with soft-clipped reads. If the fragment is shorter than the read length, so that each read extends past the start of its mate, the reported insert size includes the soft-clipped ends and is larger than the actual fragment.
Here is an example read pair:
A00427:5:H3CG5DSXX:1:1103:18078:10316 83 chr10 3740676 42 7S44M = 3740677 59 AGAGACAGGGGTCGACTCAGGCAGGACCTGCTAGCCCGGCGCTCCCGCCCC ,FFF:F,FFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:88 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:44 YS:i:86 YT:Z:CP

A00427:5:H3CG5DSXX:1:1103:18078:10316 163 chr10 3740677 42 43M8S = 3740676 -59 GGGTCGACTCAGGCAGGACCTGCTAGCCCGGCGCTCCCGCCCCCTGTCTCT FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:86 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:43 YS:i:88 YT:Z:CP
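The reported insert size is 59, but the actual fragment (the aligned part) is 43 bp: it runs from the forward mate's start at 3740677 to the reverse mate's last aligned base at 3740719. The 59 corresponds to the span from 3740669 to 3740727, which is what you get when the 7 and 8 soft-clipped bases are counted as well.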
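To make the arithmetic concrete, here is a minimal sketch in plain Python (not Rsamtools) that reproduces both numbers from `POS` and `CIGAR` alone. The helper names (`aligned_interval`, `unclipped_interval`, `fragment_lengths`) are made up for illustration. It assumes soft clips appear only as the first or last CIGAR operation, with no hard clips, and that the reported `TLEN` is the span of both mates with soft clips included, which is consistent with this pair but which I have not confirmed in general.

```python
import re

# One CIGAR operation: a length followed by an op code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference positions.
REF_CONSUMING = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) tuples."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def aligned_interval(pos, cigar):
    """1-based inclusive reference interval covered by the aligned bases."""
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)
    return pos, pos + ref_len - 1


def unclipped_interval(pos, cigar):
    """Aligned interval extended by leading/trailing soft clips (assumes no hard clips)."""
    ops = parse_cigar(cigar)
    start, end = aligned_interval(pos, cigar)
    if ops and ops[0][1] == "S":
        start -= ops[0][0]
    if ops and ops[-1][1] == "S":
        end += ops[-1][0]
    return start, end


def fragment_lengths(fwd, rev):
    """fwd and rev are (pos, cigar) for the forward- and reverse-strand mates.

    Returns (clip_free, with_clips):
      clip_free  - forward mate's first aligned base to reverse mate's last aligned base
      with_clips - outermost span of both mates when soft clips are counted
    """
    fwd_start, _ = aligned_interval(*fwd)
    _, rev_end = aligned_interval(*rev)
    clip_free = rev_end - fwd_start + 1

    fwd_u = unclipped_interval(*fwd)
    rev_u = unclipped_interval(*rev)
    with_clips = max(fwd_u[1], rev_u[1]) - min(fwd_u[0], rev_u[0]) + 1
    return clip_free, with_clips


# The pair above: flag 163 is the forward mate, flag 83 the reverse mate.
fwd = (3740677, "43M8S")
rev = (3740676, "7S44M")
print(fragment_lengths(fwd, rev))  # (43, 59)
```

For pairs without soft clipping the two values coincide, so a difference between them could serve as a flag for the affected pairs.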
I understand Rsamtools is just reading the `TLEN` column, but is there a way to adjust these lengths, or at least to filter out the affected read pairs?