Hi Katarzyna,
I think that you need to perform a local alignment for this (type="local"). Also it looks like you'll get best results by specifying gapExtension=0:
library(Biostrings)
dna1 <-
    DNAString("NNNNNNATCGTTAAGCCCNNNNNTAGCAAAGGTAAANNNNN")
dna2 <-
    DNAString("++ATCGTTAAGCCC+++++++++++++++TAGCAAAGGTAAA+")
pa <- pairwiseAlignment(dna1, dna2, type="local",
                        gapExtension=0)
pa
# Local PairwiseAlignmentsSingleSubject (1 of 1)
# pattern: [7] ATCGTTAAGCCC---------------NNNNNTAGCAAAGGTAAA 
# subject: [3] ATCGTTAAGCCC+++++++++++++++-----TAGCAAAGGTAAA 
# score: 29.5439
See more details with:
writePairwiseAlignments(pa)
########################################
# Program: Biostrings (version 2.46.0), a Bioconductor package
# Rundate: Fri Jan 26 12:01:59 2018
########################################
#=======================================
#
# Aligned_sequences: 2
# 1: P1
# 2: S1
# Matrix: NA
# Gap_penalty: 10.0
# Extend_penalty: 0.0
#
# Length: 45
# Identity:      25/45 (55.6%)
# Similarity:    NA/45 (NA%)
# Gaps:          20/45 (44.4%)
# Score: 29.5439
#
#
#=======================================
P1                 7 ATCGTTAAGCCC---------------NNNNNTAGCAAAGGTAAA     36
                     ||||||||||||                    |||||||||||||
S1                 3 ATCGTTAAGCCC+++++++++++++++-----TAGCAAAGGTAAA     42
#---------------------------------------
#---------------------------------------
Unfortunately, programmatically getting the ranges of the overlaps is a lot trickier than one wishes:
pattern_ins <- indel(pattern(pa))[[1]]
subject_ins <- indel(subject(pa))[[1]]
relative_gap <- union(pattern_ins, subject_ins)
stopifnot(length(relative_gap) == 1)  # only one gap is expected
first_overlap_width <- start(relative_gap) - 1
pattern_gap <- IRanges(start(ranges(pattern(pa))) + first_overlap_width,
                       width=width(subject_ins))
subject_gap <- IRanges(start(ranges(subject(pa))) + first_overlap_width,
                       width=width(pattern_ins))
pattern_overlaps <- setdiff(ranges(pattern(pa)), pattern_gap)
subject_overlaps <- setdiff(ranges(subject(pa)), subject_gap)
Then:
Views(dna1, pattern_overlaps)
#   Views on a 41-letter DNAString subject
# subject: NNNNNNATCGTTAAGCCCNNNNNTAGCAAAGGTAAANNNNN
# views:
#     start end width
# [1]     7  18    12 [ATCGTTAAGCCC]
# [2]    24  36    13 [TAGCAAAGGTAAA]
Views(dna2, subject_overlaps)
#   Views on a 43-letter DNAString subject
# subject: ++ATCGTTAAGCCC+++++++++++++++TAGCAAAGGTAAA+
# views:
#     start end width
# [1]     3  14    12 [ATCGTTAAGCCC]
# [2]    30  42    13 [TAGCAAAGGTAAA]
Hope this helps,
H.
                    
                
                 
Hi Hervé,
Thank you for the very clear answer. Do I understand correctly that your script will only find two overlapping regions?
Kind regards,
K.
Yes, the code I provided only works if there are only 2 overlapping regions that match with no indels (but mismatches should be ok). I'm afraid generalizing it to support N overlapping regions could get quite complicated...
Cheers,
H.