Error in pairwiseAlignment when considering an empty seuqnece
1
0
Entering edit mode
@kristoffervittingseerup-7310
Last seen 4.2 years ago
European Union

The pairwiseAlignment() function have a problematic way of handling empty sequences illustrated by this toy example:

library(Biostrings)

seq1 <- DNAString('ATCG')
seq2 <- DNAString('')

tmp <- pairwiseAlignment(seq1, seq2)
class(tmp)
[1] "PairwiseAlignmentsSingleSubject"
attr(,"package")
[1] "Biostrings"

tmp
Global PairwiseAlignmentsSingleSubject (1 of 1)
Error in rep(" ", patternSpaces) : invalid 'times' argument
In function (object)  : NaNs produced

My inituition would be that pairwiseAlignment it simply repported no overlap and a score of NA.

Btw this error crashes RStudio-

biostrings • 666 views
0
Entering edit mode

Hi Kristoffer,

This error crashes my R session (and I'm not using RStudio, just R at the Unix command line in a terminal). Will investigate. Thanks for the report.

H.

0
Entering edit mode
@herve-pages-1542
Last seen 11 hours ago
Seattle, WA, United States

Hi Kristoffer,

Sorry that it took so long but this was a low-priority issue. It should be fixed in Biostrings 2.47.3:

library(Biostrings)
pattern <- DNAString("ATCG")
subject <- DNAString("")
pairwiseAlignment(pattern, subject)
# Global PairwiseAlignmentsSingleSubject (1 of 1)
# pattern: [1] ""
# subject: [1] ""
# score: -26 

The score of -26 reflects the cost of the gap introduced by the full deletion required to globally align "ATCG" to "". Given that the default values of gapOpening and gapExtension are 10 and 4, respectively, the cost of this gap is gapOpening + 4 * gapExtension = 26.

Biostrings 2.47.3 should become available via biocLite() to Bioconductor devel users in the next 48 hours or so.

Cheers,

H.