When I use Biostrings pairwiseAlignment, I can't reproduce the result of EMBOSS needle, even with the same parameters.
library(readr)
library(dplyr)
library(Biostrings)
EDNAFULL <- read_tsv("/run/media/panxiaoguang/myLinux/Zebrafish/EDNAFULL", col_names = TRUE) %>%
  tibble::column_to_rownames(var = "index") %>%
  as.matrix()
EDNAFULL
A T G C S W R Y K M B V H D N U
A 5 -4 -4 -4 -4 1 1 -4 -4 1 -4 -1 -1 -1 -2 -4
T -4 5 -4 -4 -4 1 -4 1 1 -4 -1 -4 -1 -1 -2 5
G -4 -4 5 -4 1 -4 1 -4 1 -4 -1 -1 -4 -1 -2 -4
C -4 -4 -4 5 1 -4 -4 1 -4 1 -1 -1 -1 -4 -2 -4
S -4 -4 1 1 -1 -4 -2 -2 -2 -2 -1 -1 -3 -3 -1 -4
W 1 1 -4 -4 -4 -1 -2 -2 -2 -2 -3 -3 -1 -1 -1 1
R 1 -4 1 -4 -2 -2 -1 -4 -2 -2 -3 -1 -3 -1 -1 -4
Y -4 1 -4 1 -2 -2 -4 -1 -2 -2 -1 -3 -1 -3 -1 1
K -4 1 1 -4 -2 -2 -2 -2 -1 -4 -1 -3 -3 -1 -1 1
M 1 -4 -4 1 -2 -2 -2 -2 -4 -1 -3 -1 -1 -3 -1 -4
B -4 -1 -1 -1 -1 -3 -3 -1 -1 -3 -1 -2 -2 -2 -1 -1
V -1 -4 -1 -1 -1 -3 -1 -3 -3 -1 -2 -1 -2 -2 -1 -4
H -1 -1 -4 -1 -3 -1 -3 -1 -3 -1 -2 -2 -1 -2 -1 -1
D -1 -1 -1 -4 -3 -1 -1 -3 -1 -3 -2 -2 -2 -1 -1 -1
N -2 -2 -2 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2
U -4 5 -4 -4 -4 1 -4 1 1 -4 -1 -4 -1 -1 -2 5
pairwiseAlignment(pattern = "GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACAACACTGTTGATGTTCCCAATTACATTGCTA",
subject = "GGGGAGGGCTCTATCTTAGG",
type = "global",
substitutionMatrix=EDNAFULL,
gapOpening=10,
gapExtension=0.5
)
Global PairwiseAlignmentsSingleSubject (1 of 1)
pattern: GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACAACACTGTTGATGTTCCCAATTACATTGCTA
subject: G----------------------GGGAGGGCTCTATCTTAGG--------------------------------------
score: 50
sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Stream 8
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so
locale:
[1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C LC_TIME=zh_CN.UTF-8
[4] LC_COLLATE=zh_CN.UTF-8 LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=zh_CN.UTF-8
[7] LC_PAPER=zh_CN.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.66.0 GenomeInfoDb_1.34.3 XVector_0.38.0 IRanges_2.32.0
[5] S4Vectors_0.36.0 BiocGenerics_0.44.0 readr_2.1.3 dplyr_1.0.10
loaded via a namespace (and not attached):
[1] pillar_1.8.1 compiler_4.2.2 bitops_1.0-7 tools_4.2.2
[5] zlibbioc_1.44.0 bit_4.0.5 lifecycle_1.0.3 tibble_3.1.8
[9] pkgconfig_2.0.3 rlang_1.0.6 DBI_1.1.3 cli_3.4.1
[13] rstudioapi_0.14 parallel_4.2.2 GenomeInfoDbData_1.2.9 withr_2.5.0
[17] generics_0.1.3 vctrs_0.5.1 hms_1.1.2 bit64_4.0.5
[21] tidyselect_1.2.0 glue_1.6.2 R6_2.5.1 fansi_1.0.3
[25] vroom_1.6.0 tzdb_0.3.0 magrittr_2.0.3 ellipsis_0.3.2
[29] assertthat_0.2.1 utf8_1.2.2 RCurl_1.98-1.9 crayon_1.5.2
When I use needle with the same sequences, I get:
#=======================================
#
# Aligned_sequences: 2
# 1:
# 2:
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 80
# Identity: 19/80 (23.8%)
# Similarity: 19/80 (23.8%)
# Gaps: 60/80 (75.0%)
# Score: 91.0
#
#
#=======================================
1 GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACA 50
.|||||||||||||||||||
1 ----------------------GGGGAGGGCTCTATCTTAGG-------- 20
51 ACACTGTTGATGTTCCCAATTACATTGCTA 80
20 ------------------------------ 20
#---------------------------------------
#---------------------------------------
Can anyone explain why the two results differ?
I am facing a similar issue.
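Both reported scores can be reproduced by hand from the alignments shown above, which may help narrow down where the two tools differ. The sketch below is base R arithmetic only, not a call into either tool. It rests on two assumptions that fit the numbers in this thread: that Biostrings charges gapOpening + n * gapExtension for a gap of n positions and penalises end gaps in a "global" alignment, and that needle, with its default settings, does not penalise end gaps. The variable and function names are made up for illustration.

```r
# Scores taken from the EDNAFULL matrix and the parameters in the question
match_score    <- 5    # A/A, T/T, G/G, C/C
mismatch_score <- -4   # e.g. T/G
gap_open       <- 10
gap_extend     <- 0.5

# Assumed Biostrings convention: a gap of n positions costs open + n * extend
gap_cost <- function(n) gap_open + n * gap_extend

# Biostrings alignment: 1 + 19 = 20 matches, gaps of 22 and 38 positions
biostrings_score <- 20 * match_score - gap_cost(22) - gap_cost(38)

# needle alignment: 1 mismatch + 19 matches, end gaps of 22 and 38
# positions assumed to be free
needle_score <- 19 * match_score + mismatch_score

# The needle alignment rescored with both end gaps penalised
needle_alignment_penalised <- needle_score - gap_cost(22) - gap_cost(38)

print(c(biostrings = biostrings_score,             # 50
        needle = needle_score,                     # 91
        needle_penalised = needle_alignment_penalised))  # 41
```

Under these assumptions the needle alignment would only score 41 once its end gaps are charged, so an aligner that penalises end gaps prefers the alignment Biostrings reports (score 50). If end-gap-free behaviour is what you want, the other type values of pairwiseAlignment ("overlap", "local-global") might come closer to needle, but I have not checked that they give exactly 91 here.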