Different results from EMBOSS needle and Biostrings pairwiseAlignment?
潘晓光 • 0
China

When I use Biostrings pairwiseAlignment, I can't get the same result as EMBOSS needle, even though I use the same substitution matrix and gap parameters.

library(readr)
library(dplyr)
library(Biostrings)

EDNAFULL <- read_tsv("/run/media/panxiaoguang/myLinux/Zebrafish/EDNAFULL", col_names = TRUE) %>%
  tibble::column_to_rownames(var = "index") %>%
  as.matrix()

EDNAFULL

   A  T  G  C  S  W  R  Y  K  M  B  V  H  D  N  U
A  5 -4 -4 -4 -4  1  1 -4 -4  1 -4 -1 -1 -1 -2 -4
T -4  5 -4 -4 -4  1 -4  1  1 -4 -1 -4 -1 -1 -2  5
G -4 -4  5 -4  1 -4  1 -4  1 -4 -1 -1 -4 -1 -2 -4
C -4 -4 -4  5  1 -4 -4  1 -4  1 -1 -1 -1 -4 -2 -4
S -4 -4  1  1 -1 -4 -2 -2 -2 -2 -1 -1 -3 -3 -1 -4
W  1  1 -4 -4 -4 -1 -2 -2 -2 -2 -3 -3 -1 -1 -1  1
R  1 -4  1 -4 -2 -2 -1 -4 -2 -2 -3 -1 -3 -1 -1 -4
Y -4  1 -4  1 -2 -2 -4 -1 -2 -2 -1 -3 -1 -3 -1  1
K -4  1  1 -4 -2 -2 -2 -2 -1 -4 -1 -3 -3 -1 -1  1
M  1 -4 -4  1 -2 -2 -2 -2 -4 -1 -3 -1 -1 -3 -1 -4
B -4 -1 -1 -1 -1 -3 -3 -1 -1 -3 -1 -2 -2 -2 -1 -1
V -1 -4 -1 -1 -1 -3 -1 -3 -3 -1 -2 -1 -2 -2 -1 -4
H -1 -1 -4 -1 -3 -1 -3 -1 -3 -1 -2 -2 -1 -2 -1 -1
D -1 -1 -1 -4 -3 -1 -1 -3 -1 -3 -2 -2 -2 -1 -1 -1
N -2 -2 -2 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2
U -4  5 -4 -4 -4  1 -4  1  1 -4 -1 -4 -1 -1 -2  5

pairwiseAlignment(pattern = "GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACAACACTGTTGATGTTCCCAATTACATTGCTA", 
                  subject = "GGGGAGGGCTCTATCTTAGG",
                  type = "global",
                  substitutionMatrix=EDNAFULL,
                  gapOpening=10,
                  gapExtension=0.5
)

Global PairwiseAlignmentsSingleSubject (1 of 1)
pattern: GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACAACACTGTTGATGTTCCCAATTACATTGCTA
subject: G----------------------GGGAGGGCTCTATCTTAGG--------------------------------------
score: 50 

sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Stream 8

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so

locale:
 [1] LC_CTYPE=zh_CN.UTF-8       LC_NUMERIC=C               LC_TIME=zh_CN.UTF-8       
 [4] LC_COLLATE=zh_CN.UTF-8     LC_MONETARY=zh_CN.UTF-8    LC_MESSAGES=zh_CN.UTF-8   
 [7] LC_PAPER=zh_CN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Biostrings_2.66.0   GenomeInfoDb_1.34.3 XVector_0.38.0      IRanges_2.32.0     
[5] S4Vectors_0.36.0    BiocGenerics_0.44.0 readr_2.1.3         dplyr_1.0.10       

loaded via a namespace (and not attached):
 [1] pillar_1.8.1           compiler_4.2.2         bitops_1.0-7           tools_4.2.2           
 [5] zlibbioc_1.44.0        bit_4.0.5              lifecycle_1.0.3        tibble_3.1.8          
 [9] pkgconfig_2.0.3        rlang_1.0.6            DBI_1.1.3              cli_3.4.1             
[13] rstudioapi_0.14        parallel_4.2.2         GenomeInfoDbData_1.2.9 withr_2.5.0           
[17] generics_0.1.3         vctrs_0.5.1            hms_1.1.2              bit64_4.0.5           
[21] tidyselect_1.2.0       glue_1.6.2             R6_2.5.1               fansi_1.0.3           
[25] vroom_1.6.0            tzdb_0.3.0             magrittr_2.0.3         ellipsis_0.3.2        
[29] assertthat_0.2.1       utf8_1.2.2             RCurl_1.98-1.9         crayon_1.5.2

When I use needle, I get:

#=======================================
#
# Aligned_sequences: 2
# 1: 
# 2: 
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 80
# Identity:      19/80 (23.8%)
# Similarity:    19/80 (23.8%)
# Gaps:          60/80 (75.0%)
# Score: 91.0
# 
#
#=======================================

                   1 GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACA     50
                                           .|||||||||||||||||||        
                   1 ----------------------GGGGAGGGCTCTATCTTAGG--------     20

                  51 ACACTGTTGATGTTCCCAATTACATTGCTA     80

                  20 ------------------------------     20


#---------------------------------------
#---------------------------------------
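
As a sanity check on the two scores above, here is a minimal sketch in base R that recomputes them by hand from the alignments shown. It rests on two assumptions that are not stated in the outputs themselves: that Biostrings charges a gap of length n as gapOpening + n * gapExtension and penalizes end gaps in a "global" alignment, and that needle leaves end gaps unpenalized by default. The variable and function names are only illustrative.

```r
# Base R only; this is score arithmetic, not a call into Biostrings or EMBOSS.
match_score    <- 5    # EDNAFULL value for identical A/T/G/C
mismatch_score <- -4   # EDNAFULL value for differing A/T/G/C
gap_open       <- 10
gap_extend     <- 0.5

# Assumption: a gap of length n costs gap_open + n * gap_extend.
gap_cost <- function(n) gap_open + n * gap_extend

# Biostrings alignment above: 20 matches, one gap of 22 after the
# leading G, and one trailing gap of 38 (assumed to be penalized).
biostrings_score <- 20 * match_score - gap_cost(22) - gap_cost(38)

# needle alignment above: 19 matches and 1 mismatch (T vs G),
# with both end gaps assumed to be free.
needle_score <- 19 * match_score + 1 * mismatch_score

print(biostrings_score)  # 50
print(needle_score)      # 91
```

Under those assumptions the arithmetic reproduces both reported scores, which suggests the two tools differ in how they treat end gaps rather than in the matrix or gap values. If that reading is right, needle's -endweight option or a different pairwiseAlignment type (for example "overlap") may bring the results closer, though I have not verified either.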
Alignment Biostrings • 984 views

Can any professors answer this?


I am facing a similar issue.
