Different results from EMBOSS needle and Biostrings pairwiseAlignment?
潘晓光 • 0
@717d0f02
China

When I use Biostrings pairwiseAlignment, I can't get the same result as EMBOSS needle, even with the same parameters (EDNAFULL matrix, gap opening 10, gap extension 0.5).

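library(readr)
library(dplyr)       # provides %>%
library(Biostrings)

# Read the EDNAFULL scoring matrix (tab-separated, first column named "index")
# and convert it to a numeric matrix with row names.
EDNAFULL <- read_tsv("/run/media/panxiaoguang/myLinux/Zebrafish/EDNAFULL", col_names = TRUE) %>%
  tibble::column_to_rownames(var = "index") %>%
  as.matrix()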

EDNAFULL

   A  T  G  C  S  W  R  Y  K  M  B  V  H  D  N  U
A  5 -4 -4 -4 -4  1  1 -4 -4  1 -4 -1 -1 -1 -2 -4
T -4  5 -4 -4 -4  1 -4  1  1 -4 -1 -4 -1 -1 -2  5
G -4 -4  5 -4  1 -4  1 -4  1 -4 -1 -1 -4 -1 -2 -4
C -4 -4 -4  5  1 -4 -4  1 -4  1 -1 -1 -1 -4 -2 -4
S -4 -4  1  1 -1 -4 -2 -2 -2 -2 -1 -1 -3 -3 -1 -4
W  1  1 -4 -4 -4 -1 -2 -2 -2 -2 -3 -3 -1 -1 -1  1
R  1 -4  1 -4 -2 -2 -1 -4 -2 -2 -3 -1 -3 -1 -1 -4
Y -4  1 -4  1 -2 -2 -4 -1 -2 -2 -1 -3 -1 -3 -1  1
K -4  1  1 -4 -2 -2 -2 -2 -1 -4 -1 -3 -3 -1 -1  1
M  1 -4 -4  1 -2 -2 -2 -2 -4 -1 -3 -1 -1 -3 -1 -4
B -4 -1 -1 -1 -1 -3 -3 -1 -1 -3 -1 -2 -2 -2 -1 -1
V -1 -4 -1 -1 -1 -3 -1 -3 -3 -1 -2 -1 -2 -2 -1 -4
H -1 -1 -4 -1 -3 -1 -3 -1 -3 -1 -2 -2 -1 -2 -1 -1
D -1 -1 -1 -4 -3 -1 -1 -3 -1 -3 -2 -2 -2 -1 -1 -1
N -2 -2 -2 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2
U -4  5 -4 -4 -4  1 -4  1  1 -4 -1 -4 -1 -1 -2  5

pairwiseAlignment(pattern = "GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACAACACTGTTGATGTTCCCAATTACATTGCTA", 
                  subject = "GGGGAGGGCTCTATCTTAGG",
                  type = "global",
                  substitutionMatrix=EDNAFULL,
                  gapOpening=10,
                  gapExtension=0.5
)

Global PairwiseAlignmentsSingleSubject (1 of 1)
pattern: GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACAACACTGTTGATGTTCCCAATTACATTGCTA
subject: G----------------------GGGAGGGCTCTATCTTAGG--------------------------------------
score: 50 
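
The score of 50 can be reproduced by hand from the alignment printed above. The following is a rough sketch in base R, assuming that Biostrings charges gapOpening + n * gapExtension for a gap of length n and that the trailing end gap is penalized under type = "global". The counts (20 matches, gaps of 22 and 38 columns) are read off the printed alignment, and the variable names are only illustrative.

```r
# Counts read off the pairwiseAlignment output above (illustrative names).
match_score   <- 5          # EDNAFULL score for an identical base
n_matches     <- 20         # leading G plus the 19-base block GGGAGGGCTCTATCTTAGG
gap_lengths   <- c(22, 38)  # gap after the leading G, and the trailing gap
gap_opening   <- 10
gap_extension <- 0.5

# Assumption: a gap of length n costs gap_opening + n * gap_extension,
# and the trailing (end) gap is charged like any other gap.
gap_cost <- sum(gap_opening + gap_lengths * gap_extension)

biostrings_score <- n_matches * match_score - gap_cost
print(biostrings_score)
```

Under these assumptions the gaps cost 21 and 29, so the total is 100 - 50 = 50, which matches the reported score.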

sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Stream 8

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so

locale:
 [1] LC_CTYPE=zh_CN.UTF-8       LC_NUMERIC=C               LC_TIME=zh_CN.UTF-8       
 [4] LC_COLLATE=zh_CN.UTF-8     LC_MONETARY=zh_CN.UTF-8    LC_MESSAGES=zh_CN.UTF-8   
 [7] LC_PAPER=zh_CN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Biostrings_2.66.0   GenomeInfoDb_1.34.3 XVector_0.38.0      IRanges_2.32.0     
[5] S4Vectors_0.36.0    BiocGenerics_0.44.0 readr_2.1.3         dplyr_1.0.10       

loaded via a namespace (and not attached):
 [1] pillar_1.8.1           compiler_4.2.2         bitops_1.0-7           tools_4.2.2           
 [5] zlibbioc_1.44.0        bit_4.0.5              lifecycle_1.0.3        tibble_3.1.8          
 [9] pkgconfig_2.0.3        rlang_1.0.6            DBI_1.1.3              cli_3.4.1             
[13] rstudioapi_0.14        parallel_4.2.2         GenomeInfoDbData_1.2.9 withr_2.5.0           
[17] generics_0.1.3         vctrs_0.5.1            hms_1.1.2              bit64_4.0.5           
[21] tidyselect_1.2.0       glue_1.6.2             R6_2.5.1               fansi_1.0.3           
[25] vroom_1.6.0            tzdb_0.3.0             magrittr_2.0.3         ellipsis_0.3.2        
[29] assertthat_0.2.1       utf8_1.2.2             RCurl_1.98-1.9         crayon_1.5.2

When I use needle with the same sequences and parameters, I get:

#=======================================
#
# Aligned_sequences: 2
# 1: 
# 2: 
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 80
# Identity:      19/80 (23.8%)
# Similarity:    19/80 (23.8%)
# Gaps:          60/80 (75.0%)
# Score: 91.0
# 
#
#=======================================

                   1 GATGAACAGCAACTTTTATGAATGGGAGGGCTCTATCTTAGGGGGCAACA     50
                                           .|||||||||||||||||||        
                   1 ----------------------GGGGAGGGCTCTATCTTAGG--------     20

                  51 ACACTGTTGATGTTCCCAATTACATTGCTA     80

                  20 ------------------------------     20


#---------------------------------------
#---------------------------------------
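
The needle score of 91 is consistent with end gaps costing nothing, which is needle's default behavior (end gap penalties are applied only when -endweight is set). A rough check in base R, with counts read from the needle report above and illustrative variable names:

```r
# Counts read off the needle report above (illustrative names).
match_score    <- 5
mismatch_score <- -4   # EDNAFULL score for T vs G
n_matches      <- 19   # "Identity: 19/80"
n_mismatches   <- 1    # the "." column (pattern T vs subject G)

# Assumption: both gaps (22 leading and 38 trailing columns) are end gaps
# and are not penalized, as with needle's default settings.
end_gap_cost <- 0

needle_score <- n_matches * match_score +
  n_mismatches * mismatch_score -
  end_gap_cost
print(needle_score)
```

This gives 95 - 4 = 91, the score needle reports. It suggests the two tools are scoring end gaps differently rather than disagreeing on the matrix; whether a Biostrings alignment type without end gap penalties reproduces needle's alignment exactly is not verified here.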

Can any professors answer this?
