Question: Calculate the number of SNP differences between sequences in multiple alignment
0
2.3 years ago by
European Union
lordbleys0 wrote:

Hi :)

This seems like a simple enough question but I can't find a straight answer...

I have a fasta alignment of 65 sequences, all of them exactly the same length (roughly 40 000 bp long).

The fasta alignment looks like:
>seq1
AAATTTCCCGGG
>seq2
CAATTTCCCGGG
>seq3
CAAGTTCCCGGG

The output I am looking for is the exact number of dissimilarities between each pair of sequence (Hamming distance):

seq1 seq2 seq3
seq1   0
seq2   1     0
seq3   2      1    0
etc...

I have tried importing the alignment with seqinr and using the function stringDist but it doesn't work:

snpcorealn <- read.alignment(file = "core_SNPs_matrix.fasta", format= 'fasta')
stringDist(snpcorealn, method = "hamming")

Error in .Call2("new_XStringSet_from_CHARACTER", ans_class, ans_elementType,  :
key 54 (char '6') not in lookup table

I have tried importing the alignment with Biostrings, but it doesn't work either:

snpcorealn2 <- readDNAMultipleAlignment(filepath = system.file("core_SNPs_matrix.fasta", package="Biostrings"), format = "fasta")

Error in XStringSet("DNA", x, start = start, end = end, width = width,  :
erreur d'évaluation de l'argument 'x' lors de la sélection d'une méthode pour la fonction 'XStringSet' : Error in .Call2("new_input_filexp", filepath, PACKAGE = "XVector") :
cannot open file ''

I am running out of ideas... If anyone can point me in the right direction, I don't even know why these methods are not working.

modified 2.3 years ago • written 2.3 years ago by lordbleys0
Answer: Calculate the number of SNP differences between sequences in multiple alignment
1
2.3 years ago by
Martin Morgan ♦♦ 23k
United States
Martin Morgan ♦♦ 23k wrote:

Looks like you're mixing packages -- read.alignment() is from seqinr (not a Bioconductor package...) while stringDist() is from Biostrings. So sticking with Biostrings

library(Biostrings)
stringDist(dna, method="hamming")
Answer: Calculate the number of SNP differences between sequences in multiple alignment
0
2.3 years ago by
European Union
lordbleys0 wrote:

Thank you, that worked perfectly !