I have 15 protein sequences of 99 amino acids each. After some looking around, I have found that there are several ways to read sequences into R and do pairwise or multiple alignments (e.g. seqinr and Biostrings). However, I do not know how to probe changes at specific positions. For instance, I would like to know the best way to align a standard sequence with several mutant sequences and check each amino acid position for a mismatch with the standard sequence. In other words, given seq1 = "standard amino acid seq" and seq2 = "mutant seq", I want to align the two and then ask R whether there is a change at position 10, or 11, or 12, and so on, with R reporting (for example) TRUE or FALSE for each position. All the sequences with TRUE for a change at position X could then be grouped against those with no change at that position. Is it possible to index a single position, as with a normal vector in R? Can I use the basic == operator to make the comparison?
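A minimal base-R sketch of the kind of position-wise check described above. It assumes the sequences are all the same length and differ only by substitutions, so that position i in one sequence corresponds to position i in the other without a separate alignment step. If insertions or deletions were present, a real alignment (for example with Biostrings) would be needed first. The sequences and variable names here are short hypothetical stand-ins, not real data.

```r
# Toy sequences (hypothetical; the real ones would be 99 residues long).
standard <- "PQITLWQRPL"
mutant   <- "PQVTLWKRPL"

# Split each string into a character vector with one residue per element.
std_res <- strsplit(standard, "")[[1]]
mut_res <- strsplit(mutant, "")[[1]]

# Element-wise comparison: TRUE where the mutant differs from the standard.
changed <- mut_res != std_res

changed[3]       # is there a change at position 3?
which(changed)   # every position that differs from the standard
```

Once a sequence is split into single characters it behaves like any other R vector, so ordinary indexing and the `==` / `!=` operators apply directly.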
A lofty goal for me would be to write a program that loops through the aligned positions and sorts the mutants into y/n or T/F groups based on the presence of a change at each position.
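One way such a loop might look, again only as a sketch in base R under the assumption of equal-length, gap-free sequences. The mutant names (`m1` to `m4`), the toy sequences, and the object names `mut_mat`, `changed`, and `groups` are all hypothetical and chosen just for illustration.

```r
# Toy data (hypothetical): one standard sequence and four named mutants.
standard <- "PQITLWQRPL"
mutants  <- c(m1 = "PQVTLWQRPL", m2 = "PQITLWKRPL",
              m3 = "PQVTLWKRPL", m4 = "PQITLWQRPL")

std_res <- strsplit(standard, "")[[1]]

# Character matrix: one row per mutant, one column per position.
mut_mat <- do.call(rbind, strsplit(mutants, ""))

# Repeat the standard residues on every row, then compare cell by cell.
std_mat <- matrix(std_res, nrow = nrow(mut_mat), ncol = ncol(mut_mat),
                  byrow = TRUE)
changed <- mut_mat != std_mat   # TRUE where a mutant differs from standard

# For each position, split the mutant names into FALSE / TRUE groups.
groups <- lapply(seq_along(std_res), function(pos) {
  split(names(mutants), factor(changed[, pos], levels = c(FALSE, TRUE)))
})

groups[[3]]   # mutants without / with a change at position 3
```

Each element of `groups` is a list with a `FALSE` entry and a `TRUE` entry, so `groups[[3]][["TRUE"]]` gives the names of the mutants changed at position 3. The logical columns of `changed` could equally be used directly as grouping variables in later tests.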
I'm not even sure that R is the best way to do this, but it's the only language I'm somewhat familiar with. I hope this makes sense. Any help will be appreciated.
The reason I want to do this in R is that, once the mutants are sorted into y/n groups based on the presence of a change at a certain position, I would like to carry out some statistical tests comparing the values of the corresponding computed principal components between the groups.