suggested Biostrings library function for printing an alignment
0
0
Entering edit mode
@coghlan-avril-3810
Last seen 10.3 years ago
Dear Bioconductor users and developers, I am using the Biostrings library for creating alignments, especially the pairwiseAlignment() function, which is very useful. I would like to suggest a new function for the Biostrings library, for printing out a pairwise alignment in blocks. I have looked, and there doesn't seem to be such a function at present in Biostrings or any other R library (I have tried to find one). Here is my suggested function: > printPairwiseAlignment <- function(alignment, chunksize) { seq1aln <- pattern(alignment) # Get the alignment for the first sequence seq2aln <- subject(alignment) # Get the alignment for the second sequence alnlen <- nchar(seq1aln) # Find the number of columns in the alignment starts <- seq(1, alnlen, by=chunksize) n <- length(starts) seq1alnresidues <- 0 seq2alnresidues <- 0 for (i in 1:n) { chunkseq1aln <- substring(seq1aln, starts[i], starts[i]+chunksize-1) chunkseq2aln <- substring(seq2aln, starts[i], starts[i]+chunksize-1) # Find out how many gaps there are in chunkseq1aln: gaps1 <- countPattern("-",chunkseq1aln) # countPattern() is from Biostrings library # Find out how many gaps there have been on this line of subject3: gaps2 <- countPattern("-",chunkseq2aln) # countPattern() is from Biostrings library # Calculate how many residues of the first sequence we have printed so far in the alignment: seq1alnresidues <- seq1alnresidues + chunksize - gaps1 # Calculate how many residues of the second sequence we have printed so far in the alignment: seq2alnresidues <- seq2alnresidues + chunksize - gaps2 print(paste(chunkseq1aln,seq1alnresidues)) print(paste(chunkseq2aln,seq2alnresidues)) print(paste(' ')) } } You can then use the Biostrings library to create an alignment of two sequences, and print it out in blocks of length 'chunksize'. For example, here is an example, for printing out an alignment in blocks of length 30 residues: > s1 <- "MSHPALTQLRALRYCKEIPALDPQLLDWLLLEDSMTKRFGKTVSVTMIREGFVEQNE" > s2 <- "MSHPALTQLRALRYFDAIPALEPHLLDWLLLEDSVTKRFEQQGKRVSVTLIREAFVGQSE" > aln <- pairwiseAlignment(s1,s2,substitutionMatrix="BLOSUM50") > printPairwiseAlignment(aln,20) [1] "MSHPALTQLRALRYCKEIPALDPQLLDWLL 30" [1] "MSHPALTQLRALRYFDAIPALEPHLLDWLL 30" [1] " " [1] "LEDSMTKRF---GKTVSVTMIREGFVEQNE 57" [1] "LEDSVTKRFEQQGKRVSVTLIREAFVGQSE 60" [1] " " Does this seem a good idea to you all? I am relatively new to R so I am sure that somebody else could improve my function to make it nicer. If you think it's a good idea to make such a function, please could somebody add it to the Biostrings library (I don't know how to do this)? Regards, Avril Avril Coghlan University College Cork Ireland
Alignment Biostrings Alignment Biostrings • 1.3k views
ADD COMMENT

Login before adding your answer.

Traffic: 507 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6