Text display of entire `DNAMultipleAlignment` objects
@charles-plessy-7857
Japan

I have a `DNAMultipleAlignment` object. In R, the `show` method displays the head and tail of the aligned sequences, with nice colors, and I can copy and paste it as plain text, as shown below.

> seqs_aln
DNAMultipleAlignment with 5 rows and 230 columns
     aln                                                                                                   names               
[1] -----ATTGTTTAGTCTGGGGTTTACAGAAGAGTTGTTTCTTCATGGAA...TTTCGCCGAAAAATCAAG------------------GTTCCATTGAGT- Oki
[2] -----ATTGTTTAGTCTGCTG-TCACAGTAGAGTTGTTGCTTCATGGAA...TTTCGCCGAATAATCACTCAATTGAAAGACG-----GTTCCATTGAGT- Osa
[3] -----ATTGTTTAGTCTGCTG-TCACAGTAGAGTTGTTGCTTCATGGAA...ATTCGCCGAATAATCACTCAATTGAAAGACGCGATTGTTCCATTGAGT- Aom
[4] ---TTATTGTTTAGTCTGCAG-TCACAGTAGAGTTGTTGCTTCATGGAA...TTTCGCCGAATAATCACTGAATCGC-----------GTTCCATCGAATG Bar
[5] AGATTATTGTTTAGTCTGCAG-TCACAGTAGAGTTGTTGCTTCATGGAA...TTTCGCCGAATAATCACTGAATCGG-----------GTTCCATCGAATG Nor

This is neat, but I would also like to output the full alignment in my _RMarkdown_ notebook. I wonder why there is no equivalent of writePairwiseAlignments() for multiple sequence alignments?

After inspecting selectMethod("detail", "DNAMultipleAlignment"), I found that I can use a slightly customised version of Biostrings:::.write.MultAlign to display the whole alignment:

> # Almost identical to Biostrings:::.write.MultAlign, except that it returns the lines instead of writing them to a file
> showMultiAlign <- function (x, invertColMask = FALSE, showRowNames = TRUE, hideMaskedCols = TRUE) 
+ {
+         msk <- colmask(x)
+         dims <- dim(x)
+         if (invertColMask == FALSE) {
+             msk <- gaps(msk, start = 1, end = dims[2])
+         }
+         if (hideMaskedCols) {
+             hasMask <- FALSE
+         }
+         else {
+             colmask(x) <- NULL
+             if (length(msk) > 0) {
+                 hasMask <- TRUE
+             }
+             else {
+                 hasMask <- FALSE
+             }
+         }
+         if (hasMask) {
+             dims[1] <- dims[1] + 1
+         }
+         ch <- as.character(x)
+         ch <- unlist(lapply(ch, Biostrings:::.insertSpaces))
+         if (hasMask) {
+             mskInd <- as.integer(msk)
+             mskCh <- paste(as.character(replace(rep(1, dim(x)[2]), 
+                 mskInd, 0)), collapse = "")
+             mskCh <- Biostrings:::.insertSpaces(mskCh)
+         }
+         names <- names(ch)
+         ch <- sapply(ch, Biostrings:::.strChop, chopsize = 55, simplify = FALSE)
+         if (hasMask) {
+             mskCh <- Biostrings:::.strChop(mskCh, chopsize = 55)
+             ch <- c(list(Mask = mskCh), ch)
+         }
+         maxLen <- max(nchar(names(ch)))
+         stockSpc <- paste(rep(" ", maxLen), collapse = "")
+         bufferSpacing <- function(name) {
+             spc <- paste(rep(" ", maxLen - nchar(name)), collapse = "")
+             paste(name, spc, sep = "")
+         }
+         ch <- c(ch, list(rep("", length(ch[[1]]))))
+         output <- character(length(ch[[1]]) * length(ch))
+         for (i in seq_len(length(ch[[1]]))) {
+             for (j in seq_len(length(ch))) {
+                 if (i == 1) {
+                   output[j] <- paste(unlist(lapply(names(ch[j]), 
+                     bufferSpacing)), "   ", ch[[j]][i], sep = "")
+                 }
+                 else {
+                   if (showRowNames) {
+                     output[(length(ch) * (i - 1)) + j] <- paste(unlist(lapply(names(ch[j]), 
+                       bufferSpacing)), "   ", ch[[j]][i], sep = "")
+                   }
+                   else {
+                     output[(length(ch) * (i - 1)) + j] <- paste(stockSpc, 
+                       "   ", ch[[j]][i], sep = "")
+                   }
+                 }
+             }
+         }
+         output <- gsub("\\s+$", "", output)
+         output <- output[-length(output)]  # drop the trailing blank separator line
+         if (hasMask) {
+             output <- c(paste("", paste(c(dims, ""), collapse = " "), 
+                 collapse = " "), output)
+         }
+         else {
+             output <- c(paste("", paste(dims, collapse = " "), 
+                 collapse = " "), output)
+         }
+         output
+ }
> seqs_aln |> showMultiAlign()
 [1] " 5 230"                                                      
 [2] "Oki   -----ATTGT TTAGTCTGGG GTTTACAGAA GAGTTGTTTC TTCATGGAAC"
 [3] "Osa   -----ATTGT TTAGTCTGCT G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
 [4] "Aom   -----ATTGT TTAGTCTGCT G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
 [5] "Bar   ---TTATTGT TTAGTCTGCA G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
 [6] "Nor   AGATTATTGT TTAGTCTGCA G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
 [7] ""                                                            
 [8] "Oki   CGGGCTGCTG TAAATCAGAG GACTCGGTCT CGGCCGAGCG CGCGCTGACC"
 [9] "Osa   CAGCCCGCAG TAAATCAGAA GAGTCAGTCG CGGACGAGTG CGCGCTGACC"
[10] "Aom   CAGCCCGCAG TAAATCAGAA GAGTCAGTCG CGGACGAGTG CGCGCTGACC"
[11] "Bar   CAGCCCGCAG TAAATCAGAA GACTAGGTCG CGGGCGACTG CGTGCTGACC"
[12] "Nor   CAGCCCGCAG TAAATCAGAA GACTCGGTCG CGGGCGACTG CGTGCTGACC"
[13] ""                                                            
[14] "Oki   TCTAAAGCGG CGGCCCTAGC GCGCATCGTC AAGCGCAAAT TTTTCAGAAT"
[15] "Osa   TCTAAAGCGG CGGCCCCAAC GCGCATCGTC AAGCGCAACA TTTTCAGAAT"
[16] "Aom   TCTAAAGCGG CGGCCCCAAC GCGCATCGTC AAGCGCAACA TTTTCAGAAT"
[17] "Bar   TCTAAAGCGG CGGCCTCAGC GCGCACCGTC GAGCGCAACA TTTTCAGAAT"
[18] "Nor   TCTAAAGCGG CGGCCTCAGC GCGCACCGTC AAGCGCAACA TTTTCAGAAT"
[19] ""                                                            
[20] "Oki   TGAGCTTCCA AAATCAAACA CCAGAAACGC CTTTCGCCGA AAAATCAAG-"
[21] "Osa   TGAGCTTCTG AAATCAAAAA GCAGAAACA- ATTTCGCCGA ATAATCACTC"
[22] "Aom   TGAGCTTCTG AAATCAAAAA GCAGAAACA- AATTCGCCGA ATAATCACTC"
[23] "Bar   TGAGCTTCTG AAATCAAAAA GCAGAAACA- TTTTCGCCGA ATAATCACTG"
[24] "Nor   TGAGCTTCTG AAATCAAAAA GCAGAAACA- TTTTCGCCGA ATAATCACTG"
[25] ""                                                            
[26] "Oki   ---------- -------GTT CCATTGAGT-"                      
[27] "Osa   AATTGAAAGA CG-----GTT CCATTGAGT-"                      
[28] "Aom   AATTGAAAGA CGCGATTGTT CCATTGAGT-"                      
[29] "Bar   AATCGC---- -------GTT CCATCGAATG"                      
[30] "Nor   AATCGG---- -------GTT CCATCGAATG"
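
For the notebook use case, printing the result with `cat(showMultiAlign(seqs_aln), sep = "\n")` or `writeLines()` would avoid the `[1]` indices and quotes. A possible alternative that does not depend on private `Biostrings:::` functions is sketched below in base R. It assumes that `as.character()` on the alignment returns a named character vector of equal-width strings (which the code above also relies on); `formatAlignmentBlocks` is a hypothetical name, not a Biostrings function, and the toy input stands in for the real object.

```r
# Hypothetical helper (not part of Biostrings): format a named character
# vector of equal-width aligned sequences as interleaved blocks.
formatAlignmentBlocks <- function(seqs, width = 50) {
    stopifnot(is.character(seqs), length(seqs) > 0,
              !is.null(names(seqs)),
              length(unique(nchar(seqs))) == 1)
    # format() pads the names to a common width so the columns line up.
    labels <- format(names(seqs))
    # Start position of each block of `width` alignment columns.
    starts <- seq(1, nchar(seqs[[1]]), by = width)
    blocks <- lapply(starts, function(s) {
        chunk <- substr(seqs, s, s + width - 1)
        # One line per sequence, then an empty line separating the blocks.
        c(paste(labels, chunk, sep = "   "), "")
    })
    out <- unlist(blocks, use.names = FALSE)
    out[-length(out)]  # drop the trailing empty separator
}

# Toy input standing in for as.character(seqs_aln).
toy <- c(Oki = "--ATTGTTTAGT", Osa = "TTATTGTTTAG-")
writeLines(formatAlignmentBlocks(toy, width = 5))
```

With the real object, the call would presumably be `writeLines(formatAlignmentBlocks(as.character(seqs_aln)))`, and passing a file path as the second argument of `writeLines()` would write the same lines to a file.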

My questions are:

  • Is there an easy way to get it in color?
  • Can future versions of _Biostrings_ provide a writeMultipleAlignment function?
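
To illustrate what I mean by color: for HTML output only, one workaround I can imagine is to wrap each nucleotide in a colored `<span>` and emit the lines from a chunk with `results = "asis"`. This is only a sketch under assumptions: `colorizeDNA` is a hypothetical helper, the palette is arbitrary (not the one used by the `show` method), and the toy vector stands in for `as.character(seqs_aln)`.

```r
# Hypothetical helper: wrap each nucleotide of one sequence string in an
# HTML <span>. The palette is an arbitrary choice for illustration.
colorizeDNA <- function(seq,
                        palette = c(A = "#d62728", C = "#2ca02c",
                                    G = "#1f77b4", T = "#ff7f0e")) {
    chars <- strsplit(seq, "", fixed = TRUE)[[1]]
    # Only letters present in the palette are colored; gaps stay as-is.
    hit <- chars %in% names(palette)
    chars[hit] <- sprintf('<span style="color:%s">%s</span>',
                          palette[chars[hit]], chars[hit])
    paste(chars, collapse = "")
}

# Toy input standing in for as.character(seqs_aln).
toy <- c(Oki = "--ATTG", Osa = "TTAT-G")
# Color only the sequence part, so letters in the row names are untouched.
html <- paste0(format(names(toy)), "   ",
               vapply(toy, colorizeDNA, character(1)))
# In an RMarkdown chunk with results = "asis", this emits raw HTML.
cat("<pre>", html, "</pre>", sep = "\n")
```

This would not help for PDF or plain-text output, so a built-in solution would still be preferable.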