I have a DNAMultipleAlignment object. In R, the show function displays the head and tail of the aligned sequences, with nice colors, and I can copy and paste it as plain text, as shown below.
> seqs_aln
DNAMultipleAlignment with 5 rows and 230 columns
aln names
[1] -----ATTGTTTAGTCTGGGGTTTACAGAAGAGTTGTTTCTTCATGGAA...TTTCGCCGAAAAATCAAG------------------GTTCCATTGAGT- Oki
[2] -----ATTGTTTAGTCTGCTG-TCACAGTAGAGTTGTTGCTTCATGGAA...TTTCGCCGAATAATCACTCAATTGAAAGACG-----GTTCCATTGAGT- Osa
[3] -----ATTGTTTAGTCTGCTG-TCACAGTAGAGTTGTTGCTTCATGGAA...ATTCGCCGAATAATCACTCAATTGAAAGACGCGATTGTTCCATTGAGT- Aom
[4] ---TTATTGTTTAGTCTGCAG-TCACAGTAGAGTTGTTGCTTCATGGAA...TTTCGCCGAATAATCACTGAATCGC-----------GTTCCATCGAATG Bar
[5] AGATTATTGTTTAGTCTGCAG-TCACAGTAGAGTTGTTGCTTCATGGAA...TTTCGCCGAATAATCACTGAATCGG-----------GTTCCATCGAATG Nor
This is neat, but I would also like to output the full alignment in my _RMarkdown_ notebook. I wonder why there is no equivalent of writePairwiseAlignments() for multiple sequence alignments?
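For comparison, Biostrings does export write.phylip() for MultipleAlignment objects, which writes the whole alignment to a file in PHYLIP format. The sketch below is only an illustration under assumptions: toy_aln and tmp are hypothetical names, the two short sequences are made up for the example, and the call is skipped if Biostrings is not installed. It writes a toy alignment to a temporary file and reads the text back.

```r
# Sketch: round-trip a toy alignment through write.phylip() and read it back.
# toy_aln and tmp are hypothetical names; the sequences are invented for illustration.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  toy_aln <- Biostrings::DNAMultipleAlignment(
    c(Oki = "--ATTGTTTAGT", Osa = "TTATTGTTTAGT")
  )
  tmp <- tempfile(fileext = ".phy")
  Biostrings::write.phylip(toy_aln, tmp)  # writes to a file, not to the console
  phylip_lines <- readLines(tmp)          # plain character vector, one element per line
  writeLines(phylip_lines)
}
```

This goes through a file, so it is less convenient in a notebook than a function that returns the lines directly.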
After inspecting selectMethod("detail", "DNAMultipleAlignment"), I found that I can use a slightly customised version of Biostrings:::.write.MultAlign to display the full alignment:
> # Almost identical to Biostrings:::.write.MultAlign, except that it returns the lines instead of writing them to a file
> showMultiAlign <- function (x, invertColMask = FALSE, showRowNames = TRUE, hideMaskedCols = TRUE)
+ {
+ msk <- colmask(x)
+ dims <- dim(x)
+ if (invertColMask == FALSE) {
+ msk <- gaps(msk, start = 1, end = dims[2])
+ }
+ if (hideMaskedCols) {
+ hasMask <- FALSE
+ }
+ else {
+ colmask(x) <- NULL
+ if (length(msk) > 0) {
+ hasMask <- TRUE
+ }
+ else {
+ hasMask <- FALSE
+ }
+ }
+ if (hasMask) {
+ dims[1] <- dims[1] + 1
+ }
+ ch <- as.character(x)
+ ch <- unlist(lapply(ch, Biostrings:::.insertSpaces))
+ if (hasMask) {
+ mskInd <- as.integer(msk)
+ mskCh <- paste(as.character(replace(rep(1, dim(x)[2]),
+ mskInd, 0)), collapse = "")
+ mskCh <- Biostrings:::.insertSpaces(mskCh)
+ }
+ names <- names(ch)
+ ch <- sapply(ch, Biostrings:::.strChop, chopsize = 55, simplify = FALSE)
+ if (hasMask) {
+ mskCh <- Biostrings:::.strChop(mskCh, chopsize = 55)
+ ch <- c(list(Mask = mskCh), ch)
+ }
+ maxLen <- max(nchar(names(ch)))
+ stockSpc <- paste(rep(" ", maxLen), collapse = "")
+ bufferSpacing <- function(name) {
+ spc <- paste(rep(" ", maxLen - nchar(name)), collapse = "")
+ paste(name, spc, sep = "")
+ }
+ ch <- c(ch, list(rep("", length(ch[[1]]))))
+ output <- character(length(ch[[1]]) * length(ch))
+ for (i in seq_len(length(ch[[1]]))) {
+ for (j in seq_len(length(ch))) {
+ if (i == 1) {
+ output[j] <- paste(unlist(lapply(names(ch[j]),
+ bufferSpacing)), " ", ch[[j]][i], sep = "")
+ }
+ else {
+ if (showRowNames) {
+ output[(length(ch) * (i - 1)) + j] <- paste(unlist(lapply(names(ch[j]),
+ bufferSpacing)), " ", ch[[j]][i], sep = "")
+ }
+ else {
+ output[(length(ch) * (i - 1)) + j] <- paste(stockSpc,
+ " ", ch[[j]][i], sep = "")
+ }
+ }
+ }
+ }
+ output <- gsub("\\s+$", "", output)
+ output <- output[seq_len(length(output) - 1)]
+ if (hasMask) {
+ output <- c(paste("", paste(c(dims, ""), collapse = " "),
+ collapse = " "), output)
+ }
+ else {
+ output <- c(paste("", paste(dims, collapse = " "),
+ collapse = " "), output)
+ }
+ output
+ }
> seqs_aln |> showMultiAlign()
[1] " 5 230"
[2] "Oki -----ATTGT TTAGTCTGGG GTTTACAGAA GAGTTGTTTC TTCATGGAAC"
[3] "Osa -----ATTGT TTAGTCTGCT G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
[4] "Aom -----ATTGT TTAGTCTGCT G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
[5] "Bar ---TTATTGT TTAGTCTGCA G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
[6] "Nor AGATTATTGT TTAGTCTGCA G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
[7] ""
[8] "Oki CGGGCTGCTG TAAATCAGAG GACTCGGTCT CGGCCGAGCG CGCGCTGACC"
[9] "Osa CAGCCCGCAG TAAATCAGAA GAGTCAGTCG CGGACGAGTG CGCGCTGACC"
[10] "Aom CAGCCCGCAG TAAATCAGAA GAGTCAGTCG CGGACGAGTG CGCGCTGACC"
[11] "Bar CAGCCCGCAG TAAATCAGAA GACTAGGTCG CGGGCGACTG CGTGCTGACC"
[12] "Nor CAGCCCGCAG TAAATCAGAA GACTCGGTCG CGGGCGACTG CGTGCTGACC"
[13] ""
[14] "Oki TCTAAAGCGG CGGCCCTAGC GCGCATCGTC AAGCGCAAAT TTTTCAGAAT"
[15] "Osa TCTAAAGCGG CGGCCCCAAC GCGCATCGTC AAGCGCAACA TTTTCAGAAT"
[16] "Aom TCTAAAGCGG CGGCCCCAAC GCGCATCGTC AAGCGCAACA TTTTCAGAAT"
[17] "Bar TCTAAAGCGG CGGCCTCAGC GCGCACCGTC GAGCGCAACA TTTTCAGAAT"
[18] "Nor TCTAAAGCGG CGGCCTCAGC GCGCACCGTC AAGCGCAACA TTTTCAGAAT"
[19] ""
[20] "Oki TGAGCTTCCA AAATCAAACA CCAGAAACGC CTTTCGCCGA AAAATCAAG-"
[21] "Osa TGAGCTTCTG AAATCAAAAA GCAGAAACA- ATTTCGCCGA ATAATCACTC"
[22] "Aom TGAGCTTCTG AAATCAAAAA GCAGAAACA- AATTCGCCGA ATAATCACTC"
[23] "Bar TGAGCTTCTG AAATCAAAAA GCAGAAACA- TTTTCGCCGA ATAATCACTG"
[24] "Nor TGAGCTTCTG AAATCAAAAA GCAGAAACA- TTTTCGCCGA ATAATCACTG"
[25] ""
[26] "Oki ---------- -------GTT CCATTGAGT-"
[27] "Osa AATTGAAAGA CG-----GTT CCATTGAGT-"
[28] "Aom AATTGAAAGA CGCGATTGTT CCATTGAGT-"
[29] "Bar AATCGC---- -------GTT CCATCGAATG"
[30] "Nor AATCGG---- -------GTT CCATCGAATG"
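The [n] indices and quotes above come from R's default printing of a character vector. A minimal sketch of printing the lines cleanly in an RMarkdown chunk follows; aln_lines is a hypothetical stand-in that holds the first three lines of the output above, so that the example runs without Biostrings.

```r
# aln_lines is a hypothetical stand-in for the result of showMultiAlign(seqs_aln).
aln_lines <- c(
  " 5 230",
  "Oki -----ATTGT TTAGTCTGGG GTTTACAGAA GAGTTGTTTC TTCATGGAAC",
  "Osa -----ATTGT TTAGTCTGCT G-TCACAGTA GAGTTGTTGC TTCATGGAAC"
)

# writeLines() prints one element per line, without indices or quotes.
writeLines(aln_lines)
```

With the real object, the equivalent call would presumably be writeLines(showMultiAlign(seqs_aln)).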
My questions are:
- Is there an easy way to get it in color?
- Can future versions of _Biostrings_ provide a writeMultipleAlignment function?
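On the color question, one possible workaround for HTML output is to wrap each nucleotide in a colored span and emit the result from a chunk with results='asis'. This is only a sketch in base R: colour_alignment_html is a hypothetical helper, the palette is arbitrary and not the one used by show, and the input is assumed to be a named character vector such as the one returned by as.character() on the alignment.

```r
# Hypothetical helper: turn a named character vector of aligned sequences
# into a single HTML <pre> block with one coloured <span> per nucleotide.
# The palette is arbitrary; it is not the colour scheme used by Biostrings.
colour_alignment_html <- function(seqs,
                                  palette = c(A = "#d62728", C = "#1f77b4",
                                              G = "#2ca02c", T = "#ff7f0e")) {
  colour_one <- function(s) {
    chars <- strsplit(s, "", fixed = TRUE)[[1]]
    hit <- chars %in% names(palette)  # gaps and other symbols stay uncoloured
    chars[hit] <- sprintf('<span style="color:%s">%s</span>',
                          palette[chars[hit]], chars[hit])
    paste(chars, collapse = "")
  }
  body <- vapply(seqs, colour_one, character(1))
  # Left-justify the row names so that the sequences line up.
  labels <- formatC(names(seqs), width = max(nchar(names(seqs))), flag = "-")
  paste0("<pre>", paste(labels, body, collapse = "\n"), "</pre>")
}

# Toy input, invented for illustration.
html <- colour_alignment_html(c(Oki = "-AT", Osa = "GC-"))
cat(html)
```

This does not chop the alignment into blocks of 50 columns, and it only applies to HTML output; PDF or Word output would need a different approach.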