Question: msa: How to obtain a subset of an alignment
Christof Winter (TU München) wrote, 7 months ago:

I am using the msa package to align DNA sequences with Muscle (which works great!). Now I was wondering whether it's possible to extract a subset of an alignment. In the following example, I would like to extract just the first 3 rows from the alignment:


library(msa)

mySequenceFile <- system.file("examples", "exampleAA.fasta", package="msa")
mySequences <- readAAStringSet(mySequenceFile)

aln <- msa(mySequences)

# subset, get first 3 only
rowmask(aln, invert=TRUE) <- IRanges(start=1, end=3)

print(aln, show="complete")

However, the masked rows are still present and are showing up with # characters. How can I drop the masked parts in order to have just the first 3 rows in an alignment object? 

UBodenhofer (Johannes Kepler University, Linz, Austria) wrote, 7 months ago:

Thanks for your positive feedback, Christof! 

Regarding your question: yes, it is true that objects of class 'MultipleAlignment' and classes derived from 'MultipleAlignment' do not support subsetting. Presently, I can offer the following workaround (... continuing your example code):

alnSubset <- as(AAMultipleAlignment(unmasked(aln)[1:3]),
                "MsaAAMultipleAlignment")

print(alnSubset, show="complete")

I admit that this is not very elegant. Moreover, all metadata describing the alignment is lost. I am actually considering adding some more casts to the package or maybe even subsetting methods. Maybe somebody else has some thoughts on this subject?
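Until such methods exist, the workaround above can be wrapped in a small helper. The following is only a minimal sketch: subsetMsaAA is a hypothetical name and not part of the msa package, and it assumes the input is an amino acid alignment as returned by msa() in the example. As with the one-liner, the metadata of the original alignment is not carried over, and columns that consist only of gaps after subsetting are not removed.

```r
library(msa)

# Hypothetical helper (not part of msa): keep only the selected rows of an
# amino acid alignment and return them as an alignment object again.
subsetMsaAA <- function(aln, rows) {
    stopifnot(is(aln, "MsaAAMultipleAlignment"))

    # unmasked() returns the aligned sequences as an AAStringSet,
    # which does support subsetting with [
    seqs <- unmasked(aln)[rows]

    # rebuild an alignment from the subset and cast it back to the msa class;
    # the metadata of the original alignment is lost at this step
    as(AAMultipleAlignment(seqs), "MsaAAMultipleAlignment")
}
```

Continuing the example, alnSubset <- subsetMsaAA(aln, 1:3) followed by print(alnSubset, show="complete") should give the same result as the workaround. If only the aligned sequences are needed and not an alignment object, unmasked(aln)[1:3] alone is sufficient.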
