Question: msa: How to obtain a subset of an alignment
Christof Winter (TU München) wrote 13 months ago:

I am using the msa package to align DNA sequences with Muscle (which works great!). Now I was wondering whether it's possible to extract a subset of an alignment. In the following example, I would like to extract just the first 3 rows from the alignment:


library(msa)

mySequenceFile <- system.file("examples", "exampleAA.fasta", package="msa")
mySequences <- readAAStringSet(mySequenceFile)

aln <- msa(mySequences)

# mask every row except the first 3
rowmask(aln, invert=TRUE) <- IRanges(start=1, end=3)

print(aln, show="complete")

However, the masked rows are still present and are displayed as # characters. How can I drop the masked rows so that the alignment object contains just the first 3 rows?

UBodenhofer (Johannes Kepler University, Linz, Austria) wrote 13 months ago:

Thanks for your positive feedback, Christof! 

Regarding your question: yes, it is true that objects of class 'MultipleAlignment' and classes derived from 'MultipleAlignment' do not support subsetting. Presently, I can offer the following workaround (... continuing your example code):

alnSubset <- as(AAMultipleAlignment(unmasked(aln)[1:3]),
                "MsaAAMultipleAlignment")

print(alnSubset, show="complete")

I admit that this is not very elegant. Moreover, all metadata describing the alignment is lost. I am actually considering adding some more casts to the package or maybe even subsetting methods. Maybe somebody else has some thoughts on this subject?
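
For repeated use, the workaround above could be wrapped in a small helper. The following is only a sketch: the function name subsetAAAlignment is hypothetical and not part of msa, and it assumes that "MsaAAMultipleAlignment" is the class msa() returns for amino acid input, so that print(..., show="complete") keeps working on the result. As noted above, the metadata of the original alignment is not carried over, and any masks on the input are ignored because the rows are taken from unmasked(aln).

```r
library(msa)

# Hypothetical helper (not part of msa): rebuild an amino acid alignment
# from a subset of its rows. Masks and alignment metadata are discarded.
subsetAAAlignment <- function(aln, rows) {
    # unmasked() returns the gapped sequences as an AAStringSet
    seqs <- unmasked(aln)[rows]
    # rebuild a plain alignment, then cast it to the msa class (assumed
    # to be "MsaAAMultipleAlignment" for amino acid input)
    as(AAMultipleAlignment(seqs), "MsaAAMultipleAlignment")
}

mySequenceFile <- system.file("examples", "exampleAA.fasta", package="msa")
mySequences <- readAAStringSet(mySequenceFile)
aln <- msa(mySequences)

# keep the first 3 rows of the alignment
alnSubset <- subsetAAAlignment(aln, 1:3)
print(alnSubset, show="complete")
```

Note that the selected rows keep the gap columns of the full alignment, so the subset may contain columns consisting only of gaps.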
