Question: How do I extract the complete MsaAAMultipleAlignment object to a file?
Assa Yeroslaviz (Munich, Germany) wrote, 9 months ago:

Hi all,

I was wondering how I can export the results of my `msa` run into a file.

I know I can export it to a FASTA file using `unmasked(msa)`, but I would like to export the complete results I see in the terminal to a file or, if possible, in Clustal format.

This is my script:

P <- msa(seqs, order = "input", "ClustalOmega")
print(P, show = "complete")
writeXStringSet(unmasked(P), file="test.clustal")

But this gives me a FASTA file. Is there a way to get Clustal format as output?

 

Thanks

 

 

UBodenhofer (Johannes Kepler University, Linz, Austria) wrote, 9 months ago:

Reply to question #1: The width is controlled by the global R option 'width' (in line with the print() methods in the 'Biostrings' package), so you can change the number of columns yourself. If you want to avoid a global effect on your R session, save the current value and restore it after running print(). Here is an example (the first lines are to be copied from above):

saveWidth <- getOption("width")
options(width=100)
sink("myAlignment.txt")
print(myAlignment, show="complete", halfNrow=-1)
sink()
options(width=saveWidth)
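
The save-and-restore steps above can also be wrapped in a small helper, so that the width is reset even if print() stops with an error. This is only a sketch in base R: `with_width` is a hypothetical name chosen for illustration, not a function of 'msa' or 'Biostrings'.

```r
# Hypothetical helper (not part of 'msa'): evaluate an expression with a
# temporary console width and restore the previous setting afterwards.
with_width <- function(width, expr) {
  old <- options(width = width)        # options() returns the previous values
  on.exit(options(old), add = TRUE)    # restored even if 'expr' fails
  force(expr)                          # 'expr' is evaluated only here
}

# Usage with base R output only; a call such as
# print(myAlignment, show="complete", halfNrow=-1) would go in the same place.
narrow <- with_width(20, capture.output(print(seq_len(30))))
```

Because R evaluates arguments lazily, the expression passed as `expr` runs only after the width has been changed, which is why it can simply be handed over as an argument.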

Reply to question #2: Sorry, no. This is an implementation that is, to quite some extent, copied from or at least inspired by the methods in the 'Biostrings' package. You can influence the width of names by the 'nameWidth' argument, but that's all. If you want to tailor the print() methods to your very specific needs, you will have to make a copy of the relevant parts of the source file msa/R/print-methods.R and customize the code of your local copy to your needs.

 


Thanks, this was very helpful.

Reply written 9 months ago by Assa Yeroslaviz
UBodenhofer (Johannes Kepler University, Linz, Austria) wrote, 9 months ago:

I am sorry, we currently do not have any functionality for saving Clustal files in our 'msa' package. I had a quick look at whether there are other options and came across the thread "R write alignment MSA" on Biostars. The second option sounds fantastic, but I checked it and it does not seem possible. So it seems that what you are asking for does not exist. You had probably better save the multiple alignment to a FASTA file and convert it to Clustal format using an external tool (option #1 in the Biostars thread mentioned above).

In any case, we are currently thinking about which extensions would be useful for the 'msa' package. The methods for saving Clustal files are hidden in the multiple sequence alignment tools that are part of the 'msa' package; they are just not exposed to the end user. So it should not be a big deal to do this. We will not manage it for this year's spring release of Bioconductor (3.7), but I am optimistic for 3.8.
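
For simple cases, a Clustal-style text file can be written by hand with a few lines of base R. The following is a minimal sketch, not a feature of 'msa': `write_clustal_like` is a hypothetical helper, and it assumes the aligned sequences are available as a named character vector of equal-length strings (with a real result one could try `as.character(unmasked(P))`). It omits the conservation line that Clustal normally prints under each block, so stricter parsers may not accept the output.

```r
# Hypothetical helper, not part of 'msa' or 'Biostrings': write aligned
# sequences in a Clustal-style block layout (name first, then residues).
write_clustal_like <- function(aln, file, block_width = 60) {
  stopifnot(is.character(aln), !is.null(names(aln)),
            length(unique(nchar(aln))) == 1, nchar(aln[[1]]) > 0)
  n_col <- nchar(aln[[1]])
  # Pad names to a common width so the sequence columns line up.
  labels <- format(names(aln), width = max(nchar(names(aln))) + 2)
  out <- "CLUSTAL W multiple sequence alignment"
  for (s in seq(1, n_col, by = block_width)) {
    chunk <- substr(aln, s, min(s + block_width - 1, n_col))
    out <- c(out, "", paste0(labels, chunk))   # blank line, then one block
  }
  writeLines(out, file)
  invisible(out)
}

# Toy input for illustration only.
toy <- c(seqA = "MLVLRC--GT", seqB = "MLV-RCRLGT", seqC = "--VLRCRLGT")
out_file <- file.path(tempdir(), "toy.aln")
write_clustal_like(toy, out_file, block_width = 6)
```

The `block_width` argument fixes the number of alignment columns per row (60 by default), and the sequence names appear at the start of each row rather than at the end.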


Thanks for the answer. It would be great if you could make it happen for BioC 3.8.

But if Clustal is not possible, is there a way to extract the object `as is`?

I have this results object:

MsaAAMultipleAlignment with 154 rows and 1142 columns
      aln (1..79)                                                                     names
  [1] MLVLRCRLGTSFPKLDNLVPKGKMKILLVFLGLLGNSVAMPMHMPRMPGFSSKSEEMMRYNQFNFMNGPHMAHLGPFFG sp|Q9NRM1|ENAM_HU...
  [2] ------------------------------------------------------------------------------- bestMsms_id_68
  [3] ------------------------------------------------------------------------------- bestMsms_id_73
  [4] ------------------------------------------------------------------------------- bestMsms_id_90
   ... ...
[151] ------------------------------------------------------------------------------- dn_ms_scan_number...
[152] ------------------------------------------------------------------------------- dn_ms_scan_number...
[153] ------------------------------------------------------------------------------- dn_ms_scan_number...
[154] ------------------------------------------------------------------------------- dn_ms_scan_number...
  Con ------------------------------------------------------------------------------- Consensus 

      aln (80..158)                                                                   names
  [1] NGLPQQFPQYQMPMWPQPPPNTWHPRKSSAPKRHNKTDQTQETQKPNQTQSKKPPQKRPLKQPSHNQPQPEEEAQPPQA sp|Q9NRM1|ENAM_HU...
  [2] ------------------------------------------------------------------------------- bestMsms_id_68
  [3] ------------------------------------------------------------------------------- bestMsms_id_73
  [4] ------------------------------------------------------------------------------- bestMsms_id_90

...

And I would like to write it to a file. Is that possible without converting it into a FASTA file?

Thanks

 

Reply written 9 months ago by Assa Yeroslaviz

Well, what you can always do is dump this output to a file using sink(). Here is a code snippet:

library(msa)

filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

myAlignment <- msaClustalW(mySeqs)

sink("myAlignment.txt")
print(myAlignment, show="complete")
sink()

Reply written 9 months ago by UBodenhofer

Thanks, but this is not exactly what I need.

Reply written 9 months ago (later modified) by Assa Yeroslaviz

Sorry, I was happy too soon. This is not exactly what I need. When I do this, I still get only the first nine rows and the last nine rows of the alignment, with the ... between them.

I would like to get the complete alignment in the text file, in my case all 154 rows. Is this possible?

Reply written 9 months ago by Assa Yeroslaviz

Are you sure you have used the option 'show="complete"'? When I run my example code from above, I see everything.

Reply written 9 months ago by UBodenhofer

Hi, yes, I am sure. I have solved it by adding the parameter halfNrow = 100.

Reply written 9 months ago (later modified) by Assa Yeroslaviz

Excellent, I'm glad you solved it. You can also set 'halfNrow' to -1 or NA, then it prints everything.

Reply written 9 months ago (later modified) by UBodenhofer

I have two other questions:

1. Is it possible to set the number of letters in each row?

For now it depends on the window size (AFAIK), but I would like to set it to 60, for example. Is that possible?

2. Is it possible to put the names of the organisms at the beginning of the row rather than at the end?

Thanks again

Reply written 9 months ago by Assa Yeroslaviz