Re: Sequence alignment for DNA sequences (John Zhang)

Aedin Culhane ▴ 310

Hi,

It might also be useful to look at some of the functions in seqinR.

Sorry about this, but can I ask an off-topic question? Many nucleotide alignment algorithms are available. Are we replicating the effort of programs like Bioperl/EMBOSS etc. in R? What is people's experience with RSPerl and other "connectivity" modules?

Regards,
Aedin

updated 18.9 years ago by Fangxin Hong ▴ 810 • written 19.0 years ago by Aedin Culhane ▴ 310

A.J. Rossini ▴ 210

Sure, and MEGA is a wonderful alignment tool, though the last time I used it, 3+ years ago, it wasn't scriptable.

I'd prefer reuse rather than reinvention, but there is a good deal to be gained from leveraging some of the statistically oriented tools (and the overall systems biology integration that is happening in R).

With respect to RSPerl, we'd have to write data converters. That might be possible, but the right approach would be to start with the Biostrings package and flesh it out for these activities. A fun activity, but unfortunately far removed from my current day job.

best,
-tony

On 5/18/05, Aedin <aedin.culhane@ucd.ie> wrote:
> Hi
> Might also be useful to look at some of the functions in seqinR.
>
> Sorry about this, but can I ask an off topic question? Many nucleotide
> alignment algorithms are available. Are we replicating the effort of
> programs like Bioperl/EMBOSS etc etc in R? What are people's experience
> with RSperl and other "connectivity" modules?
>
> Regards,
> Aedin
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor

--
best,
-tony
"Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes" (AJR, 4Jan05).
A.J. Rossini
blindglobe@gmail.com

written 19.0 years ago by A.J. Rossini ▴ 210

rgentleman ★ 5.5k

Hi,

It is in general a bad idea to rewrite something in a different language unless there are good reasons. The interoperability of R with other languages, like Perl and Python, often makes it possible to reuse rather than reinvent. That said, the complexity involved is not trivial, and many find it hard to keep the systems in sync. I find that if I want to do anything more in terms of analysis or visualization, then I am going to need to get the data over to R, so R tends to be involved in any event.

We wrote the Biostrings package in part to better understand the algorithms from Gusfield's book, and in part because, for a digital karyotyping project that we were working on, we found that the available Perl solutions were too slow. For some of the pattern matching Biostrings was much faster; your mileage may vary. It too is not complete, but there is some very nice code there. And there is nice code in other places; reuse beats reinvention.

Robert

P.S. The folks that wrote seqinR seem to be confused about the name, though: from CRAN you need to download seqinr. Also note that it does not seem to handle the extended DNA alphabet, which is something I find I need if I'm doing very much sequence work.

Robert Gentleman
Head, Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
phone: (206) 667-7700
fax: (206) 667-1319
office: M2-B865
email: rgentlem@fhcrc.org
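
For readers unfamiliar with the "extended DNA alphabet" mentioned in the postscript above: the term usually refers to the IUPAC nucleotide ambiguity codes, where a symbol such as R stands for "A or G" and N for any base. The sketch below is a minimal Python illustration of what matching under that alphabet involves. It is not R, and it is not how Biostrings or seqinr implement anything; the names `IUPAC`, `symbols_compatible`, and `find_matches` are hypothetical, and the naive scan is chosen for clarity rather than speed.

```python
# IUPAC nucleotide ambiguity codes: each symbol stands for a set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def symbols_compatible(a, b):
    """Return True if two IUPAC symbols share at least one concrete base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def find_matches(pattern, subject):
    """Return 0-based start positions where pattern matches subject.

    Ambiguity codes on either side are treated as sets of bases, and two
    symbols match when those sets overlap. This is a naive O(n*m) scan,
    written as an illustration only.
    """
    pattern, subject = pattern.upper(), subject.upper()
    width = len(pattern)
    hits = []
    for start in range(len(subject) - width + 1):
        window = subject[start:start + width]
        if all(symbols_compatible(p, s) for p, s in zip(pattern, window)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # "N" in the pattern matches any base in the subject.
    print(find_matches("ACN", "ACGTACT"))
```

A tool restricted to the plain A/C/G/T alphabet would reject or mis-handle symbols such as R, Y, or N, which is the limitation the postscript describes.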

written 19.0 years ago by rgentleman ★ 5.5k

Fangxin Hong ▴ 810

Hi Aedin;

Thank you for your reply.

> Might also be useful to look at some of the functions in seqinR.
>
> Sorry about this, but can I ask an off topic question? Many nucleotide
> alignment algorithms are available. Are we replicating the effort of
> programs like Bioperl/EMBOSS etc etc in R? What are people's experience
> with RSperl and other "connectivity" modules?

There are some algorithms that were originally implemented in other environments but were later rewritten in R because of R's popularity. Instead of spending time learning another language (a decent percentage of R users still don't know Perl well), I think most users would prefer to have an algorithm written in R, even in a less perfect version. One can always go to the better-written implementations later if there is a need.

Well, this is just my personal opinion.

Fangxin

> Regards,
> Aedin

--------------------
Fangxin Hong Ph.D.
Plant Biology Laboratory
The Salk Institute
10010 N. Torrey Pines Rd.
La Jolla, CA 92037
E-mail: fhong@salk.edu
(Phone): 858-453-4100 ext 1105

written 18.9 years ago by Fangxin Hong ▴ 810
