Question: Automated blasting of short nucleotide sequences against each other
Ken Termiso wrote:
Hi all,

This may be slightly off-topic, but I'd like to BLAST a set of about 500 nucleotide sequences against itself (i.e. sequence #1 gets blasted against the other 499 sequences, and so on, for a total of roughly 500x500 = 250,000 comparisons). Unbelievably, the one thing I cannot find on the net is a script to do it. Rather than writing one, I was hoping someone could point me to a link. I found tons of scripts for BLASTing against a database, but nothing for an all-against-all matrix like the one I need.

My sequences are in plain text. I've got the standalone BLAST, but just need a script.

Presumably this would be very useful for analyzing pseudo-homologous probe sequences, so maybe it isn't completely off-topic.

Thanks in advance,
Ken
Modified 12.8 years ago by David Lapointe • written 12.8 years ago by Ken Termiso
Sean Davis wrote:
Ken,

Actually, if you think about how blast (or blat, or other alignment programs) works, you just need to blast the FASTA file against a BLAST database built from the same sequences. The output will include each sequence compared against all the others, with the obvious caveat that not all sequences are going to align, so some comparisons will be missing; there is no way around that. Then you just need to put the results into some useful form; consider using BioPerl if you have access to it.

Better yet, if these are sequences from the same organism, just use blat: the output is tab-delimited text which you can load directly into R. If you use blat, you can just do:

    blat db.fasta db.fasta outfile.psl

This should take just a few seconds on a modern machine, depending on the length of the sequences.

Sean

On Feb 18, 2005, at 2:41 PM, Ken Termiso wrote:
> Hi all,
>
> This may be slightly off-topic, but I'd like to be able to BLAST a
> large set of about 500 nucleotide sequences against itself [...]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
Written 12.8 years ago by Sean Davis
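For readers who would rather post-process the blat output in Python than in R, here is a minimal sketch of turning a PSL file into a pairwise table and listing the "missing comparisons" mentioned above. It assumes the standard 21-column PSL layout (match count in column 1, query name in column 10, target name in column 14). The function names read_psl_pairs and missing_pairs are made up for this illustration and are not part of blat or any library.

```python
from collections import defaultdict

# 0-based positions of the PSL columns used here: matches, qName, tName.
MATCHES, Q_NAME, T_NAME = 0, 9, 13


def read_psl_pairs(lines):
    """Return {(query, target): best match count} from PSL lines, skipping self-hits."""
    best = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # The PSL header and blank lines do not start with an integer
        # match count, so they are skipped by this check.
        if len(fields) < 21 or not fields[MATCHES].isdigit():
            continue
        query, target = fields[Q_NAME], fields[T_NAME]
        if query == target:
            continue  # in a self-comparison every sequence also hits itself
        best[(query, target)] = max(best[(query, target)], int(fields[MATCHES]))
    return dict(best)


def missing_pairs(names, pairs):
    """List ordered (query, target) pairs, query != target, with no reported alignment."""
    return [(q, t) for q in names for t in names if q != t and (q, t) not in pairs]
```

A typical call would be `read_psl_pairs(open("outfile.psl"))`, followed by `missing_pairs(all_names, pairs)` with the sequence names taken from the FASTA file. Keeping only the best match count per pair is one possible summary; other PSL columns could be used instead.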
David Lapointe wrote:
Well, you only need one run: a query of the 500 short seqs against a database of the same 500 short seqs. See http://www.oreillynet.com/pub/a/oreilly/bio/news/BLAST.html for a great example. (Blasting the Y. pestis proteome vs the E. coli proteome, ~4000 seqs each, takes about 3-4 min on my laptop.)

BLASTCLUST might also do what you need.

David
Written 12.8 years ago by David Lapointe
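The "one run" idea can be sketched in Python as follows. Note the assumptions: this uses the current BLAST+ programs (makeblastdb and blastn) rather than the 2005-era standalone BLAST discussed in the thread, it assumes the sequences are short enough for the blastn-short task, and it assumes 12-column tabular output (-outfmt 6) with the bit score in the last column. The function names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess


def self_blast_commands(fasta, out_tsv):
    """Build the two BLAST+ command lines for an all-against-all run."""
    # Step 1: turn the FASTA file into a nucleotide BLAST database.
    make_db = ["makeblastdb", "-in", fasta, "-dbtype", "nucl"]
    # Step 2: query the same FASTA file against that database in a single run.
    search = ["blastn", "-task", "blastn-short", "-query", fasta, "-db", fasta,
              "-outfmt", "6", "-out", out_tsv]
    return make_db, search


def run_self_blast(fasta, out_tsv):
    """Run both commands; requires BLAST+ to be installed and on PATH."""
    if shutil.which("makeblastdb") is None or shutil.which("blastn") is None:
        raise RuntimeError("BLAST+ executables not found on PATH")
    for cmd in self_blast_commands(fasta, out_tsv):
        subprocess.run(cmd, check=True)


def read_tabular_hits(lines):
    """Return {(query, subject): best bit score} from -outfmt 6 lines, skipping self-hits."""
    best = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 12:
            continue  # comment or malformed line
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query == subject:
            continue
        key = (query, subject)
        if key not in best or bitscore > best[key]:
            best[key] = bitscore
    return best
```

For example, `run_self_blast("probes.fasta", "hits.tsv")` followed by `read_tabular_hits(open("hits.tsv"))` would give one score per aligned pair. Pairs that do not appear in the result simply produced no reported alignment.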