How to get sequences corresponding to a GRanges?
@cei-abreu-goodger-4433
Last seen 9.1 years ago
Mexico
Hello all,

I was wondering if there was a simple way to get the sequences corresponding to the ranges stored in a GRanges object. If you have the original sequences in a BSgenome object, you can use 'getSeq'. But what if you just have the fasta file, imported as a DNAStringSet object?

I want to avoid having to forge a new BSgenome object each time, since I'm dealing with unfinished assemblies, with thousands of sequences that I don't want to split into individual fasta files, etc.

Many thanks,

Cei

--
Dr. Cei Abreu-Goodger
Profesor Investigador
Langebio CINVESTAV
BSgenome • 3.0k views
@martin-morgan-1513
Last seen 2 days ago
United States
On 10/28/2012 11:28 PM, Cei Abreu-Goodger wrote:
> Hello all,
>
> I was wondering if there was a simple way to get the sequences corresponding to
> the ranges stored in a GRanges object. If you have the original sequences in a
> BSgenome object, you can use 'getSeq'. But what if you just have the fasta file,
> imported as a DNAStringSet object?
>
> I want to avoid having to forge a new BSgenome object each time, since I'm
> dealing with unfinished assemblies, with thousands of sequences that I don't
> want to split into individual fasta files, etc.

Rsamtools has FaFile and FaFileList to represent fasta files (indexed via indexFa), and a getSeq method that takes an FaFile and a GRanges (or similar) object. This is built on top of scanFa. See

    library(Rsamtools)
    method?"getSeq,FaFile"

Martin

> Many thanks,
>
> Cei

--
Computational Biology / Fred Hutchinson Cancer Research Center
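For readers who want to see the pieces together, here is a minimal sketch of the FaFile approach described above. The contig names, sequences, and ranges are made up for illustration, and the fasta file is written to a temporary path; with a real assembly you would point FaFile at your own file instead.

```r
library(Rsamtools)      # FaFile, indexFa, and the getSeq method for FaFile
library(GenomicRanges)  # GRanges
library(Biostrings)     # DNAStringSet, writeXStringSet

# Hypothetical toy "assembly", standing in for a multi-sequence fasta file
contigs <- DNAStringSet(c(contig1 = "ACGTACGTACGTACGT",
                          contig2 = "TTTTGGGGCCCCAAAA"))
fa_path <- tempfile(fileext = ".fa")
writeXStringSet(contigs, fa_path)

# Index the fasta file once (writes a .fai file next to it), then reference it
indexFa(fa_path)
fa <- FaFile(fa_path)

# Ranges of interest; seqnames must match the fasta record names
gr <- GRanges(seqnames = c("contig1", "contig2"),
              ranges   = IRanges(start = c(1, 5), end = c(4, 8)),
              strand   = c("+", "-"))

# Returns a DNAStringSet with one sequence per range
seqs <- getSeq(fa, gr)
seqs
```

Only the requested ranges are read from the indexed file, so the whole assembly does not need to be loaded or split into per-sequence files.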
Thanks Martin, that's exactly what I was looking for!

(missed the showMethods("getSeq"), sorry)

On 10/28/12 5:49 PM, Martin Morgan wrote:
> Rsamtools has FaFile and FaFileList to represent (indexed, via indexFa)
> fasta files, and a getSeq method that takes an FaFile and a GRanges (or
> similar) object. This is built on top of scanFa. See
>
> library(Rsamtools)
> method?"getSeq,FaFile"
>
> Martin

--
Dr. Cei Abreu-Goodger
Profesor Investigador
Langebio CINVESTAV