7 months ago by
Teeps0

Hi everyone. Simple question, but I'm new to Bioconductor. I have a DNAStringSet (myDNAStringSet) and a GRanges object made from a BED file (myGRangesObject), so I can find the sequences of the exons in myDNAStringSet using: getSeq(myDNAStringSet, myGRangesObject).

How could I pull out user-defined sections between the exons? For example, if I wanted to grab the 1000 base pairs upstream of the first exon listed in myGRangesObject, or the 200 base pairs downstream of the 5th exon listed in myGRangesObject, how would I write that? Thank you for the help!

7 months ago by
Hervé Pagès ♦♦ 14k
United States

Hi,

Are you sure myGRangesObject is a GRanges object and not a GRangesList object?

Anyway, you first need to come up with a GRanges (or GRangesList) object containing the genomic ranges of the sequences you want to pull out. For example, to grab the 1000 base pairs upstream of the ranges in myGRangesObject, first obtain the ranges of the upstream regions with upstream_regions <- promoters(myGRangesObject, upstream=1000, downstream=0), then use upstream_regions instead of myGRangesObject in your call to getSeq(). Note that promoters() is strand-aware, so "upstream" is relative to the strand of each range. Assuming myGRangesObject is a GRanges object and not a GRangesList object, if you only want to do this for the first exon listed in myGRangesObject, you can subset the GRanges object (with myGRangesObject[1]) before passing it to promoters().
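Put together, the upstream case might look like the sketch below. The toy chromosome and exon coordinates are made up for illustration and stand in for your real myDNAStringSet and myGRangesObject. It also assumes the getSeq() method for DNAStringSet objects is the one provided by the BSgenome package, so that package is loaded explicitly.

```r
# Assumption: getSeq() on a DNAStringSet is supplied by BSgenome, which
# also attaches Biostrings and GenomicRanges.
library(BSgenome)

# Hypothetical toy data: one 5000 bp "chromosome" and five exons on the + strand.
set.seed(1)
chr_seq <- paste(sample(c("A", "C", "G", "T"), 5000, replace = TRUE),
                 collapse = "")
myDNAStringSet <- DNAStringSet(c(chr1 = chr_seq))
myGRangesObject <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(1201, 1601, 2001, 2401, 2801), width = 100),
  strand   = "+"
)

# Ranges covering the 1000 bp upstream of the first exon only.
upstream_regions <- promoters(myGRangesObject[1], upstream = 1000, downstream = 0)

# Extract the corresponding sequence, exactly as for the exons themselves.
upstream_seq <- getSeq(myDNAStringSet, upstream_regions)
width(upstream_seq)  # 1000
```

The names of myDNAStringSet need to match the seqnames of the GRanges object ("chr1" here), otherwise getSeq() has no way to pair a range with its sequence.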

See ?GenomicRanges::promoters for more information about promoters() and other intra-range transformations like shift(), flank(), resize(), etc.
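For the downstream case in the question (200 base pairs downstream of the 5th exon), flank() with start=FALSE is one way to build the ranges. This is a minimal sketch on made-up coordinates, not the poster's real data; the resulting GRanges can then be passed to getSeq() in the usual way.

```r
library(GenomicRanges)

# Hypothetical exons: five 100 bp ranges on the + strand of "chr1".
myGRangesObject <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(1201, 1601, 2001, 2401, 2801), width = 100),
  strand   = "+"
)

# 200 bp immediately downstream of the 5th exon. flank() is strand-aware:
# start = FALSE flanks the 3' end of the range with respect to its strand.
downstream_regions <- flank(myGRangesObject[5], width = 200, start = FALSE)
downstream_regions  # chr1:2901-3100 on the + strand

# The same exon on the - strand: "downstream" now lies at lower coordinates.
minus_exon <- myGRangesObject[5]
strand(minus_exon) <- "-"
flank(minus_exon, width = 200, start = FALSE)  # chr1:2601-2800 on the - strand
```

Because the flanks are computed relative to each range's strand, it is worth checking that the strand column of the BED file was imported correctly before relying on "upstream" and "downstream".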

Hope this helps,

H.