Question: the coordinates of exons of RefSeq or ENSEMBL genes
Bogdan (Palo Alto, CA, USA) wrote 11 weeks ago:

Dear all,

May I ask a simple question?

In Bioconductor, what is the simplest (or most direct) way to retrieve the coordinates of exons for RefSeq or Ensembl genes?

Thanks a lot,

-- bogdan

Tags: txdb, exons, genome
Answer: the coordinates of exons of RefSeq or ENSEMBL genes
Wei Shi (Australia) wrote 11 weeks ago:

You can certainly use Bioconductor core packages to do this, but the getInBuiltAnnotation function in Rsubread offers a simple way to get this information for human and mouse RefSeq genes.

Bogdan replied 11 weeks ago:

Thanks a lot, Wei! The example you provided works great:

 library(Rsubread)
 x <- getInBuiltAnnotation("hg38")
 x[1:5,]
Answer: the coordinates of exons of RefSeq or ENSEMBL genes
Martin Morgan (United States) wrote 11 weeks ago:

For Ensembl, load the ensembldb and AnnotationHub packages and query the hub for EnsDb objects for Homo sapiens, release 97:

> library(ensembldb)
> library(AnnotationHub)
> hub = AnnotationHub()
snapshotDate(): 2019-07-10
> query(hub, c("EnsDb", "Homo sapiens", "97"))

This returns a single record with id AH73881. Retrieve it and use exons() to extract the coordinates:

> exons(hub[["AH73881"]])
downloading 0 resources
loading from cache
GRanges object with 828532 ranges and 1 metadata column:
                  seqnames            ranges strand |         exon_id
                     <Rle>         <IRanges>  <Rle> |     <character>
  ENSE00002234944        1       11869-12227      + | ENSE00002234944
  ENSE00001948541        1       12010-12057      + | ENSE00001948541
  ENSE00001671638        1       12179-12227      + | ENSE00001671638
  ENSE00003582793        1       12613-12721      + | ENSE00003582793
  ENSE00001758273        1       12613-12697      + | ENSE00001758273
              ...      ...               ...    ... .             ...
  ENSE00001741452        Y 26628271-26628437      - | ENSE00001741452
  ENSE00001681574        Y 26630647-26630749      - | ENSE00001681574
  ENSE00001638296        Y 26633345-26633431      - | ENSE00001638296
  ENSE00001797328        Y 26634523-26634652      - | ENSE00001797328
  ENSE00001794473        Y 56855244-56855488      + | ENSE00001794473
  -------
  seqinfo: 424 sequences from GRCh38 genome

Use the same steps for RefSeq (knownGene) annotations, starting from

query(hub, c("TxDb", "Homo sapiens"))
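
As a rough sketch of the TxDb route, the same exons() call can be applied to a TxDb object. The example below does not go through the hub; it assumes instead that the Bioconductor annotation package TxDb.Hsapiens.UCSC.hg38.knownGene is installed, which is a choice made here for illustration and not something named in the answer above. exonsBy() is from GenomicFeatures and groups the exons by gene.

```r
# Assumption: TxDb.Hsapiens.UCSC.hg38.knownGene is installed (not named in the thread)
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# All exons as a single GRanges object
ex <- exons(txdb)

# Exons grouped by gene, returned as a GRangesList named by gene identifier
ex_by_gene <- exonsBy(txdb, by = "gene")

# Plain data.frame of coordinates: seqnames, start, end, width, strand, exon_id
ex_df <- as.data.frame(ex)
head(ex_df)
```

A TxDb object retrieved from the hub with hub[["..."]] should work the same way in place of txdb.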

Bogdan replied 10 weeks ago:

Thank you, Martin! Have a happy weekend!