Question: Simple UCSC to transcript annotation
2.3 years ago, AntonS wrote:

Hi,

 

I tried to find a solution on my own, but the whole TxDb, AnnotationDbi, ... machinery seems circuitous to me.

I have chromosomal locations (chr1:48902063-48902085, chr2:202340342-202340364, ...) and I just want to know whether each region is an exon. Is there a simple command to do that?

Best regards, and thank you in advance

Answer: Simple UCSC to transcript annotation
2.3 years ago, James W. MacDonald (United States) wrote:
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)

> ex <- exons(TxDb.Hsapiens.UCSC.hg19.knownGene)

> gr <- GRanges(c("chr1","chr2"), IRanges(c(48902063,202340342), c(48902085, 202340364)))

> subsetByOverlaps(ex, gr)
GRanges object with 1 range and 1 metadata column:
      seqnames                 ranges strand |   exon_id
         <Rle>              <IRanges>  <Rle> | <integer>
  [1]     chr2 [202340342, 202340465]      + |     34935
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
>

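So the answers are: no for the chr1 region, and yes for the chr2 region, which is part of an exon (it lies inside the exon spanning chr2:202340342-202340465).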
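If a plain yes/no per region is all that is needed, the same check can be written with overlapsAny(), which returns one logical value per query range instead of the matching exons. This is only a sketch built on the objects from the answer above; the names hits and inside are made up for illustration, and the type = "within" call is an optional stricter variant that was not part of the original answer.

```r
library(GenomicRanges)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

# All annotated exons in the hg19 knownGene track, as one GRanges
ex <- exons(TxDb.Hsapiens.UCSC.hg19.knownGene)

# The two query regions from the question
gr <- GRanges(c("chr1", "chr2"),
              IRanges(c(48902063, 202340342), c(48902085, 202340364)))

# One TRUE/FALSE per query region: does it overlap any exon at all?
hits <- overlapsAny(gr, ex)

# Stricter variant: does the region lie entirely inside a single exon?
inside <- overlapsAny(gr, ex, type = "within")

# Small summary table, one row per query region
data.frame(region = as.character(gr),
           overlaps_exon = hits,
           within_exon = inside)
```

Given the subsetByOverlaps() output quoted above, hits should be FALSE for the chr1 region and TRUE for the chr2 region.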

Answer: Simple UCSC to transcript annotation
2.3 years ago, AntonS wrote:

Thank you, but this object is again nested at least three levels deep and horrible to handle. I have never had such a hard-to-handle package.

Now I just need a simple data.frame with these columns:

Chr     ExonStart     ExonEnd     ExonNr     Gene


If you want to add a comment, please use the ADD COMMENT link. The Add your answer box below is for people to add answers, not more questions or comments.

I have no idea what you mean by 'nested in at least 3 levels'. Are you complaining that it took three lines of code to generate an answer?

If so, there are always tradeoffs to be made - you can make things really simple and straightforward, but the cost to that is you force people to do what you think they should do, and make it difficult to do other things that they might want to do. The alternative is to make things very powerful, but the cost to that is complexity. All of the Bioconductor infrastructure for dealing with genomic position data is very powerful, but at the same time very complex. What people gain from the complexity is the ability to do lots of things that a simpler API would likely prevent.

There is extensive documentation for all of the objects that I have generated, so I would point you to the help pages, as well as the vignettes for the GenomicRanges package. As a hint towards what you want, do note that

> ex <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, "gene")
> unlist(ex)
GRanges object with 272776 ranges and 2 metadata columns:
       seqnames               ranges strand |   exon_id   exon_name
          <Rle>            <IRanges>  <Rle> | <integer> <character>
     1    chr19 [58858172, 58858395]      - |    250809        <NA>
     1    chr19 [58858719, 58859006]      - |    250810        <NA>
     1    chr19 [58859832, 58860494]      - |    250811        <NA>
     1    chr19 [58860934, 58862017]      - |    250812        <NA>
     1    chr19 [58861736, 58862017]      - |    250813        <NA>
   ...      ...                  ...    ... .       ...         ...
  9997    chr22 [50961997, 50962853]      - |    266958        <NA>
  9997    chr22 [50963871, 50964033]      - |    266960        <NA>
  9997    chr22 [50963901, 50964034]      - |    266961        <NA>
  9997    chr22 [50964430, 50964570]      - |    266963        <NA>
  9997    chr22 [50964675, 50964905]      - |    266965        <NA>
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome

gives you a GRanges object whose names are the Entrez Gene IDs, which is, I presume, what you wanted for the Gene column.

Reply written 2.3 years ago by James W. MacDonald

Also, as.data.frame() will give the simple data.frame that the user desires.

Reply written 2.3 years ago by Martin Morgan
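Putting the two replies together, a flat table along the lines of the one requested could be built roughly as follows. This is a sketch, not the repliers' own code: exons_to_df() is a hypothetical helper written for this example, and the column names simply mirror the ones asked for in the thread. The columns are assembled from the standard GRanges accessors so that the Entrez Gene IDs end up in an ordinary Gene column rather than in row names. ExonNr is left out on the assumption that exon rank is tracked per transcript, so the by-gene grouping does not carry it.

```r
library(GenomicRanges)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

# Exons grouped by gene, then flattened; names(ex) are the Entrez Gene IDs
ex <- unlist(exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by = "gene"))

# Hypothetical helper (not part of any package): GRanges -> plain data.frame
exons_to_df <- function(x) {
  data.frame(
    Chr       = as.character(seqnames(x)),
    ExonStart = start(x),
    ExonEnd   = end(x),
    Strand    = as.character(strand(x)),
    ExonId    = mcols(x)$exon_id,
    Gene      = names(x),
    stringsAsFactors = FALSE
  )
}

# Every exon that is assigned to a gene, one row per exon
exon_df <- exons_to_df(ex)
head(exon_df)

# Only the exons that overlap the regions from the question
gr <- GRanges(c("chr1", "chr2"),
              IRanges(c(48902063, 202340342), c(48902085, 202340364)))
hit_df <- exons_to_df(subsetByOverlaps(ex, gr))
hit_df
```

Because exon_df is an ordinary data.frame, it can be filtered, merged, or written out with the usual base R tools such as subset(), merge(), or write.csv().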