Dear Bioconductors,
This is a bit of a curiosity question. I have been working with the TxDb.Hsapiens.UCSC.hg19.knownGene package and noticed that there are some exons that do not seem to be part of any gene.
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> # get all the genes
> genic.regions <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene)
> # get all the exons
> exonic.regions <- exons(TxDb.Hsapiens.UCSC.hg19.knownGene)
> # find the overlaps between the genes (query) and the exons (subject)
> findOverlaps(genic.regions, exonic.regions)
Hits object with 270213 hits and 0 metadata columns:
           queryHits subjectHits
           <integer>   <integer>
       [1]         1      250809
       [2]         1      250810
       [3]         1      250811
       [4]         1      250812
       [5]         1      250813
       ...       ...         ...
  [270209]     23056      266961
  [270210]     23056      266962
  [270211]     23056      266963
  [270212]     23056      266964
  [270213]     23056      266965
  -------
  queryLength: 23056
  subjectLength: 289969
As you can see, there are nearly 290,000 exons but only about 270,000 gene-exon overlaps. Since each hit is a gene-exon pair, at least some 20,000 exons do not overlap any gene. I can see this clearly when I plot the genes and exons overlapping a fragment of a chromosome: there are a few exons (marked by the green triangle) that do not appear to be part of any gene. So my question is: what might they be, and how should I deal with them if, for instance, I am trying to get the coordinates of intronic or intergenic regions?
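
To make the counting concrete, here is a minimal sketch on toy data of how the number of hits differs from the number of distinct exons, and how the exons that overlap no gene could be pulled out. The objects toy.genes and toy.exons are made-up stand-ins for genic.regions and exonic.regions (the coordinates are invented), and I am assuming the default strand-aware overlap is what is wanted.

```r
library(GenomicRanges)

# Hypothetical stand-ins for genic.regions and exonic.regions;
# the coordinates below are invented for illustration only.
toy.genes <- GRanges("chr1",
                     IRanges(start = c(100, 200, 500), end = c(300, 350, 900)),
                     strand = c("+", "+", "-"))
toy.exons <- GRanges("chr1",
                     IRanges(start = c(100, 250, 400, 600, 600),
                             end   = c(150, 300, 450, 700, 700)),
                     strand = c("+", "+", "+", "-", "+"))

hits <- findOverlaps(toy.genes, toy.exons)

# Number of (gene, exon) pairs: an exon hit by two genes is counted twice.
n.pairs <- length(hits)

# Number of distinct exons that overlap at least one gene.
n.exons.in.genes <- length(unique(subjectHits(hits)))

# Exons that overlap no gene on the same strand.
orphan.exons <- toy.exons[!overlapsAny(toy.exons, toy.genes)]

n.pairs
n.exons.in.genes
length(orphan.exons)
```

On the real data the same pattern would presumably be exonic.regions[!overlapsAny(exonic.regions, genic.regions)]; passing ignore.strand = TRUE to overlapsAny() would also count overlaps on the opposite strand.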


I don't think your images are showing, if you have any.
Thanks, fixed it.