Hello,
I am mapping genes to their chromosomal regions, and I found some genes whose regions in R differ from those shown in the UCSC Genome Browser. For example, gene 4499 (MT1M) is at chr16:56632233-56633986 in the UCSC Genome Browser, but in R it is at chr16:56617461-56633986, because R reports two transcripts for it while the UCSC Genome Browser shows only one. "uc002ejn.3" is the same in R and in the UCSC Genome Browser, but I can't find "uc010vhe.2" in the UCSC Genome Browser. Any suggestions? Thanks!
Here is the R code:
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(GenomicRanges)
geneDb <- TxDb.Hsapiens.UCSC.hg38.knownGene
# One range per gene
allGeneRange <- genes(geneDb)
allGeneRange["4499"]
#GRanges object with 1 range and 1 metadata column:
# seqnames ranges strand | gene_id
# <Rle> <IRanges> <Rle> | <character>
# 4499 chr16 [56617461, 56633986] + | 4499
# -------
# seqinfo: 455 sequences (1 circular) from hg38 genome
# Transcripts grouped by gene
txs <- transcriptsBy(geneDb, by = "gene")
txs["4499"]
#GRangesList object of length 1:
#$4499
#GRanges object with 2 ranges and 2 metadata columns:
# seqnames ranges strand | tx_id tx_name
# <Rle> <IRanges> <Rle> | <integer> <character>
# [1] chr16 [56617461, 56633986] + | 68646 uc010vhe.2
# [2] chr16 [56632622, 56633986] + | 68649 uc002ejn.3
#-------
#seqinfo: 455 sequences (1 circular) from hg38 genome
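To check that the wider gene range is simply the span of the two transcripts, here is a minimal base-R sketch. It assumes that the gene range reported by genes() runs from the leftmost transcript start to the rightmost transcript end; the data frame `tx` and the result `gene_span` are names made up for this illustration, and the coordinates are copied by hand from the transcriptsBy() output above.

```r
# Transcript coordinates copied from the transcriptsBy() output above
# (hypothetical data frame, not produced by any Bioconductor call).
tx <- data.frame(
  gene_id = c("4499", "4499"),
  tx_name = c("uc010vhe.2", "uc002ejn.3"),
  start   = c(56617461L, 56632622L),
  end     = c(56633986L, 56633986L),
  stringsAsFactors = FALSE
)

# Per gene: smallest transcript start and largest transcript end.
starts <- tapply(tx$start, tx$gene_id, min)
ends   <- tapply(tx$end, tx$gene_id, max)

gene_span <- data.frame(
  gene_id = names(starts),
  start   = as.integer(starts),
  end     = as.integer(ends),
  stringsAsFactors = FALSE
)

print(gene_span)
```

Under that assumption the span for gene 4499 is 56617461-56633986, which matches the range from genes(), so the difference from the UCSC Genome Browser seems to come entirely from the extra transcript "uc010vhe.2".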
sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] TxDb.Hsapiens.UCSC.hg38.knownGene_3.1.3
[2] GenomicFeatures_1.24.5
[3] AnnotationDbi_1.34.4
[4] Biobase_2.32.0
[5] GenomicRanges_1.24.3
[6] GenomeInfoDb_1.8.7
[7] IRanges_2.6.1
[8] S4Vectors_0.10.3
[9] BiocGenerics_0.18.0
loaded via a namespace (and not attached):
[1] XML_3.98-1.4 Rsamtools_1.24.0
[3] Biostrings_2.40.2 GenomicAlignments_1.8.4
[5] bitops_1.0-6 DBI_0.5-1
[7] RSQLite_1.0.0 zlibbioc_1.18.0
[9] XVector_0.12.1 BiocParallel_1.6.6
[11] tools_3.3.1 biomaRt_2.28.0
[13] RCurl_1.95-4.8 rtracklayer_1.32.2
[15] SummarizedExperiment_1.2.3