How can I find the location of a gene based on cytogenetic bands (e.g. 7q31.2)?
@arman-shahrisa-7713

How can I find the location of a gene based on cytogenetic bands (e.g. 7q31.2)? Is there a package that lets me convert a gene symbol to its cytogenetic band?

bioconductor cytogenetics • 1.7k views
@martin-morgan-1513

Organism.dplyr provides another fun option, exposing the org.* and TxDb.* annotation databases as dplyr tables:

> library(Organism.dplyr)
> src = src_ucsc("Human")
using org.Hs.eg.db, TxDb.Hsapiens.UCSC.hg38.knownGene
> src
src:  sqlite 3.19.3 [/home/mtmorgan/.cache/BiocFileCache/5b377bf41425_5b377bf41425]
tbls: id, id_accession, id_go, id_go_all, id_omim_pm, id_protein,
  id_transcript, ranges_cds, ranges_exon, ranges_gene, ranges_tx
> tbl(src, "id") %>% filter(symbol %like% "BRCA%") %>% dplyr::select(map, symbol) %>% distinct()
# Source:   lazy query [?? x 2]
# Database: sqlite 3.19.3
#   [/home/mtmorgan/.cache/BiocFileCache/5b377bf41425_5b377bf41425]
       map  symbol
     <chr>   <chr>
1     5p12 BRCAT54
2 17q21.31 BRCA1P1
3    13q21   BRCA3
4 17q21.31   BRCA1
5  13q13.1   BRCA2
6    11q23  BRCATA
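
The `map` values above follow the usual cytogenetic notation: chromosome, then arm (`p` or `q`), then band. If you need those pieces separately, for example to match against a table keyed by chromosome, a small base R helper can split them. This is only a sketch: `parse_cytoband` is a hypothetical name, not a function from any package, and it deliberately returns `NA` for anything that is not a single simple band (ranges such as `1p36.1-p35` are not handled).

```r
# Hypothetical helper (not part of any package): split a cytoband label
# such as "7q31.2" into chromosome, arm, and band.
parse_cytoband <- function(x) {
  # chromosome (1-2 digits, X or Y), arm (p or q), band digits with optional dot
  pattern <- "^([0-9]{1,2}|X|Y)([pq])([0-9.]*)$"
  ok <- grepl(pattern, x)
  data.frame(
    label      = x,
    chromosome = ifelse(ok, sub(pattern, "\\1", x), NA_character_),
    arm        = ifelse(ok, sub(pattern, "\\2", x), NA_character_),
    band       = ifelse(ok, sub(pattern, "\\3", x), NA_character_),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_cytoband(c("7q31.2", "13q13.1", "17q21.31", "5p12"))
print(parsed)
```

Labels that do not match the pattern come back with `NA` in all three parsed columns, so they are easy to filter out or inspect by hand.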

 


I forgot about the MAP column. You can get this directly without having to engage with any of that tidyverse nonsense <shakes cane at kids on lawn>.

> library(Homo.sapiens)
> select(Homo.sapiens, "BRCA2", "MAP", "SYMBOL")
'select()' returned 1:1 mapping between keys and columns
  SYMBOL     MAP
1  BRCA2 13q13.1
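
The same lookup should also work for several symbols at once. Below is a minimal sketch, assuming org.Hs.eg.db is installed and exposes a `MAP` column (the other answer in this thread reports using org.Hs.eg.db and shows a `map` column, which suggests it does). The call is written as `AnnotationDbi::select()` with named arguments so that it does not collide with `dplyr::select()` if dplyr happens to be attached.

```r
# Assumption: org.Hs.eg.db is installed and has a MAP column.
library(org.Hs.eg.db)

symbols <- c("BRCA1", "BRCA2")

# Namespace-qualified to avoid masking by dplyr::select()
bands <- AnnotationDbi::select(
  org.Hs.eg.db,
  keys    = symbols,
  columns = "MAP",
  keytype = "SYMBOL"
)
print(bands)
```

The result is a plain data.frame with one row per symbol/band pair, so symbols with more than one annotated band would show up as multiple rows.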
@james-w-macdonald-5106

The biovizBase package has a GRanges of the hg19 cytogenetic bands, so you could use that. There is also a GRanges with gene symbols in that package, which makes it super convenient. Let's say we want the cytoband for BRCA2:

> library(biovizBase)
> data(hg19IdeogramCyto)
> data(genesymbol)
> subsetByOverlaps(hg19IdeogramCyto, genesymbol["BRCA2",])
GRanges object with 1 range and 2 metadata columns:
      seqnames               ranges strand |     name gieStain
         <Rle>            <IRanges>  <Rle> | <factor> <factor>
  [1]    chr13 [32200000, 34000000]      * |    q13.1   gpos50
  -------
  seqinfo: 24 sequences from an unspecified genome; no seqlengths

So BRCA2 is in 13q13.1. Is that what you are after?
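
Going the other way (from a band such as 7q31.2 to the genes in it) can be sketched with the same two objects by swapping the roles in `subsetByOverlaps()`. This assumes, based on the output above, that `hg19IdeogramCyto` stores the chromosome in `seqnames` as `chr7` and the band in the `name` column as `q31.2`, and that `genesymbol` is named by gene symbol. Both objects are hg19, so the result is hg19 only.

```r
# Assumptions: biovizBase is installed; band labels are stored as
# seqnames "chr7" plus name "q31.2", as in the BRCA2 output above.
library(biovizBase)
library(GenomicRanges)

data(hg19IdeogramCyto)
data(genesymbol)

# Pick out the single band 7q31.2
is_band <- as.character(seqnames(hg19IdeogramCyto)) == "chr7" &
  as.character(hg19IdeogramCyto$name) == "q31.2"
band <- hg19IdeogramCyto[is_band]

# All gene ranges overlapping that band
genes <- subsetByOverlaps(genesymbol, band)
print(head(names(genes)))
```

Genes that straddle a band boundary will be returned for both neighbouring bands, since this is a plain overlap test.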


Exactly! Thank you very much.
