Question

Loading alternative genomes into ggbio

daniel.antony.pass • 0

I have been trying out ggbio for the karyogram figure generation, and everything works with the hg19 dataset as in the manual, but I don't understand how to load my own species of interest or which format the data has to be in.

Advice would be appreciated!

What I've used for the hg19:

library(ggbio)
library(GenomeInfoDb)  # provides keepSeqlevels()
data(hg19IdeogramCyto, package = "biovizBase")
hg19 <- keepSeqlevels(hg19IdeogramCyto, paste0("chr", c(1:22, "X", "Y")))
autoplot(hg19, layout = "karyogram", cytoband = TRUE)

Thanks

ggbio • 2.5k views


score 1 · Answer 1 · 2015-09-11

You could always look at the data you are currently using and infer from that what is expected, no?

> hg19IdeogramCyto
GRanges object with 862 ranges and 2 metadata columns:
        seqnames               ranges strand   |     name gieStain
           <Rle>            <IRanges>  <Rle>   | <factor> <factor>
    [1]     chr1  [      0,  2300000]      *   |   p36.33     gneg
    [2]     chr1  [2300000,  5400000]      *   |   p36.32   gpos25
    [3]     chr1  [5400000,  7200000]      *   |   p36.31     gneg
    [4]     chr1  [7200000,  9200000]      *   |   p36.23   gpos25
    [5]     chr1  [9200000, 12700000]      *   |   p36.22     gneg
    ...      ...                  ...    ... ...      ...      ...
  [858]     chrY [15100000, 19800000]      *   |  q11.221   gpos50
  [859]     chrY [19800000, 22100000]      *   |  q11.222     gneg
  [860]     chrY [22100000, 26200000]      *   |  q11.223   gpos50
  [861]     chrY [26200000, 28800000]      *   |   q11.23     gneg
  [862]     chrY [28800000, 59373566]      *   |      q12     gvar
  -------
  seqinfo: 24 sequences from an unspecified genome; no seqlengths
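
Reading the printout above, the input is a GRanges with one range per band plus `name` and `gieStain` columns. For a species with no published ideogram, an object of the same shape could be assembled by hand. The following is only a minimal sketch: the chromosome names and lengths are made up, and using a single "gneg" stain per chromosome is an assumption, not something ggbio is documented here to require.

```r
library(GenomicRanges)

# Hypothetical chromosome names and lengths; replace them with the real
# values for your assembly (for example from a FASTA index).
chrom_sizes <- c(chrA = 1500000L, chrB = 900000L, chrC = 400000L)

# One range per chromosome spanning its full length, carrying the same
# two metadata columns that hg19IdeogramCyto has.
my_genome <- GRanges(
  seqnames = names(chrom_sizes),
  ranges   = IRanges(start = 1L, end = unname(chrom_sizes)),
  name     = names(chrom_sizes),
  gieStain = rep("gneg", length(chrom_sizes))  # placeholder stain (assumption)
)

# Record the chromosome lengths in the object's seqinfo.
seqlengths(my_genome) <- chrom_sizes
```

An object like this can then be passed to autoplot(my_genome, layout = "karyogram") in the same way as hg19 in the question. Whether cytoband = TRUE gives a sensible picture with placeholder stains is untested here.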

So you need a GRanges with these extra metadata columns (name and gieStain). Let's look at AnnotationHub.

> library(AnnotationHub)
> hub <- AnnotationHub()

> cyto <- query(hub, c("cytoband"))

> cyto
AnnotationHub with 7 records
# snapshotDate(): 2015-08-26
# $dataprovider: UCSC
# $species: Homo sapiens, Drosophila melanogaster, Mus musculus, Rattus norv...
# $rdataclass: GRanges
# additional mcols(): taxonomyid, genome, description, tags, sourceurl,
#   sourcetype
# retrieve records with, e.g., 'object[["AH5012"]]'

           title          
  AH5012 | Chromosome Band
  AH5129 | Chromosome Band
  AH5292 | Chromosome Band
  AH5416 | Chromosome Band
  AH6158 | Chromosome Band
  AH6379 | Chromosome Band
  AH6810 | Chromosome Band
> cyto$species
[1] "Homo sapiens"            "Homo sapiens"           
[3] "Homo sapiens"            "Homo sapiens"           
[5] "Mus musculus"            "Rattus norvegicus"      
[7] "Drosophila melanogaster"

> cyto[[1]]
require(“GenomicRanges”)
retrieving 1 resources
  |======================================================================| 100%
UCSC track 'cytoBand'
UCSCData object with 862 ranges and 1 metadata column:
        seqnames               ranges strand   |        name
           <Rle>            <IRanges>  <Rle>   | <character>
    [1]     chr1  [      1,  2300000]      *   |      p36.33
    [2]     chr1  [2300001,  5400000]      *   |      p36.32
    [3]     chr1  [5400001,  7200000]      *   |      p36.31
    [4]     chr1  [7200001,  9200000]      *   |      p36.23
    [5]     chr1  [9200001, 12700000]      *   |      p36.22
    ...      ...                  ...    ... ...         ...
  [858]    chr22 [37600001, 41000000]      *   |       q13.1
  [859]    chr22 [41000001, 44200000]      *   |       q13.2
  [860]    chr22 [44200001, 48400000]      *   |      q13.31
  [861]    chr22 [48400001, 49400000]      *   |      q13.32
  [862]    chr22 [49400001, 51304566]      *   |      q13.33
  -------
  seqinfo: 93 sequences from hg19 genome
There were 33 warnings (use warnings() to see them)

I don't know what you mean by 'my own species'. That's presumably human, but maybe I am being too literal ;-D. Anyway, if you care about human, mouse, rat, or fly, you are like 75% of the way there. All you need is the staining information (the gieStain column), which is presumably somewhere a Google search can find.
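
One place the staining information lives is UCSC's cytoBand table, whose columns are chrom, chromStart, chromEnd, name and gieStain. As a rough sketch, assuming such a table has already been read into a data frame (the three rows below are simply the first three hg19 bands printed earlier, and `band_table` is a hypothetical name), it could be turned into a GRanges like this:

```r
library(GenomicRanges)

# Stand-in for a downloaded cytoBand table; in practice this would come
# from something like read.delim() on the file for your genome.
band_table <- data.frame(
  chrom      = c("chr1", "chr1", "chr1"),
  chromStart = c(0L, 2300000L, 5400000L),
  chromEnd   = c(2300000L, 5400000L, 7200000L),
  name       = c("p36.33", "p36.32", "p36.31"),
  gieStain   = c("gneg", "gpos25", "gneg")
)

# Convert to GRanges, keeping name and gieStain as metadata columns.
# UCSC tables use 0-based starts, so shift them to 1-based coordinates.
bands <- makeGRangesFromDataFrame(
  band_table,
  keep.extra.columns      = TRUE,
  seqnames.field          = "chrom",
  start.field             = "chromStart",
  end.field               = "chromEnd",
  starts.in.df.are.0based = TRUE
)
```

The result has the same columns as hg19IdeogramCyto, with starts matching the 1-based coordinates shown in the AnnotationHub record.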