Question: From a GRanges object to a GRangesList
Patrick Schorderet wrote (3.4 years ago):

Hi All,

I'm trying to create a GRangesList from a GRanges object. I have a GRanges object called miRNA with 304 ranges. I also have a function that requires a GRangesList as input, so I am trying to convert the GRanges object to a list format in which each element of the list contains a single range. I am sure there is an easy way to do this, but I can't wrap my head around it.

Thanks for any help.

> miRNA
GRanges object with 304 ranges and 3 metadata columns:
        seqnames               ranges strand   |     tx_name         gene_id     tx_type
           <Rle>            <IRanges>  <Rle>   | <character> <CharacterList> <character>
    [1]       2L   [ 857596,  857617]      +   | FBtr0304183     FBgn0262177       miRNA
    [2]       2L   [2737568, 2737589]      +   | FBtr0309710     FBgn0263564       miRNA
    [3]       2L   [3767667, 3767688]      +   | FBtr0304276     FBgn0262392       miRNA
    [4]       2L   [4343736, 4343756]      +   | FBtr0304206     FBgn0262375       miRNA
    [5]       2L   [5068596, 5068617]      +   | FBtr0304262     FBgn0262213       miRNA
    ...      ...                  ...    ... ...         ...             ...         ...
  [300]        X [16191270, 16191291]      -   | FBtr0304409     FBgn0262283       miRNA
  [301]        X [19169201, 19169222]      -   | FBtr0309726     FBgn0263571       miRNA
  [302]        X [21296386, 21296407]      -   | FBtr0304484     FBgn0262454       miRNA
  [303]        X [21298279, 21298300]      -   | FBtr0309696     FBgn0263559       miRNA
  [304]        X [22833545, 22833566]      -   | FBtr0304314     FBgn0262237       miRNA
  -------
  seqinfo: 1870 sequences from an unspecified genome

James W. MacDonald wrote (3.4 years ago):

It's as obvious as you might think:

> mirnas <- microRNAs(TxDb.Hsapiens.UCSC.hg19.knownGene)
Loading required package: mirbase.db
> mirnas
GRanges object with 1595 ranges and 1 metadata column:
         seqnames                 ranges strand   |       mirna_id
            <Rle>              <IRanges>  <Rle>   |    <character>
     [1]     chr1     [  30366,   30503]      +   | hsa-mir-1302-2
     [2]     chr1     [ 567705,  567793]      -   |   hsa-mir-6723
     [3]     chr1     [1102484, 1102578]      +   |   hsa-mir-200b
     [4]     chr1     [1103243, 1103332]      +   |   hsa-mir-200a
     [5]     chr1     [1104385, 1104467]      +   |    hsa-mir-429
     ...      ...                    ...    ... ...            ...
  [1591]     chrX [154115635, 154115733]      -   | hsa-mir-1184-1
  [1592]     chrX [154612749, 154612847]      -   | hsa-mir-1184-2
  [1593]     chrX [154687178, 154687276]      +   | hsa-mir-1184-3
  [1594]     chrY [  1362811,   1362885]      +   | hsa-mir-3690-2
  [1595]     chrY [  2477232,   2477295]      +   | hsa-mir-6089-2
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
> as(mirnas, "GRangesList")
GRangesList object of length 1595:
[[1]]
GRanges object with 1 range and 1 metadata column:
      seqnames         ranges strand |       mirna_id
         <Rle>      <IRanges>  <Rle> |    <character>
  [1]     chr1 [30366, 30503]      + | hsa-mir-1302-2

[[2]]
GRanges object with 1 range and 1 metadata column:
      seqnames           ranges strand |     mirna_id
  [1]     chr1 [567705, 567793]      - | hsa-mir-6723

[[3]]
GRanges object with 1 range and 1 metadata column:
      seqnames             ranges strand |     mirna_id
  [1]     chr1 [1102484, 1102578]      + | hsa-mir-200b

...
<1592 more elements>
-------
seqinfo: 93 sequences (1 circular) from hg19 genome
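
For readers who do not have the annotation package installed, the same coercion can be tried on a small hand-made object. This is only a minimal sketch: the toy `gr` object, its coordinates, and its `mirna_id` values are made up for illustration and are not taken from the data above.

```r
library(GenomicRanges)

# Toy GRanges standing in for the miRNA object (all values are invented)
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(101, 501, 901), width = 22),
  strand   = c("+", "-", "+"),
  mirna_id = c("mir-a", "mir-b", "mir-c")
)

# Coerce to a GRangesList: each range becomes its own list element
grl <- as(gr, "GRangesList")

# Optionally name the elements using the metadata column
names(grl) <- gr$mirna_id

length(grl)   # one element per range, so the same as length(gr)
names(grl)
```

Each element of `grl` is a GRanges of length one, which is the shape the function in the question expects.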

Great, that's exactly what I was looking for.

Is there an easy way to use the mirna_id values as names for the list elements, and to do this in one go?

Reply written 3.4 years ago by Patrick Schorderet.

> mirs <- microRNAs(TxDb.Hsapiens.UCSC.hg19.knownGene)
> nam <- mcols(mirs)[,1]
> mirs <- as(mirs, "GRangesList")
> names(mirs) <- nam
> mirs
GRangesList object of length 1595:
$hsa-mir-1302-2
GRanges object with 1 range and 1 metadata column:
      seqnames         ranges strand |       mirna_id
         <Rle>      <IRanges>  <Rle> |    <character>
  [1]     chr1 [30366, 30503]      + | hsa-mir-1302-2

$hsa-mir-6723
GRanges object with 1 range and 1 metadata column:
      seqnames           ranges strand |     mirna_id
  [1]     chr1 [567705, 567793]      - | hsa-mir-6723

$hsa-mir-200b
GRanges object with 1 range and 1 metadata column:
      seqnames             ranges strand |     mirna_id
  [1]     chr1 [1102484, 1102578]      + | hsa-mir-200b

...
<1592 more elements>
-------
seqinfo: 93 sequences (1 circular) from hg19 genome

Reply written 3.4 years ago by James W. MacDonald.

Two other solutions:

grl1 = setNames(as(mirs, "GRangesList"), mirs$mirna_id)
split(mirs, mirs$mirna_id)
## also: splitAsList(mirs, mirs$mirna_id)

and for the first a sanity check

> identical(names(grl1), unlist(grl1)$mirna_id)
[1] TRUE

Reply written 3.4 years ago by Martin Morgan.

But please be aware that using split() doesn't give you the same result:

> grl2 <- split(mirs, mirs$mirna_id)
> identical(grl1, grl2)
[1] FALSE

This is because the mirna_id metadata column contains some duplicates, which causes split() to generate some list elements with more than 1 range in them:

> table(elementLengths(grl2))
   1    2 
1593    1 

> grl2[elementLengths(grl2) != 1]
GRangesList object of length 1:
$hsa-mir-6511b-1 
GRanges object with 2 ranges and 1 metadata column:
      seqnames               ranges strand |        mirna_id
         <Rle>            <IRanges>  <Rle> |     <character>
  [1]    chr16 [ 2156670,  2156754]      - | hsa-mir-6511b-1
  [2]    chr16 [15227923, 15228007]      - | hsa-mir-6511b-1

-------
seqinfo: 93 sequences (1 circular) from hg19 genome

This might or might not be what you want.
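
The difference can be reproduced on a small example. The sketch below assumes a toy `gr` object with a deliberately duplicated `mirna_id` (all values invented), and uses `elementNROWS()`, which is the name `elementLengths()` goes by in more recent S4Vectors releases.

```r
library(GenomicRanges)

# Toy GRanges in which the id "mir-a" occurs twice (all values are invented)
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(101, 501, 901), width = 22),
  strand   = c("-", "+", "-"),
  mirna_id = c("mir-a", "mir-b", "mir-a")
)

# Check for duplicated ids before deciding which approach to use
has_dups <- anyDuplicated(gr$mirna_id) > 0

# Coercion: always one range per element; names may be duplicated
by_range <- setNames(as(gr, "GRangesList"), gr$mirna_id)

# split(): one element per unique id; ranges sharing an id are grouped
by_id <- split(gr, gr$mirna_id)

length(by_range)      # equals length(gr)
elementNROWS(by_id)   # number of ranges in each group
```

If every list element must hold exactly one range, the coercion is the safer choice; if ranges sharing an id should be grouped together, `split()` does that directly.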

H.

Reply written 3.4 years ago by Hervé Pagès.