sortSeqlevels does not sort the ranges of a GRanges object
Haiying.Kong ▴ 110
@haiyingkong-9254
Last seen 5.7 years ago
Germany

Could someone please help me? After applying sortSeqlevels() to a GRanges object, the ranges are still not sorted.

> class(a)
[1] "GRanges"
attr(,"package")
[1] "GenomicRanges"
> a
GRanges object with 10 ranges and 4 metadata columns:
       seqnames                 ranges strand | sampleName     median      mean
          <Rle>              <IRanges>  <Rle> |   <factor>  <numeric> <numeric>
   [1]        1 [ 12978123,  13001479]      * |    T127192 -1.0000000   -1.3819
   [2]       16 [ 66757067,  66862007]      * |   TBM12913  0.9021002    0.7706
   [3]        6 [153308946, 153311149]      * |      T3503 -0.9817280   -1.1798
   [4]        7 [ 44076308,  44081461]      * |        T34 -0.8688574   -1.1203
   [5]        Y [ 21477864,  24457072]      * |      T2765 -2.5616377   -2.4304
   [6]       10 [ 51361667,  51371756]      * |    T129795 -5.3192409   -4.2112
   [7]       15 [100330954, 100348572]      * |   TBM12913 -0.9999975   -1.3341
   [8]       10 [105882524, 105963528]      * |     T74139  0.4485208    0.5094
   [9]       22 [ 42522572,  42546862]      * |     T10380 -0.9801241   -1.1755
  [10]       11 [ 77531568,  77633436]      * |    T130307  0.6735005    0.6745
                CN
       <character>
   [1]         CN1
   [2]         CN3
   [3]         CN1
   [4]         CN1
   [5]         CN0
   [6]         CN0
   [7]         CN1
   [8]         CN3
   [9]         CN1
  [10]         CN3
  -------
  seqinfo: 24 sequences from an unspecified genome; no seqlengths
> sortSeqlevels(a)
GRanges object with 10 ranges and 4 metadata columns:
       seqnames                 ranges strand | sampleName     median      mean
          <Rle>              <IRanges>  <Rle> |   <factor>  <numeric> <numeric>
   [1]        1 [ 12978123,  13001479]      * |    T127192 -1.0000000   -1.3819
   [2]       16 [ 66757067,  66862007]      * |   TBM12913  0.9021002    0.7706
   [3]        6 [153308946, 153311149]      * |      T3503 -0.9817280   -1.1798
   [4]        7 [ 44076308,  44081461]      * |        T34 -0.8688574   -1.1203
   [5]        Y [ 21477864,  24457072]      * |      T2765 -2.5616377   -2.4304
   [6]       10 [ 51361667,  51371756]      * |    T129795 -5.3192409   -4.2112
   [7]       15 [100330954, 100348572]      * |   TBM12913 -0.9999975   -1.3341
   [8]       10 [105882524, 105963528]      * |     T74139  0.4485208    0.5094
   [9]       22 [ 42522572,  42546862]      * |     T10380 -0.9801241   -1.1755
  [10]       11 [ 77531568,  77633436]      * |    T130307  0.6735005    0.6745
                CN
       <character>
   [1]         CN1
   [2]         CN3
   [3]         CN1
   [4]         CN1
   [5]         CN0
   [6]         CN0
   [7]         CN1
   [8]         CN3
   [9]         CN1
  [10]         CN3
  -------
  seqinfo: 24 sequences from an unspecified genome; no seqlengths

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: openSUSE 13.1 (Bottle) (x86_64)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] Biobase_2.34.0       BiocInstaller_1.24.0 rtracklayer_1.34.2
[4] cn.mops_1.20.1       GenomicRanges_1.26.4 GenomeInfoDb_1.10.3
[7] IRanges_2.8.2        S4Vectors_0.12.2     BiocGenerics_0.20.0

loaded via a namespace (and not attached):
 [1] lattice_0.20-35            XML_3.98-1.9
 [3] Rsamtools_1.26.2           Biostrings_2.42.1
 [5] snow_0.4-2                 GenomicAlignments_1.10.1
 [7] bitops_1.0-6               grid_3.3.3
 [9] exomeCopy_1.20.0           zlibbioc_1.20.0
[11] XVector_0.14.1             Matrix_1.2-10
[13] BiocParallel_1.8.2         tools_3.3.3
[15] RCurl_1.95-4.8             SummarizedExperiment_1.4.0

GRanges • 1.5k views
@michael-lawrence-3846
Last seen 3.0 years ago
United States

That's expected. sortSeqlevels() only sorts the seqlevels, not the ranges. To sort the ranges, call sort(a) after calling sortSeqlevels(), but note that sort() orders by more than just the seqnames: it also takes strand, start, and end into account.
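
To make the distinction concrete, here is a minimal sketch on toy data. The object `gr` and its coordinates are made up for illustration and are not the poster's data; the seqlevel order is set explicitly through `Seqinfo()` so that it starts out unsorted.

```r
library(GenomicRanges)

# Hypothetical toy ranges; seqlevels deliberately given in an unsorted order
gr <- GRanges(
  seqnames = c("16", "1", "Y", "10", "10"),
  ranges   = IRanges(start = c(500, 100, 300, 900, 200), width = 50),
  seqinfo  = Seqinfo(c("16", "1", "Y", "10"))
)

# Step 1: reorder the seqlevels only; the ranges stay in their original order
gr2 <- sortSeqlevels(gr)
seqlevels(gr2)                 # seqlevels now in natural chromosome order
as.character(seqnames(gr2))    # same order of ranges as in gr

# Step 2: sort the ranges themselves; sort() follows the seqlevel order,
# then breaks ties by strand, start, and end
gr3 <- sort(gr2)
gr3
```

Calling sort() without sortSeqlevels() first would still sort the ranges, but according to whatever seqlevel order the object currently has, which may not be the natural chromosome order.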


Thank you very much for your quick reply.

sort(a) did the job.

In the cn.mops source code:

            cnvrR <- GenomicRanges::reduce(GRanges(seqnames = segDfSubset$chr,
                IRanges(segDfSubset$start, segDfSubset$end),
                seqinfo = seqinfo(grAllRegions)))
            cnvrR <- sortSeqlevels(cnvrR)

I thought cnvrR was a GRanges object.
