Bug in IRanges package?
Jim Nemesh ▴ 10
@jim-nemesh-4027
Last seen 9.6 years ago
Hi! I apologize if this is the wrong place to post a bug report. The maintainer of the package is listed as:

    Maintainer: Biocore Team c/o BioC user list

I have a small set of range info:

    start=c(187592960,246804638,193202778,151026713,150822234)
    end=c(187810199,246862141,193212714,151037738,150856715)
    mapRanges<-IRanges(start, end)
    tree<-IntervalTree(mapRanges)

If I search the data against itself like this:

    result=overlap(tree, mapRanges)@matchMatrix

         query subject
    [1,]     5       5
    [2,]     4       4
    [3,]     1       1
    [4,]     3       3
    [5,]     2       2

You can see each query matches only the corresponding subject. All is well in the world so far.

But if you do it this way:

    result2=overlap(tree)
    An object of class “RangesMatching”
    Slot "matchMatrix":
         query subject
    [1,]     1       5
    [2,]     2       4
    [3,]     3       1
    [4,]     4       3
    [5,]     5       2

It looks like item 1 is matching up to item 5, etc. Not so great.

If we sort the data first:

    start=sort(start)
    end=sort(end)
    mapRanges<-IRanges(start, end)
    tree<-IntervalTree(mapRanges)
    result2=overlap(tree)

    > result2
    An object of class “RangesMatching”
    Slot "matchMatrix":
         query subject
    [1,]     1       1
    [2,]     2       2
    [3,]     3       3
    [4,]     4       4
    [5,]     5       5

Things are once again right in the world.

I can't find anywhere in the documentation that sorting the data is a requirement when using a self-referential search. Is this a bug in the documentation, or in the code itself? I would assume that the IntervalTree class would perform the sorting itself if it's a requirement for the code to work correctly.

Otherwise, fantastic package that's saved me quite a bit of time.

Thanks

-Jim Nemesh
Patrick Aboyoun ★ 1.6k
@patrick-aboyoun-6734
Last seen 9.6 years ago
United States
Jim,

I was able to reproduce your IRanges bug report for BioC 2.5 / R 2.10 (please provide sessionInfo() information in the future to help others reproduce your results). The BioC 2.5 daily builds are no longer running so that bug will remain, but if you upgrade to R 2.11 beta and grab the latest IRanges package, you will see that this bug is fixed.

Patrick

    > start <- c(187592960,246804638,193202778,151026713,150822234)
    > end <- c(187810199,246862141,193212714,151037738,150856715)
    > mapRanges <- IRanges(start, end)
    > tree <- IntervalTree(mapRanges)
    >
    > findOverlaps(mapRanges, tree)
    An object of class "RangesMatching"
    Slot "matchMatrix":
         query subject
    [1,]     1       1
    [2,]     2       2
    [3,]     3       3
    [4,]     4       4
    [5,]     5       5

    Slot "DIM":
    [1] 5 5

    > findOverlaps(tree)
    An object of class "RangesMatching"
    Slot "matchMatrix":
         query subject
    [1,]     1       1
    [2,]     2       2
    [3,]     3       3
    [4,]     4       4
    [5,]     5       5

    Slot "DIM":
    [1] 5 5

    > sessionInfo()
    R version 2.11.0 beta (2010-04-12 r51689)
    i386-apple-darwin9.8.0

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] IRanges_1.5.78

    loaded via a namespace (and not attached):
    [1] tools_2.11.0

On 4/16/10 1:26 PM, Jim Nemesh wrote:
> [...]
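For readers who want to check the expected result without IRanges, here is a minimal base-R sketch using the data from the original post. It computes the self-overlaps by brute force and prints order(start). The subject column that overlap(tree) returned under BioC 2.5 (5 4 1 3 2) is the same as order(start), which is consistent with the one-argument method reporting positions in sorted order rather than original indices. That explanation is an inference from the output and is not confirmed anywhere in the thread. The names hits, expected, and sorted_order are illustrative only.

```r
# Data from the original post
start <- c(187592960, 246804638, 193202778, 151026713, 150822234)
end   <- c(187810199, 246862141, 193212714, 151037738, 150856715)

# Brute-force overlap test: closed ranges i and j overlap when
# start[i] <= end[j] and start[j] <= end[i].
le   <- outer(start, end, "<=")   # le[i, j] is start[i] <= end[j]
hits <- le & t(le)

# Index pairs (query, subject) of all overlapping ranges
expected <- which(hits, arr.ind = TRUE)
colnames(expected) <- c("query", "subject")
print(expected)

# Permutation that sorts the ranges by start position; compare it with
# the subject column of the incorrect overlap(tree) output in the post.
sorted_order <- order(start)
print(sorted_order)
```

Because the five ranges are disjoint, the brute-force result pairs every range only with itself, which is what findOverlaps(tree) returns in the fixed version shown above.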