Question: coverage() in IRanges gives >0 coverage when weight == 0
0
gravatar for Ludo Pagie
6.0 years ago by
Ludo Pagie40
Ludo Pagie40 wrote:
Hi all, I don't understand what's going on here.I'm trying to compute the coverage of a few ranges. I specify a weight for these ranges and for some positions I get a coverage even though I know these positions are only covered by ranges which have 0 weight. Worse, the value I do get for these positions depends on the number of ranges I feed into the function. Here's some code using extracts of the data I'm working on: ################################################################ ################################################################ # funny way to make the same RangedData object I have ranges <- as(structure(list(space = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L ), .Label = "CH321-10J13_YELLOW", class = "factor"), start = c(13449753L, 13449972L, 13449972L, 13450037L, 13450038L, 13450039L, 13450039L, 13450042L, 13450169L, 13450193L, 13450198L, 13450253L, 13450284L, 13450296L, 13450308L, 13450308L, 13450522L, 13450690L, 13450690L, 13450870L, 13450997L, 13451104L, 13451133L, 13451133L, 13451142L, 13451147L, 13451215L, 13451216L, 13451513L, 13451788L, 13451812L, 13451812L, 13451902L, 13451961L, 13451961L, 13452337L, 13452955L, 13453049L, 13453053L), end = c(13451753L, 13451573L, 13451573L, 13451908L, 13451921L, 13451700L, 13451700L, 13451675L, 13452208L, 13451895L, 13451971L, 13451731L, 13451553L, 13452172L, 13451502L, 13451502L, 13451806L, 13452630L, 13452630L, 13452711L, 13452527L, 13452571L, 13451378L, 13451378L, 13452533L, 13453030L, 13452281L, 13452439L, 13453079L, 13453360L, 13452603L, 13452603L, 13453506L, 13452789L, 13452789L, 13453123L, 13453484L, 13453773L, 13453387L ), width = c(2001L, 1602L, 1602L, 1872L, 1884L, 1662L, 1662L, 1634L, 2040L, 1703L, 1774L, 1479L, 1270L, 1877L, 1195L, 1195L, 1285L, 1941L, 1941L, 1842L, 1531L, 1468L, 246L, 246L, 1392L, 1884L, 1067L, 1224L, 1567L, 1573L, 792L, 792L, 1605L, 829L, 829L, 787L, 530L, 725L, 335L), count = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("space", "start", "end", "width", "count"), row.names = c(NA, -39L), class = "data.frame"),'RangedData') # make the weight vector count <- c(3.790678721025, 3.790678721025, 0, 0, 0, 1.263559573675, 2.52711914735, 6.317797868375, 159.20850628305, 0, 156.6813871357, 1.263559573675, 162.999185004075, 2.52711914735, 1139.73073545485, 0, 0, 0, 304.517857255675, 5.0542382947, 1.263559573675, 3.790678721025, 1.263559573675, 0, 6.317797868375, 32.85254891555, 0, 451.090767801975, 0, 17.68983403145, 24.007631899825, 0, 46.751704225975, 0, 0, 0, 0, 0, 0) # my shift argument shift <- structure(list(`CH321-10J13_YELLOW` = -13421338), .Names = "CH321-10J13_YELLOW") # I know base position 13453507 is covered by a single range, which has weight 0 mypos <- 13453507 # which ranges overlap with mypos: which(start(ranges) <= mypos & end(ranges) >= mypos) # 38 # this range has weight 0 count[38] # 0 # compute coverage and report the coverage at mypos: coverage(ranges, weight=list(count))[[1]][mypos] #numeric-Rle of length 1 with 1 run # Lengths: 1 # Values : -5.61328761250479e-13 # if I reduce the number of ranges going into the computation I get different results coverage(ranges[20:39,], weight=list(count[20:39]))[[1]][mypos] #numeric-Rle of length 1 with 1 run # Lengths: 1 # Values : 7.105427357601e-15 coverage(ranges[35:39,], weight=list(count[35:39]))[[1]][mypos] #numeric-Rle of length 1 with 1 run # Lengths: 1 # Values : 0 # sessionInfo: R version 3.0.1 (2013-05-16) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods [8] base other attached packages: [1] IRanges_1.18.3 BiocGenerics_0.6.0 loaded via a namespace (and not attached): [1] stats4_3.0.1 tools_3.0.1 ################################################################ ################################################################ I get the same issue in another R-version: R version 2.15.1 (2012-06-22) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets [6] methods base other attached packages: [1] IRanges_1.16.6 BiocGenerics_0.4.0 [3] BiocInstaller_1.8.3 gdata_2.13.2 [5] gtools_3.0.0 loaded via a namespace (and not attached): [1] KernSmooth_2.23-10 parallel_2.15.1 stats4_2.15.1 [4] tools_2.15.1 Is this the result of precision issues? Or is it a bug? Or am I overlooking something else? Thanks for your help, Ludo
coverage • 487 views
ADD COMMENTlink modified 6.0 years ago by Charles Berry290 • written 6.0 years ago by Ludo Pagie40
Answer: coverage() in IRanges gives >0 coverage when weight == 0
0
gravatar for Charles Berry
6.0 years ago by
Charles Berry290
United States
Charles Berry290 wrote:
Ludo Pagie <ludo.pagie at="" ...=""> writes: > > Hi all, > > I don't understand what's going on here.I'm trying to compute the > coverage of a few ranges. I specify a weight for these ranges and for > some positions I get a coverage even though I know these positions are > only covered by ranges which have 0 weight. > Worse, the value I do get for these positions depends on the number of > ranges I feed into the function. Here's some code using extracts of > the data I'm working on: > See R FAQ 7.31 Why doesn't R think these numbers are equal? coverage() uses floating point arithmetic to get the value and is subject to round-off error. The amount of round-off error depends on the particulars, so you will get different values for (a+b) - a depending on what b is > (1e5+1e-10)-1e5 [1] 1.018634e-10 > (1e6+1e-10)-1e6 [1] 1.164153e-10 >
ADD COMMENTlink written 6.0 years ago by Charles Berry290
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 133 users visited in the last hour