subdivideGRanges() to a given number of ranges
1
0
Entering edit mode
d r ▴ 150
@d-r-5459
Last seen 6.8 years ago
Israel
Hello I have a GRanges object representing CpG islands, that I want to subdivide so that each island will be represented by 10 ranges of equal width. To do this, I created a vector containing the widths of the islands divided by ten: swidth<-as.integer(width(islandshg19)/10) and then called subdivideGRanges(): islandshg19<-subdivideGRanges(islandshg19, subsize=as.integer(width/10)) and got a multitude of warnings. This approach did work when I tried it on a GRanges object containing a single range, and a width vector of length 1. When I tried to extend the sample to include two objetcs, I got these warning message: 1: In if (widths[i] < 2 * subsize) { ... : the condition has length > 1 and only the first element will be used 2: In if (width < 2 * subsize) { ... : the condition has length > 1 and only the first element will be used 3: In 0:(nchunks - 1) : numerical expression has 2 elements: only the first used Which makes me think that the problem may be related to assigning the right subszie to the the rigth range. Does anyone have an idea how to do that? Thanks in advance Dolev Rahat sessionInfo(if required): R version 2.15.1 (2012-06-22) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 plyr_1.7.1 [8] XML_3.9-4.1 GenomicRanges_1.9.54 IRanges_1.15.40 BiocGenerics_0.3.1 BiocInstaller_1.4.7 loaded via a namespace (and not attached): [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 stats4_2.15.1 stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 [[alternative HTML version deleted]]
• 1.5k views
ADD COMMENT
0
Entering edit mode
@valerie-obenchain-4275
Last seen 2.9 years ago
United States
Hi Dolev, I am copying the maintainer of exomeCopy which is the package subdivideGRanges() comes from. Please provide a small reproducable example of the problem you are having (i.e., a few lines of code that someone else can run to get the same error). This makes it easier for others to help. Valerie On 09/06/12 04:39, d r wrote: > Hello > > I have a GRanges object representing CpG islands, that I want to subdivide > so that each island will be represented by 10 ranges of equal width. > > To do this, I created a vector containing the widths of the islands divided > by ten: > > swidth<-as.integer(width(islandshg19)/10) > > > > and then called subdivideGRanges(): > > > islandshg19<-subdivideGRanges(islandshg19, subsize=as.integer(width/10)) > > > and got a multitude of warnings. > > > This approach did work when I tried it on a GRanges object containing a > single range, and a width vector of length 1. When I tried to extend the > sample to include two objetcs, I got these warning message: > > 1: In if (widths[i]< 2 * subsize) { ... : > > the condition has length> 1 and only the first element will be used > > 2: In if (width< 2 * subsize) { ... : > > the condition has length> 1 and only the first element will be used > > 3: In 0:(nchunks - 1) : numerical expression has 2 elements: only the first used > > > > Which makes me think that the problem may be related to assigning the right > subszie to the the rigth range. > > Does anyone have an idea how to do that? > > Thanks in advance > > Dolev Rahat > > > > sessionInfo(if required): > > R version 2.15.1 (2012-06-22) > > Platform: x86_64-pc-mingw32/x64 (64-bit) > > > > locale: > > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 LC_MONETARY=English_United States.1252 > > [4] LC_NUMERIC=C LC_TIME=English_United > States.1252 > > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 > rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 > plyr_1.7.1 > > [8] XML_3.9-4.1 GenomicRanges_1.9.54 IRanges_1.15.40 > BiocGenerics_0.3.1 BiocInstaller_1.4.7 > > > > loaded via a namespace (and not attached): > > [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 stats4_2.15.1 > stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
Hi Dolev, The subdivideGRanges() function cannot accomplish this task. from the manual: "Takes an input GRanges object and, splits each range into multiple ranges of nearly equal width. For an input range of width w and subdividing size s, it will subdivide the range into max(1,floor(w/s)) nearly equal width ranges." So the function tries to keep the widths of the subdivided ranges around s basepairs, rather than keeping the number of subdivided ranges constant. I will include a more helpful error message if a vector is provided for the subsize argument. cheers, Mike On 09/06/12 16:02, Valerie Obenchain wrote: > Hi Dolev, > > I am copying the maintainer of exomeCopy which is the package > subdivideGRanges() comes from. Please provide a small reproducable > example of the problem you are having (i.e., a few lines of code that > someone else can run to get the same error). This makes it easier for > others to help. > > Valerie > > On 09/06/12 04:39, d r wrote: >> Hello >> >> I have a GRanges object representing CpG islands, that I want to >> subdivide >> so that each island will be represented by 10 ranges of equal width. >> >> To do this, I created a vector containing the widths of the islands >> divided >> by ten: >> >> swidth<-as.integer(width(islandshg19)/10) >> >> >> >> and then called subdivideGRanges(): >> >> >> islandshg19<-subdivideGRanges(islandshg19, subsize=as.integer(width/10)) >> >> >> and got a multitude of warnings. >> >> >> This approach did work when I tried it on a GRanges object containing a >> single range, and a width vector of length 1. When I tried to extend the >> sample to include two objetcs, I got these warning message: >> >> 1: In if (widths[i]< 2 * subsize) { ... : >> >> the condition has length> 1 and only the first element will be used >> >> 2: In if (width< 2 * subsize) { ... : >> >> the condition has length> 1 and only the first element will be used >> >> 3: In 0:(nchunks - 1) : numerical expression has 2 elements: only the >> first used >> >> >> >> Which makes me think that the problem may be related to assigning the >> right >> subszie to the the rigth range. >> >> Does anyone have an idea how to do that? >> >> Thanks in advance >> >> Dolev Rahat >> >> >> >> sessionInfo(if required): >> >> R version 2.15.1 (2012-06-22) >> >> Platform: x86_64-pc-mingw32/x64 (64-bit) >> >> >> >> locale: >> >> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >> States.1252 LC_MONETARY=English_United States.1252 >> >> [4] LC_NUMERIC=C LC_TIME=English_United >> States.1252 >> >> >> >> attached base packages: >> >> [1] stats graphics grDevices utils datasets methods base >> >> >> >> other attached packages: >> >> [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 >> rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 >> plyr_1.7.1 >> >> [8] XML_3.9-4.1 GenomicRanges_1.9.54 IRanges_1.15.40 >> BiocGenerics_0.3.1 BiocInstaller_1.4.7 >> >> >> >> loaded via a namespace (and not attached): >> >> [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 stats4_2.15.1 >> stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
Thank you Mike and Valarie, Just to be sure I understood you clearly: I understand there is no possibilty to use subdivideGRanges() to split a GRanges to a given number of elements. If I understood you correctly, it is also not possible to read a vector into the subsize argument. is taht true or is ther some possibilty to do that? thanks again Dolev On Thu, Sep 6, 2012 at 5:17 PM, Michael Love <love@molgen.mpg.de> wrote: > Hi Dolev, > > The subdivideGRanges() function cannot accomplish this task. > > from the manual: > > "Takes an input GRanges object and, splits each range into multiple ranges > of nearly equal width. For an input range of width w and subdividing size > s, it will subdivide the range into max(1,floor(w/s)) nearly equal width > ranges." > > So the function tries to keep the widths of the subdivided ranges around s > basepairs, rather than keeping the number of subdivided ranges constant. > > I will include a more helpful error message if a vector is provided for > the subsize argument. > > cheers, > > Mike > > > > On 09/06/12 16:02, Valerie Obenchain wrote: > >> Hi Dolev, >> >> I am copying the maintainer of exomeCopy which is the package >> subdivideGRanges() comes from. Please provide a small reproducable example >> of the problem you are having (i.e., a few lines of code that someone else >> can run to get the same error). This makes it easier for others to help. >> >> Valerie >> >> On 09/06/12 04:39, d r wrote: >> >>> Hello >>> >>> I have a GRanges object representing CpG islands, that I want to >>> subdivide >>> so that each island will be represented by 10 ranges of equal width. >>> >>> To do this, I created a vector containing the widths of the islands >>> divided >>> by ten: >>> >>> swidth<-as.integer(width(**islandshg19)/10) >>> >>> >>> >>> and then called subdivideGRanges(): >>> >>> >>> islandshg19<-subdivideGRanges(**islandshg19, >>> subsize=as.integer(width/10)) >>> >>> >>> and got a multitude of warnings. >>> >>> >>> This approach did work when I tried it on a GRanges object containing a >>> single range, and a width vector of length 1. When I tried to extend the >>> sample to include two objetcs, I got these warning message: >>> >>> 1: In if (widths[i]< 2 * subsize) { ... : >>> >>> the condition has length> 1 and only the first element will be used >>> >>> 2: In if (width< 2 * subsize) { ... : >>> >>> the condition has length> 1 and only the first element will be used >>> >>> 3: In 0:(nchunks - 1) : numerical expression has 2 elements: only the >>> first used >>> >>> >>> >>> Which makes me think that the problem may be related to assigning the >>> right >>> subszie to the the rigth range. >>> >>> Does anyone have an idea how to do that? >>> >>> Thanks in advance >>> >>> Dolev Rahat >>> >>> >>> >>> sessionInfo(if required): >>> >>> R version 2.15.1 (2012-06-22) >>> >>> Platform: x86_64-pc-mingw32/x64 (64-bit) >>> >>> >>> >>> locale: >>> >>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>> States.1252 LC_MONETARY=English_United States.1252 >>> >>> [4] LC_NUMERIC=C LC_TIME=English_United >>> States.1252 >>> >>> >>> >>> attached base packages: >>> >>> [1] stats graphics grDevices utils datasets methods base >>> >>> >>> >>> other attached packages: >>> >>> [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 >>> rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 >>> plyr_1.7.1 >>> >>> [8] XML_3.9-4.1 GenomicRanges_1.9.54 IRanges_1.15.40 >>> BiocGenerics_0.3.1 BiocInstaller_1.4.7 >>> >>> >>> >>> loaded via a namespace (and not attached): >>> >>> [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 stats4_2.15.1 >>> stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________**_________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.="" ethz.ch="" mailman="" listinfo="" bioconductor=""> >>> Search the archives: http://news.gmane.org/gmane.** >>> science.biology.informatics.**conductor<http: news.gmane.org="" gman="" e.science.biology.informatics.conductor=""> >>> >> >> > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
On 09/06/2012 11:33 AM, d r wrote: > Thank you Mike and Valarie, > > Just to be sure I understood you clearly: I understand there is no > possibilty to use subdivideGRanges() to split a GRanges to a given number > of elements. > If I understood you correctly, it is also not possible to read a vector > into the subsize argument. is taht true or is ther some possibilty to do > that? Not sure if this is the best way, but for a GRanges instance 'gr' dividing it in to 'n' bins, you might start by making sure there won't be any zero-width or necessarily overlapping bins if (any(width(gr) < n)) stop("all 'width(gr)' must be >= 'n'") then figure out the width, in numeric values, of each bin d <- width(gr) / n To make a vector of offsets from the start of each original bin, you want to do this in a 'vectorized' fashion, rather than a loop, so I replicated each element of d n times, calculated the cumulative sum, and then subtracted for each of the d x n elements the value of the cumulative sum at the start of each replicate, a little obscure I guess... dd <- cumsum(rep(d, each=n)) mask <- logical(n); mask[1] <- TRUE # every n'th is base for range dd <- dd - rep(dd[mask], each=n) To get the new starts, replicate the original starts and add the offsets, using round() to approximate to the nearest integer value starts <- round(rep(start(gr), each=n) + dd) the ends will be the starts of the next range, minus 1 (the 'L' says to represent 0 and 1 as integer values, rather than numeric, and avoids unnecessary coercion from integer to numeric and back) ends <- c(starts[-1] - 1L, 0L) except for each of the n'th (including last) elements, which will be the end of the original range ends[rev(mask)] <- end(gr) we can then replicate our original data and update the ranges gr <- gr[rep(seq_along(gr), each=n)] ranges(gr) <- IRanges(starts, ends) getting the result gr Here's the above as a function intoNbins <- function(gr, n = 10) { if (any(width(gr) < n)) stop("all 'width(gr)' must be >= 'n'") d <- width(gr) / n dd <- cumsum(rep(d, each=n)) mask <- logical(n); mask[1] <- TRUE dd <- dd - rep(dd[mask], each=n) starts <- round(rep(start(gr), each=n) + dd) ends <- c(starts[-1], 0) - 1L ends[rev(mask)] <- end(gr) gr <- gr[rep(seq_along(gr), each=n)] ranges(gr) <- IRanges(starts, ends) gr } and in action > gr <- GRanges("A", IRanges(c(1, 21, 21), width=c(10, 20, 30))) > table(width(intoNbins(gr, 10))) 1 2 3 10 10 10 It might make sense to split the GRanges into a GRangesList, one element for each CpG island, replacing the last line of intoNbins with split(gr, rep(seq_along(d), each=n)) Martin > > thanks again > Dolev > > On Thu, Sep 6, 2012 at 5:17 PM, Michael Love <love at="" molgen.mpg.de=""> wrote: > >> Hi Dolev, >> >> The subdivideGRanges() function cannot accomplish this task. >> >> from the manual: >> >> "Takes an input GRanges object and, splits each range into multiple ranges >> of nearly equal width. For an input range of width w and subdividing size >> s, it will subdivide the range into max(1,floor(w/s)) nearly equal width >> ranges." >> >> So the function tries to keep the widths of the subdivided ranges around s >> basepairs, rather than keeping the number of subdivided ranges constant. >> >> I will include a more helpful error message if a vector is provided for >> the subsize argument. >> >> cheers, >> >> Mike >> >> >> >> On 09/06/12 16:02, Valerie Obenchain wrote: >> >>> Hi Dolev, >>> >>> I am copying the maintainer of exomeCopy which is the package >>> subdivideGRanges() comes from. Please provide a small reproducable example >>> of the problem you are having (i.e., a few lines of code that someone else >>> can run to get the same error). This makes it easier for others to help. >>> >>> Valerie >>> >>> On 09/06/12 04:39, d r wrote: >>> >>>> Hello >>>> >>>> I have a GRanges object representing CpG islands, that I want to >>>> subdivide >>>> so that each island will be represented by 10 ranges of equal width. >>>> >>>> To do this, I created a vector containing the widths of the islands >>>> divided >>>> by ten: >>>> >>>> swidth<-as.integer(width(**islandshg19)/10) >>>> >>>> >>>> >>>> and then called subdivideGRanges(): >>>> >>>> >>>> islandshg19<-subdivideGRanges(**islandshg19, >>>> subsize=as.integer(width/10)) >>>> >>>> >>>> and got a multitude of warnings. >>>> >>>> >>>> This approach did work when I tried it on a GRanges object containing a >>>> single range, and a width vector of length 1. When I tried to extend the >>>> sample to include two objetcs, I got these warning message: >>>> >>>> 1: In if (widths[i]< 2 * subsize) { ... : >>>> >>>> the condition has length> 1 and only the first element will be used >>>> >>>> 2: In if (width< 2 * subsize) { ... : >>>> >>>> the condition has length> 1 and only the first element will be used >>>> >>>> 3: In 0:(nchunks - 1) : numerical expression has 2 elements: only the >>>> first used >>>> >>>> >>>> >>>> Which makes me think that the problem may be related to assigning the >>>> right >>>> subszie to the the rigth range. >>>> >>>> Does anyone have an idea how to do that? >>>> >>>> Thanks in advance >>>> >>>> Dolev Rahat >>>> >>>> >>>> >>>> sessionInfo(if required): >>>> >>>> R version 2.15.1 (2012-06-22) >>>> >>>> Platform: x86_64-pc-mingw32/x64 (64-bit) >>>> >>>> >>>> >>>> locale: >>>> >>>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>>> States.1252 LC_MONETARY=English_United States.1252 >>>> >>>> [4] LC_NUMERIC=C LC_TIME=English_United >>>> States.1252 >>>> >>>> >>>> >>>> attached base packages: >>>> >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> >>>> >>>> other attached packages: >>>> >>>> [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 >>>> rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 >>>> plyr_1.7.1 >>>> >>>> [8] XML_3.9-4.1 GenomicRanges_1.9.54 IRanges_1.15.40 >>>> BiocGenerics_0.3.1 BiocInstaller_1.4.7 >>>> >>>> >>>> >>>> loaded via a namespace (and not attached): >>>> >>>> [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 stats4_2.15.1 >>>> stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________**_________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat="" .ethz.ch="" mailman="" listinfo="" bioconductor=""> >>>> Search the archives: http://news.gmane.org/gmane.** >>>> science.biology.informatics.**conductor<http: news.gmane.org="" gma="" ne.science.biology.informatics.conductor=""> >>>> >>> >>> >> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
ADD REPLY
0
Entering edit mode
Thank you all. On Thu, Sep 6, 2012 at 11:33 PM, Martin Morgan <mtmorgan@fhcrc.org> wrote: > On 09/06/2012 11:33 AM, d r wrote: > >> Thank you Mike and Valarie, >> >> Just to be sure I understood you clearly: I understand there is no >> possibilty to use subdivideGRanges() to split a GRanges to a given number >> of elements. >> If I understood you correctly, it is also not possible to read a vector >> into the subsize argument. is taht true or is ther some possibilty to do >> that? >> > > > Not sure if this is the best way, but for a GRanges instance 'gr' dividing > it in to 'n' bins, you might start by making sure there won't be any > zero-width or necessarily overlapping bins > > > if (any(width(gr) < n)) > stop("all 'width(gr)' must be >= 'n'") > > then figure out the width, in numeric values, of each bin > > d <- width(gr) / n > > To make a vector of offsets from the start of each original bin, you want > to do this in a 'vectorized' fashion, rather than a loop, so I replicated > each element of d n times, calculated the cumulative sum, and then > subtracted for each of the d x n elements the value of the cumulative sum > at the start of each replicate, a little obscure I guess... > > dd <- cumsum(rep(d, each=n)) > mask <- logical(n); mask[1] <- TRUE # every n'th is base for range > dd <- dd - rep(dd[mask], each=n) > > To get the new starts, replicate the original starts and add the offsets, > using round() to approximate to the nearest integer value > > starts <- round(rep(start(gr), each=n) + dd) > > the ends will be the starts of the next range, minus 1 (the 'L' says to > represent 0 and 1 as integer values, rather than numeric, and avoids > unnecessary coercion from integer to numeric and back) > > ends <- c(starts[-1] - 1L, 0L) > > except for each of the n'th (including last) elements, which will be the > end of the original range > > ends[rev(mask)] <- end(gr) > > we can then replicate our original data and update the ranges > > gr <- gr[rep(seq_along(gr), each=n)] > ranges(gr) <- IRanges(starts, ends) > > getting the result > > gr > > Here's the above as a function > > intoNbins <- > function(gr, n = 10) > { > if (any(width(gr) < n)) > stop("all 'width(gr)' must be >= 'n'") > d <- width(gr) / n > dd <- cumsum(rep(d, each=n)) > mask <- logical(n); mask[1] <- TRUE > dd <- dd - rep(dd[mask], each=n) > > starts <- round(rep(start(gr), each=n) + dd) > ends <- c(starts[-1], 0) - 1L > ends[rev(mask)] <- end(gr) > > gr <- gr[rep(seq_along(gr), each=n)] > ranges(gr) <- IRanges(starts, ends) > gr > } > > and in action > > > gr <- GRanges("A", IRanges(c(1, 21, 21), width=c(10, 20, 30))) > > table(width(intoNbins(gr, 10))) > > 1 2 3 > 10 10 10 > > It might make sense to split the GRanges into a GRangesList, one element > for each CpG island, replacing the last line of intoNbins with > > split(gr, rep(seq_along(d), each=n)) > > Martin > > >> thanks again >> Dolev >> >> On Thu, Sep 6, 2012 at 5:17 PM, Michael Love <love@molgen.mpg.de> wrote: >> >> Hi Dolev, >>> >>> The subdivideGRanges() function cannot accomplish this task. >>> >>> from the manual: >>> >>> "Takes an input GRanges object and, splits each range into multiple >>> ranges >>> of nearly equal width. For an input range of width w and subdividing size >>> s, it will subdivide the range into max(1,floor(w/s)) nearly equal width >>> ranges." >>> >>> So the function tries to keep the widths of the subdivided ranges around >>> s >>> basepairs, rather than keeping the number of subdivided ranges constant. >>> >>> I will include a more helpful error message if a vector is provided for >>> the subsize argument. >>> >>> cheers, >>> >>> Mike >>> >>> >>> >>> On 09/06/12 16:02, Valerie Obenchain wrote: >>> >>> Hi Dolev, >>>> >>>> I am copying the maintainer of exomeCopy which is the package >>>> subdivideGRanges() comes from. Please provide a small reproducable >>>> example >>>> of the problem you are having (i.e., a few lines of code that someone >>>> else >>>> can run to get the same error). This makes it easier for others to help. >>>> >>>> Valerie >>>> >>>> On 09/06/12 04:39, d r wrote: >>>> >>>> Hello >>>>> >>>>> I have a GRanges object representing CpG islands, that I want to >>>>> subdivide >>>>> so that each island will be represented by 10 ranges of equal width. >>>>> >>>>> To do this, I created a vector containing the widths of the islands >>>>> divided >>>>> by ten: >>>>> >>>>> swidth<-as.integer(width(****islandshg19)/10) >>>>> >>>>> >>>>> >>>>> and then called subdivideGRanges(): >>>>> >>>>> >>>>> islandshg19<-subdivideGRanges(****islandshg19, >>>>> >>>>> subsize=as.integer(width/10)) >>>>> >>>>> >>>>> and got a multitude of warnings. >>>>> >>>>> >>>>> This approach did work when I tried it on a GRanges object containing a >>>>> single range, and a width vector of length 1. When I tried to extend >>>>> the >>>>> sample to include two objetcs, I got these warning message: >>>>> >>>>> 1: In if (widths[i]< 2 * subsize) { ... : >>>>> >>>>> the condition has length> 1 and only the first element will be >>>>> used >>>>> >>>>> 2: In if (width< 2 * subsize) { ... : >>>>> >>>>> the condition has length> 1 and only the first element will be >>>>> used >>>>> >>>>> 3: In 0:(nchunks - 1) : numerical expression has 2 elements: only the >>>>> first used >>>>> >>>>> >>>>> >>>>> Which makes me think that the problem may be related to assigning the >>>>> right >>>>> subszie to the the rigth range. >>>>> >>>>> Does anyone have an idea how to do that? >>>>> >>>>> Thanks in advance >>>>> >>>>> Dolev Rahat >>>>> >>>>> >>>>> >>>>> sessionInfo(if required): >>>>> >>>>> R version 2.15.1 (2012-06-22) >>>>> >>>>> Platform: x86_64-pc-mingw32/x64 (64-bit) >>>>> >>>>> >>>>> >>>>> locale: >>>>> >>>>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>>>> States.1252 LC_MONETARY=English_United States.1252 >>>>> >>>>> [4] LC_NUMERIC=C LC_TIME=English_United >>>>> States.1252 >>>>> >>>>> >>>>> >>>>> attached base packages: >>>>> >>>>> [1] stats graphics grDevices utils datasets methods base >>>>> >>>>> >>>>> >>>>> other attached packages: >>>>> >>>>> [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 >>>>> rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 >>>>> plyr_1.7.1 >>>>> >>>>> [8] XML_3.9-4.1 GenomicRanges_1.9.54 IRanges_1.15.40 >>>>> BiocGenerics_0.3.1 BiocInstaller_1.4.7 >>>>> >>>>> >>>>> >>>>> loaded via a namespace (and not attached): >>>>> >>>>> [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 stats4_2.15.1 >>>>> stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 >>>>> >>>>> [[alternative HTML version deleted]] >>>>> >>>>> ______________________________****_________________ >>>>> Bioconductor mailing list >>>>> Bioconductor@r-project.org >>>>> https://stat.ethz.ch/mailman/****listinfo/bioconductor<https: s="" tat.ethz.ch="" mailman="" **listinfo="" bioconductor=""> >>>>> <https: **="" stat.ethz.ch="" mailman="" **listinfo="" bioconductor<https:="" stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >>>>> > >>>>> Search the archives: http://news.gmane.org/gmane.** >>>>> science.biology.informatics.****conductor<http: news.gmane.**="">>>>> org/gmane.science.biology.**informatics.conductor<http: news.gm="" ane.org="" gmane.science.biology.informatics.conductor=""> >>>>> > >>>>> >>>>> >>>> >>>> >>> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> >> > > -- > Computational Biology / Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
hi Dolev, Yes this function was designed to accept a single value for 'subsize', which is the desired width of the output ranges. It is not possible to use this function to create x subdivided ranges for each range in the input. cheers Mike On 9/6/12 8:33 PM, d r wrote: > Thank you Mike and Valarie, > > Just to be sure I understood you clearly: I understand there is no > possibilty to use subdivideGRanges() to split a GRanges to a given > number of elements. > If I understood you correctly, it is also not possible to read a > vector into the subsize argument. is taht true or is ther some > possibilty to do that? > > thanks again > Dolev > > On Thu, Sep 6, 2012 at 5:17 PM, Michael Love <love@molgen.mpg.de> <mailto:love@molgen.mpg.de>> wrote: > > Hi Dolev, > > The subdivideGRanges() function cannot accomplish this task. > > from the manual: > > "Takes an input GRanges object and, splits each range into > multiple ranges of nearly equal width. For an input range of width > w and subdividing size s, it will subdivide the range into > max(1,floor(w/s)) nearly equal width ranges." > > So the function tries to keep the widths of the subdivided ranges > around s basepairs, rather than keeping the number of subdivided > ranges constant. > > I will include a more helpful error message if a vector is > provided for the subsize argument. > > cheers, > > Mike > > > > On 09/06/12 16:02, Valerie Obenchain wrote: > > Hi Dolev, > > I am copying the maintainer of exomeCopy which is the package > subdivideGRanges() comes from. Please provide a small > reproducable example of the problem you are having (i.e., a > few lines of code that someone else can run to get the same > error). This makes it easier for others to help. > > Valerie > > On 09/06/12 04:39, d r wrote: > > Hello > > I have a GRanges object representing CpG islands, that I > want to subdivide > so that each island will be represented by 10 ranges of > equal width. > > To do this, I created a vector containing the widths of > the islands divided > by ten: > > swidth<-as.integer(width(islandshg19)/10) > > > > and then called subdivideGRanges(): > > > islandshg19<-subdivideGRanges(islandshg19, > subsize=as.integer(width/10)) > > > and got a multitude of warnings. > > > This approach did work when I tried it on a GRanges object > containing a > single range, and a width vector of length 1. When I tried > to extend the > sample to include two objetcs, I got these warning message: > > 1: In if (widths[i]< 2 * subsize) { ... : > > the condition has length> 1 and only the first element > will be used > > 2: In if (width< 2 * subsize) { ... : > > the condition has length> 1 and only the first element > will be used > > 3: In 0:(nchunks - 1) : numerical expression has 2 > elements: only the first used > > > > Which makes me think that the problem may be related to > assigning the right > subszie to the the rigth range. > > Does anyone have an idea how to do that? > > Thanks in advance > > Dolev Rahat > > > > sessionInfo(if required): > > R version 2.15.1 (2012-06-22) > > Platform: x86_64-pc-mingw32/x64 (64-bit) > > > > locale: > > [1] LC_COLLATE=English_United States.1252 > LC_CTYPE=English_United > States.1252 LC_MONETARY=English_United States.1252 > > [4] LC_NUMERIC=C LC_TIME=English_United > States.1252 > > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] exomeCopy_1.2.0 Rsamtools_1.8.6 Biostrings_2.24.1 > rtracklayer_1.16.3 taRifx_1.0.4 reshape2_1.2.1 > plyr_1.7.1 > > [8] XML_3.9-4.1 GenomicRanges_1.9.54 > IRanges_1.15.40 > BiocGenerics_0.3.1 BiocInstaller_1.4.7 > > > > loaded via a namespace (and not attached): > > [1] bitops_1.0-4.1 BSgenome_1.25.7 RCurl_1.91-1.1 > stats4_2.15.1 > stringr_0.6.1 tools_2.15.1 zlibbioc_1.2.0 > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org <mailto:bioconductor@r-project.org> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 829 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6