Question: IRanges problem: findOverlaps
7.4 years ago by
Nicolas DESCOSTES wrote:
Dear members,

I have a set of ChIP-seq peaks that I want to overlap with a bunch of annotations downloaded from UCSC.

When I compute the overlap between my peaks and the annotation list, I find 5635 positive matches. When I add type = "start", no results are returned. However, when I visualize the 5635 intervals, they overlap many start sites of my gene annotations.

I tried updating IRanges but I still get no results.

Any idea?

Thanks.
Answer: IRanges problem: findOverlaps
7.4 years ago by
Jonathan Cairns wrote:
Hi,

From ?findOverlaps:

"If 'type' is 'start' or 'end', the intervals are required to have matching starts or ends, respectively."

This doesn't sound like what you're after: with type = "start", a hit is reported only when the start of a peak matches the start of whatever annotation you're using, not when a peak merely covers an annotation's start site.

Have you looked at the package ChIPpeakAnno? It is designed for annotating ChIP-seq peaks and might be useful.

Jonathan
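To make the difference concrete, here is a minimal sketch on plain IRanges objects. The coordinates are made up for illustration (they are not the poster's data): both peaks cover a gene start, but neither begins exactly at one.

```r
library(IRanges)

# Hypothetical toy coordinates: each peak covers a gene start,
# but no peak starts at exactly the same position as a gene.
peaks <- IRanges(start = c(90, 480), end = c(150, 560))
genes <- IRanges(start = c(100, 500), end = c(900, 800))

# Default type = "any": sharing at least one position is enough.
any_hits <- findOverlaps(peaks, genes)

# type = "start": the start coordinates themselves must match,
# so peaks that only cover a gene start are not reported.
start_hits <- findOverlaps(peaks, genes, type = "start")

length(any_hits)    # number of (peak, gene) pairs that share a position
length(start_hits)  # number of pairs whose starts coincide
```

With these toy ranges the default call reports hits while type = "start" reports none, which is the same pattern as in the question.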
How about tss <- resize(genes, 1, fix = "start") followed by findOverlaps(peaks, tss)? Note that resize() returns a new object, so its result has to be assigned and then used as the subject of the overlap.

However, +1 for Julie Zhu's ChIPpeakAnno package since, as was just pointed out, it is meant for such things!
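The resize() suggestion can be sketched as follows, assuming the peaks and genes are held as GRanges objects. The coordinates and the name tss are hypothetical and chosen only for illustration. On GRanges, resize() with fix = "start" is strand-aware, so for a gene on the minus strand the retained position is its highest coordinate.

```r
library(GenomicRanges)

# Hypothetical gene annotations, one per strand.
genes <- GRanges(
  seqnames = c("chr1", "chr1"),
  ranges   = IRanges(start = c(100, 2000), end = c(900, 3000)),
  strand   = c("+", "-")
)

# Hypothetical unstranded peaks.
peaks <- GRanges(
  seqnames = c("chr1", "chr1"),
  ranges   = IRanges(start = c(80, 2950), end = c(120, 3050))
)

# Shrink each gene to the single base at its 5' end.
# The result must be assigned; resize() does not modify 'genes'.
tss <- resize(genes, width = 1, fix = "start")

# Peaks that cover a transcription start site.
hits <- findOverlaps(peaks, tss)

start(tss)
hits
```

Using countOverlaps(peaks, tss) > 0 instead would give a logical vector marking which peaks cover at least one start site.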
Reply written 7.4 years ago by Tim Triche
Thanks all for your replies. I just tried ChIPpeakAnno and it works well.

Cheers.
Reply written 7.4 years ago by Nicolas DESCOSTES
Answer: IRanges problem: findOverlaps
7.4 years ago by
European Union
alessandro brozzi wrote:
Hi Nicolas,

I don't know IRanges very well, but for finding overlapping intervals the package "intervals"

http://cran.fhcrc.org/web/packages/intervals/index.html

is very straightforward.

Here is an example:

    library(intervals)

    # One interval per row: start, end
    peaks = matrix(
      c(
         2,  8,
         8,  9,
         6,  9,
        11, 12,
         3,  3
      ),
      ncol = 2, byrow = TRUE
    )

    peaks
    #      [,1] [,2]
    # [1,]    2    8
    # [2,]    8    9
    # [3,]    6    9
    # [4,]   11   12
    # [5,]    3    3

    track = matrix(
      c(
        2,  8,
        3,  4,
        5, 10
      ),
      ncol = 2, byrow = TRUE
    )

    track
    #      [,1] [,2]
    # [1,]    2    8
    # [2,]    3    4
    # [3,]    5   10

    interval_overlap(Intervals(peaks), Intervals(track))
    # [[1]]
    # [1] 1 2 3
    #
    # [[2]]
    # [1] 1 3
    #
    # [[3]]
    # [1] 1 3
    #
    # [[4]]
    # integer(0)
    #
    # [[5]]
    # [1] 1 2

The result is a list: for each peak you get the indexes of the overlapping rows of the track matrix. By setting some options you can fine-tune your search.

HTH,
Alex
Hi Alex,

I am not sure interval_overlap() is a better option than findOverlaps().

Is interval_overlap() aware that intervals may lie on different chromosomes?

In other words, every peak has three coordinates: chromosome, start and end. The findOverlaps() function is aware of that.

Thank you,

Ivan

Ivan Gregoretti, PhD
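The chromosome point can be illustrated with a small sketch, assuming the data are stored as GRanges objects (plain IRanges carry no chromosome information). All coordinates are made up; the two peaks share start and end but sit on different chromosomes.

```r
library(GenomicRanges)

# Hypothetical peaks with identical coordinates on two chromosomes.
peaks <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(100, 100), end = c(200, 200))
)

# A single hypothetical gene on chr1.
genes <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = 150, end = 400)
)

# Comparing bare coordinates ignores the chromosome entirely.
coord_hits <- findOverlaps(ranges(peaks), ranges(genes))

# Comparing GRanges objects only matches ranges on the same sequence.
chrom_hits <- findOverlaps(peaks, genes)

length(coord_hits)
length(chrom_hits)
```

The coordinate-only comparison pairs the chr2 peak with the chr1 gene, whereas the GRanges comparison does not.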
Reply written 7.4 years ago by Ivan Gregoretti
Hi all,

I'm not sure about intervals, but genomeIntervals is chromosome-aware (http://bioconductor.org/packages/release/bioc/html/genomeIntervals.html). The syntax is as described by Alex, since genomeIntervals extends intervals. Whether you use genomeIntervals or IRanges is really just a matter of taste.

Cheers,

Nico

Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Reply written 7.4 years ago by delhomme@embl.de
Answer: IRanges problem: findOverlaps
7.4 years ago by
Julien Gagneur wrote:
Indeed, Nico.

interval_overlap() can moreover be strand-specific (on Genome_intervals_stranded objects). It also deals with so-called "inter-base" positions, i.e. positions between two nucleotides, which can represent, for example, insertion points or restriction enzyme cutting sites. One can thus ask whether cutting sites occur within a set of exons without having to write extra code to deal with cutting sites right at exon boundaries.

Best,

Julien

http://www.gagneur.genzentrum.lmu.de/
Just for the record, findOverlaps on GenomicRanges objects is strand-specific.

In theory, insertion points could be represented internally as ranges where end = start - 1. We could then have a higher-level class that makes that more user-friendly. I haven't had a use case yet, though.

Michael
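A short sketch of the strand behaviour on GRanges objects, with hypothetical coordinates: a peak on the plus strand is compared against two genes that occupy the same positions on opposite strands. The ignore.strand argument turns the strand check off.

```r
library(GenomicRanges)

# Hypothetical stranded peak.
peaks <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = 100, end = 200),
  strand   = "+"
)

# Two hypothetical genes at the same coordinates, one per strand.
genes <- GRanges(
  seqnames = c("chr1", "chr1"),
  ranges   = IRanges(start = c(150, 150), end = c(400, 400)),
  strand   = c("+", "-")
)

# Default: ranges on opposite strands are not considered overlapping.
stranded_hits <- findOverlaps(peaks, genes)

# ignore.strand = TRUE compares positions only.
unstranded_hits <- findOverlaps(peaks, genes, ignore.strand = TRUE)

subjectHits(stranded_hits)
subjectHits(unstranded_hits)
```

Ranges with strand "*" match either strand, which is why unstranded ChIP-seq peaks usually overlap annotations on both strands without any extra argument.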
Reply written 7.4 years ago by Michael Lawrence