Use ChIPpeakAnno to find two-sided nearest genes to a peak
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 3 months ago
United States
Dear Holly,

I believe that you are interested in finding the peaks that reside in bi-directional promoters. If so, you can use the following functions in ChIPpeakAnno.

    BDP = peaksNearBDP(peaks, AnnotationData=TSS, MaxDistance=5000)
    c(BDP$percentPeaksWithBDP, BDP$n.peaksWithBDP, BDP$n.peaks)
    all.genes = union(annotated.peaks$feature, BDP$peaksWithBDP$feature)

where annotated.peaks is generated from annotatePeakInBatch using TSS. To learn more about peaksNearBDP, please type ?peaksNearBDP in R.

If you just want to find genes on both sides of the peaks within a certain distance of the peaks, you can use the following command.

    annotated.peaks = annotatePeakInBatch(peaks, AnnotationData=TSS, output="both", select="all", maxgap=1000000)

where maxgap can be adjusted according to your needs.

Please let me know if this suits your needs. Thanks!

Best regards,
Julie

On 12/10/12 11:19 AM, "Holly" <xyang2 at uchicago.edu> wrote:
> Dear Lihua,
>
> I am trying to annotate peaks not only with the gene that has the nearest
> TSS but also with the gene on the other side of the peak.
> Do you think I can use ChIPpeakAnno to get the genes on both sides of a peak
> region? If so, what do you suggest?
> Thanks a lot,
>
> Holly
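To make the "genes on both sides" idea concrete, here is a minimal base R sketch on toy data. It is not how ChIPpeakAnno works internally; the helper `flanking_genes`, the TSS table, and all coordinates are hypothetical, and "left"/"right" refer to genomic coordinates on a single chromosome rather than strand-aware upstream/downstream.

```r
# Toy TSS table on one chromosome (hypothetical genes and coordinates).
tss <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  pos  = c(1000, 4500, 9000, 15000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: nearest TSS to the left and to the right of one peak,
# considering only TSSs within max_dist of the peak edges.
flanking_genes <- function(peak_start, peak_end, tss, max_dist = 1e6) {
  left  <- tss[tss$pos < peak_start & (peak_start - tss$pos) <= max_dist, ]
  right <- tss[tss$pos > peak_end   & (tss$pos - peak_end)   <= max_dist, ]
  c(
    left  = if (nrow(left)  > 0) left$gene[which.max(left$pos)]   else NA_character_,
    right = if (nrow(right) > 0) right$gene[which.min(right$pos)] else NA_character_
  )
}

res <- flanking_genes(5000, 6000, tss)
print(res)  # left is "geneB", right is "geneC"
```

The `max_dist` argument plays a role loosely analogous to `maxgap` above: shrinking it drops flanking genes that are too far from the peak.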
xyang2 ▴ 120
@xyang2-4387
Last seen 11 months ago
Julie,

You are very helpful. Thanks! I will forward this to people who have similar questions.

Holly
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 3 months ago
United States
Holly,

Thanks for the link! The BDPs in ChIPpeakAnno are defined purely according to the coordinates of known genes.

Best regards,
Julie

On 12/10/12 1:30 PM, "Holly" <xyang2 at uchicago.edu> wrote:
> Julie,
>
> A basic question to verify your definition of the bi-directional promoters:
> did you define them purely according to the coordinates of known genes, or
> did you also refer to experimental data, e.g. the EST experiments in
> http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1853124/ ?
>
> I learned a lot from the discussion with you. Thanks again,
>
> Holly
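As an illustration of what a purely coordinate-based definition can look like, the base R sketch below flags adjacent genes that are transcribed in opposite directions from closely spaced TSSs (a minus-strand gene immediately followed by a plus-strand gene). This is an assumed, simplified definition for illustration only, not the implementation used by peaksNearBDP; the function `find_bdp`, the `max_gap` threshold, and the gene table are all hypothetical.

```r
# Toy gene table on one chromosome (hypothetical genes and coordinates).
genes <- data.frame(
  gene   = c("g1", "g2", "g3", "g4"),
  tss    = c(10000, 10600, 50000, 80000),
  strand = c("-", "+", "+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find adjacent head-to-head gene pairs whose TSSs
# lie within max_gap bases of each other.
find_bdp <- function(genes, max_gap = 1000) {
  genes <- genes[order(genes$tss), ]
  n <- nrow(genes)
  if (n < 2) {
    return(data.frame(minus_gene = character(0), plus_gene = character(0),
                      gap = numeric(0), stringsAsFactors = FALSE))
  }
  i    <- seq_len(n - 1)
  gap  <- genes$tss[i + 1] - genes$tss[i]
  keep <- genes$strand[i] == "-" & genes$strand[i + 1] == "+" & gap <= max_gap
  data.frame(
    minus_gene = genes$gene[i][keep],
    plus_gene  = genes$gene[i + 1][keep],
    gap        = gap[keep],
    stringsAsFactors = FALSE
  )
}

bdp <- find_bdp(genes)
print(bdp)  # one pair: g1 (minus strand) and g2 (plus strand), 600 bases apart
```

Note that g3 and g4 are also on opposite strands, but they are transcribed toward each other (convergent) and are far apart, so they are not reported.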
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 3 months ago
United States
Holly,

I believe that the annotations you obtained from the different resources are different versions, e.g., mm10 from Ensembl. I am travelling today. Jianhong will be happy to help you. If you could keep the thread on the Bioconductor list for others to contribute/benefit, that would be very much appreciated. Thanks!

Best regards,
Julie

On 12/11/12 2:49 PM, "Holly" <xyang2@uchicago.edu> wrote:

Julie,

One more question is about how to annotate intron peaks. I would appreciate it if you could test the following example and help me figure out how to annotate it correctly using ChIPpeakAnno.

For example, I ran the following code with the updated Bioconductor packages:

    library(ChIPpeakAnno)
    data(TSS.mouse.NCBIM37)
    rd <- RangedData(IRanges(start = 37377492, end = 37378857), space = "chr18")
    annotatePeakInBatch(rd, AnnotationData = TSS.mouse.NCBIM37)

Then I got the following result:

    RangedData with 1 row and 9 value columns across 1 space
                            space               ranges |        peak      strand
                         <factor>            <IRanges> | <character> <character>
    1 ENSMUSG00000073593       18 [37377492, 37378857] |           1           -
                                    feature start_position end_position
                                <character>      <numeric>    <numeric>
    1 ENSMUSG00000073593 ENSMUSG00000073593       37319509     37338176
                         insideFeature distancetoFeature shortestDistance
                           <character>         <numeric>        <numeric>
    1 ENSMUSG00000073593      upstream            -39316            39316
                         fromOverlappingOrNearest
                                      <character>
    1 ENSMUSG00000073593             NearestStart

However, in the UCSC Genome Browser, http://genome.ucsc.edu/cgi-bin/hgTracks (NCBI37/mm9), this is an intron region of the gene Pcdha4-9.

If I instead try:

    library(biomaRt)
    mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
    # the result must be assigned so that it can be passed on below
    Annotation <- getAnnotation(mart, featureType = "TSS")
    annotatePeakInBatch(rd, AnnotationData = Annotation)

it gives a totally different result, ENSMUSG00000051242, which is also not what I expected.

sessionInfo()

    R version 2.15.2 (2012-10-26)
    Platform: x86_64-pc-linux-gnu (64-bit)

    attached base packages:
    [1] grid      stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] org.Mm.eg.db_2.8.0                  ChIPpeakAnno_2.6.0
     [3] limma_3.14.3                        org.Hs.eg.db_2.8.0
     [5] GO.db_2.8.0                         RSQLite_0.11.2
     [7] DBI_0.2-5                           BSgenome.Ecoli.NCBI.20080805_1.3.17
     [9] BSgenome_1.26.1                     Biostrings_2.26.2
    [11] multtest_2.14.0                     biomaRt_2.14.0
    [13] VennDiagram_1.5.1                   BayesPeak_1.10.0
    [15] rtracklayer_1.18.1                  GenomicFeatures_1.10.1
    [17] AnnotationDbi_1.20.3                Biobase_2.18.0
    [19] GenomicRanges_1.10.5                IRanges_1.16.4
    [21] BiocGenerics_0.4.0                  BiocInstaller_1.8.3

    loaded via a namespace (and not attached):
    [1] bitops_1.0-5     MASS_7.3-22      parallel_2.15.2  RCurl_1.95-3
    [5] Rsamtools_1.10.2 splines_2.15.2   stats4_2.15.2    survival_2.37-2
    [9] tools_2.15.2     XML_3.95-0.1     zlibbioc_1.4.0

Thanks again,
Holly
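The reported distancetoFeature of -39316 can be reproduced by hand from the coordinates in the output above, which helps when checking whether an "upstream" call is plausible. The base R sketch below assumes the distance is measured from the peak start to the TSS, with the TSS taken as start_position for plus-strand genes and end_position for minus-strand genes, and with negative values meaning upstream; the helper name `signed_distance_to_tss` is hypothetical.

```r
# Hypothetical helper: signed distance from the peak start to the TSS.
# Negative values mean the peak start lies upstream of the TSS.
signed_distance_to_tss <- function(peak_start, gene_start, gene_end, strand) {
  if (strand == "+") {
    peak_start - gene_start   # TSS is the gene start on the plus strand
  } else {
    gene_end - peak_start     # TSS is the gene end on the minus strand
  }
}

# Coordinates taken from the annotatePeakInBatch output in this thread.
d <- signed_distance_to_tss(
  peak_start = 37377492,
  gene_start = 37319509,
  gene_end   = 37338176,
  strand     = "-"
)
print(d)  # -39316
```

Because the gene is on the minus strand, a peak at a higher coordinate than the gene's end_position lies upstream of its TSS. Whether that gene model matches the one displayed in a genome browser depends on both sources using the same genome build and annotation release.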