Holly,
I believe that the annotations you obtained from the different resources
are based on different genome versions, e.g., mm10 from Ensembl.
I am travelling today. Jianhong will be happy to help you. If you
could keep the thread on the Bioconductor list so that others can
contribute/benefit, that would be very much appreciated. Thanks!
Best regards,
Julie
On 12/11/12 2:49 PM, "Holly" <xyang2@uchicago.edu> wrote:
Julie,
One more question is about how to annotate intron peaks. I would
appreciate it if you could test the following example and help figure
out how to annotate it correctly using ChIPpeakAnno.
For example, I ran the following code with the updated
Bioconductor packages:
library(ChIPpeakAnno)
data(TSS.mouse.NCBIM37)
rd <- RangedData(IRanges(start = 37377492, end = 37378857), space = "chr18")
annotatePeakInBatch(rd, AnnotationData = TSS.mouse.NCBIM37)
Then I got the following result:
RangedData with 1 row and 9 value columns across 1 space
                        space               ranges |        peak      strand
                     <factor>            <iranges> | <character> <character>
1 ENSMUSG00000073593       18 [37377492, 37378857] |           1           -
                                feature start_position end_position
                            <character>      <numeric>    <numeric>
1 ENSMUSG00000073593 ENSMUSG00000073593       37319509     37338176
                     insideFeature distancetoFeature shortestDistance
                       <character>         <numeric>        <numeric>
1 ENSMUSG00000073593      upstream            -39316            39316
                     fromOverlappingOrNearest
                                  <character>
1 ENSMUSG00000073593             NearestStart
However, in the UCSC Genome Browser
http://genome.ucsc.edu/cgi-bin/hgTracks
(NCBI37/mm9), it is an intron region of gene Pcdha4-9.
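The numbers in the output above can be checked by hand. The following is a minimal base-R sketch, not ChIPpeakAnno code. It assumes that distancetoFeature is measured from the peak start to the TSS and signed relative to the gene's strand; that convention is inferred from the output shown here rather than taken from the package documentation, and the variable names are illustrative.

```r
# Values copied from the annotatePeakInBatch output above.
peak_start  <- 37377492
peak_end    <- 37378857
gene_start  <- 37319509   # start_position
gene_end    <- 37338176   # end_position
gene_strand <- "-"

# On the minus strand the TSS is the larger genomic coordinate.
tss <- if (gene_strand == "-") gene_end else gene_start

# Signed distance from peak start to TSS; negative means upstream of the
# TSS in the gene's own orientation (assumed convention, see lead-in).
dist_to_tss <- if (gene_strand == "-") tss - peak_start else peak_start - tss

# Does the peak overlap the annotated gene coordinates at all?
overlaps_gene <- peak_start <= gene_end && peak_end >= gene_start

dist_to_tss    # -39316
overlaps_gene  # FALSE
```

Under these assumptions the peak lies entirely outside the annotated coordinates of ENSMUSG00000073593, which is consistent with the "upstream" call. An "inside" call would need an annotation whose gene model spans this position.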
If I instead try:
library(biomaRt)
mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
# Keep the result so it can be passed on as AnnotationData
Annotation <- getAnnotation(mart, featureType = "TSS")
annotatePeakInBatch(rd, AnnotationData = Annotation)
it gives a totally different result, ENSMUSG00000051242, which is
also not what I expected.
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] org.Mm.eg.db_2.8.0 ChIPpeakAnno_2.6.0
[3] limma_3.14.3 org.Hs.eg.db_2.8.0
[5] GO.db_2.8.0 RSQLite_0.11.2
[7] DBI_0.2-5 BSgenome.Ecoli.NCBI.20080805_1.3.17
[9] BSgenome_1.26.1 Biostrings_2.26.2
[11] multtest_2.14.0 biomaRt_2.14.0
[13] VennDiagram_1.5.1 BayesPeak_1.10.0
[15] rtracklayer_1.18.1 GenomicFeatures_1.10.1
[17] AnnotationDbi_1.20.3 Biobase_2.18.0
[19] GenomicRanges_1.10.5 IRanges_1.16.4
[21] BiocGenerics_0.4.0 BiocInstaller_1.8.3
loaded via a namespace (and not attached):
[1] bitops_1.0-5 MASS_7.3-22 parallel_2.15.2 RCurl_1.95-3
[5] Rsamtools_1.10.2 splines_2.15.2 stats4_2.15.2 survival_2.37-2
[9] tools_2.15.2 XML_3.95-0.1 zlibbioc_1.4.0
Thanks again,
Holly
On 12/10/2012 01:10 PM, Zhu, Lihua (Julie) wrote:
Holly,
Thanks for the link! The BDPs in ChIPpeakAnno are defined purely
according to the coordinates of known genes.
Best regards,
Julie
On 12/10/12 1:30 PM, "Holly" <xyang2@uchicago.edu> wrote:
Julie,
A basic question to verify your definition of the bi-directional
promoters: did you define them purely according to the coordinates of
known genes, or did you also refer to experimental data, e.g. the EST
experiments described in
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1853124/ ?
I learned a lot from the discussion with you. Thanks again,
Holly
On 12/10/2012 10:46 AM, Zhu, Lihua (Julie) wrote:
Dear Holly,
I believe that you are interested in finding the peaks that reside in
bi-directional promoters. If so, you can use the following functions
in ChIPpeakAnno.
BDP = peaksNearBDP(peaks, AnnotationData=TSS, MaxDistance =5000)
c(BDP$percentPeaksWithBDP, BDP$n.peaksWithBDP, BDP$n.peaks)
all.genes = union(annotated.peaks$feature, BDP$peaksWithBDP$feature)
where annotated.peaks is generated from annotatePeakInBatch using TSS.
To learn more about peaksNearBDP, please type ?peaksNearBDP in R.
If you just want to find genes on both sides of the peaks within a
certain distance of the peaks, you can use the following command.
Annotated.peaks = annotatePeakInBatch(peaks, AnnotationData = TSS,
                                      output = "both", select = "all",
                                      maxgap = 1000000)
where maxgap can be adjusted according to your needs.
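The idea of "genes on both sides of a peak within a certain distance" can also be illustrated independently of ChIPpeakAnno. The sketch below is plain base R on a made-up TSS table. It is not the package's implementation and does not try to reproduce the semantics of output= or select=. The helper flanking_features and all gene names and coordinates are hypothetical.

```r
# Toy TSS table; names and coordinates are made up for illustration.
tss <- data.frame(
  feature = c("geneA", "geneB", "geneC", "geneD"),
  pos     = c(1000, 4000, 9000, 2500000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: nearest TSS on each side of a peak, within maxgap.
flanking_features <- function(peak_start, peak_end, tss, maxgap = 1e6) {
  # Candidates strictly left / right of the peak and no farther than maxgap.
  left  <- tss[tss$pos < peak_start & peak_start - tss$pos <= maxgap, ]
  right <- tss[tss$pos > peak_end   & tss$pos - peak_end   <= maxgap, ]
  list(
    left  = if (nrow(left)  > 0) left$feature[which.max(left$pos)]   else NA_character_,
    right = if (nrow(right) > 0) right$feature[which.min(right$pos)] else NA_character_
  )
}

res <- flanking_features(5000, 6000, tss)
res$left   # "geneB"
res$right  # "geneC"
```

In this toy example geneD is never considered because it lies more than maxgap away from the peak, which mirrors the role a distance cutoff plays when restricting candidate genes.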
Please let me know if this suits your needs. Thanks!
Best regards,
Julie
On 12/10/12 11:19 AM, "Holly" <xyang2@uchicago.edu> wrote:
Dear Lihua,
I am trying to annotate peaks not only with the gene that has the
nearest TSS but also with the gene on the other side of the peak.
Do you think I can use ChIPpeakAnno to get the genes on both sides of
a peak region? If so, what do you suggest?
Thanks a lot,
Holly