I am trying to perfom Chip-Seq Analysis which is greatly described in that conversation: http://biology.stackexchange.com/questions/16701/introduction-to-chip-seq . The same as autor of mentioned question I am a student of Applied Mathematics and I am doing my best to enter bioinformatics fields.
I am basically at the moment where I would like to annotate peaks with genes' IDs. To do this I am using [ChIPpeakAnno](http://bioconductor.org/packages/release/bioc/vignettes/ChIPpeakAnno/inst/doc/ChIPpeakAnno.pdf) `R` package from [Bioconductor](http://bioconductor.org/packages/release/bioc/html/ChIPpeakAnno.html). I used below code to annotate my reads (outputs from `MACS`) with genes' ID's
> library(ChIPpeakAnno)
> # data(package = "ChIPpeakAnno")$results[,3]
> macsOutput <- toGRanges(data="example_peaks.bed",
+ format = "MACS")
> data(TSS.human.GRCh37)
> macs.anno <- annotatePeakInBatch(macsOutput,
+ AnnotationData=TSS.human.GRCh37,
+ output="overlapping", maxgap=5000L)
> library(org.Hs.eg.db)
> macs.anno <- addGeneIDs(annotatedPeak=macs.anno,
+ orgAnn="org.Hs.eg.db",
+ IDs2Add="symbol")
>
> # no annotations for some genes
> as.character(head(as.data.frame(macs.anno)$symbol))
[1] "PTCHD2" "PTCHD2" "PTGER3" NA "HFM1" NA
**but there occures that there are no genes' annotations for some peaks**. Can anyone tell me why this might happen? And how to avoid this? Does this refer to the `maxgap=5000L` parameter? When creating output from `MACS` I set a parameter for length to be 10 000.