Question: How to do functional enrichment analysis for only promoter regions using ChIPseeker
6 months ago
shikhtechai0 wrote:

I am trying to use ChIPseeker to annotate H3K4me3 peaks and run functional enrichment analysis on them, but I am a bit confused by the seq2gene function. How can I select only those genes that have an H3K4me3 peak within +/- 2000bp of the TSS? In the tutorial, the arguments include "tssRegion = c(-1000, 1000), flankDistance = 3000". How should I configure these to get only the genes that have H3K4me3 peaks within TSS +/- 2000bp?

Thank you in advance for your help!

modified 5 months ago by Guangchuang Yu

To clarify a bit: I want to do functional pathway analysis for my H3K4me3 peaks, using only the peaks that lie within +/-2000bp of a gene's TSS. How can I select only those genes? I have followed your nicely compiled ChIPseeker tutorial page, but, if I am not mistaken, it runs the pathway analysis on all the genes, no matter how far the peaks are from the TSS. I want to do the pathway analysis with only those genes that have a peak within +/-2000bp of their TSS.

Answer: How to do functional enrichment analysis for only promoter regions using ChIPsee
5 months ago
China/Guangzhou/Southern Medical University
Guangchuang Yu1.1k wrote:
> f = getSampleFiles()[[4]]
> x = annotatePeak(f, tssRegion=c(-2000, 2000))
>> loading peak file...                      2018-09-12 11:19:29 AM
>> preparing features information...         2018-09-12 11:19:29 AM
>> identifying nearest features...           2018-09-12 11:19:29 AM
>> calculating distance from peak to TSS...  2018-09-12 11:19:29 AM
>> assigning genomic annotation...           2018-09-12 11:19:29 AM
>> assigning chromosome lengths              2018-09-12 11:19:47 AM
>> done...                                   2018-09-12 11:19:47 AM
Warning message:
In loadTxDb(TxDb) :
  >> TxDb is not specified, use 'TxDb.Hsapiens.UCSC.hg19.knownGene' by default...
> y = as.data.frame(x)
> head(y, 2)
  seqnames   start     end width strand          V4     V5        annotation
1     chr1  815093  817883  2791      * MACS_peak_1 295.76 Distal Intergenic
2     chr1 1243288 1244338  1051      * MACS_peak_2  63.19  Promoter (<=1kb)
  geneChr geneStart geneEnd geneLength geneStrand geneId transcriptId
1       1    803451  812182       8732          2 284593   uc001abt.4
2       1   1243994 1247057       3064          1 126789   uc001aed.3
  distanceToTSS
1         -2911
2             0
> y$geneId[grep("Promoter", y$annotation)] -> genes
> head(genes)
[1] "126789"    "440556"    "49856"     "100133612" "390992"    "8672"

Here is an example, and you can use the output genes to perform enrichment analysis using the clusterProfiler package.
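As a rough sketch of that last step, the promoter gene IDs could be passed to clusterProfiler along these lines. This is only an illustration under some assumptions that are not in the answer above: the data are human (hence org.Hs.eg.db), the TxDb is passed explicitly instead of relying on the default, the ontology is "BP", and the cutoffs are simply the usual defaults. The unique() call is also an addition, since several peaks can map to the same gene.

```r
library(ChIPseeker)
library(clusterProfiler)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)  # assumption: hg19 human peaks
library(org.Hs.eg.db)                       # assumption: human gene annotation

# Annotate peaks, defining the promoter as TSS +/- 2000bp
f <- getSampleFiles()[[4]]
x <- annotatePeak(f, tssRegion = c(-2000, 2000),
                  TxDb = TxDb.Hsapiens.UCSC.hg19.knownGene)
y <- as.data.frame(x)

# Keep only genes whose peak was annotated as "Promoter ..."
# (geneId holds Entrez IDs for the UCSC knownGene TxDb)
genes <- unique(y$geneId[grepl("Promoter", y$annotation)])

# GO over-representation test on the promoter-bound genes
ego <- enrichGO(gene          = genes,
                OrgDb         = org.Hs.eg.db,
                keyType       = "ENTREZID",
                ont           = "BP",       # assumption: biological process
                pAdjustMethod = "BH",
                pvalueCutoff  = 0.05,
                readable      = TRUE)       # map Entrez IDs to gene symbols
head(as.data.frame(ego))
```

For a different organism or genome build, the TxDb and OrgDb packages would need to be swapped for the matching ones.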