Question: ChIPpeakAnno overlap peaks with TSS returns more than TSS overlap
4 weeks ago, 94133 (USA, Stanford) wrote:

I want the ChIP peaks that overlap gene TSSs. However, ChIPpeakAnno also returns peaks that do not overlap a TSS, so extra filtering is required. Is there a better way?

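library(ChIPpeakAnno)
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(org.Mm.eg.db)
library(dplyr)  # provides %>% and filter()

# Annotate peaks against whole-gene ranges, keeping every overlapping gene
ChIP_peaks_annoTSS <- annotatePeakInBatch(res1_ChIP,
                                            AnnotationData = genes(TxDb.Mmusculus.UCSC.mm10.knownGene),
                                            output = "overlapping",
                                            featureType = "TSS",
                                            select = "all",
                                            ignore.strand = TRUE,
                                            FeatureLocForDistance = "TSS")
# Add gene symbols for the Entrez IDs, then convert to a data frame
ChIP_peaks_annoTSS <- addGeneIDs(annotatedPeak=ChIP_peaks_annoTSS,
                                   orgAnn = "org.Mm.eg.db",
                                   feature_id_type = "entrez_id",
                                   IDs2Add = "symbol") %>% as.data.frame()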

The fromOverlappingOrNearest column says Overlapping even when insideFeature is inside or overlapEnd, which means the peak overlaps the gene but NOT its TSS.

So I then filter on the insideFeature column to keep only the TSS overlaps:

TSSpatterns = c("overlapStart","includeFeature")
ChIP_peaks_annoTSS <- filter(ChIP_peaks_annoTSS, grepl(paste(TSSpatterns, collapse="|"), insideFeature))
ChIP_peaks_annoTSS_cond <- condenseMatrixByColnames(as.matrix(as.data.frame(ChIP_peaks_annoTSS)), "peak")

Can you show me the proper way?

Thanks!!!!

written 4 weeks ago by 94133

Could you please try the following code and see if that meets your need? Thanks!

tss <- promoters(TxDb.Mmusculus.UCSC.mm10.knownGene, upstream=0, downstream=1)

ChIP_peaks_annoTSS <- annotatePeakInBatch(res1_ChIP,
                                            AnnotationData = tss,
                                            output = "overlapping",
                                            featureType = "TSS",
                                            select = "all",
                                            ignore.strand = TRUE,
                                            FeatureLocForDistance = "TSS")

Best regards,

Julie

written 4 weeks ago by Julie Zhu
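
To illustrate the reply above on toy data, here is a minimal sketch that assumes only the GenomicRanges package and made-up coordinates (the `genes` and `peaks` objects are hypothetical, not from this thread). `promoters(x, upstream = 0, downstream = 1)` collapses each feature to a strand-aware 1-bp TSS, so an overlap query keeps only the peaks that cover a TSS rather than any part of the gene body.

```r
library(GenomicRanges)

# Hypothetical genes: one on the plus strand, one on the minus strand.
genes <- GRanges("chr1",
                 IRanges(start = c(10000, 50000), end = c(20000, 60000)),
                 strand = c("+", "-"))
names(genes) <- c("geneA", "geneB")

# 1-bp TSS ranges: the gene start on "+", the gene end on "-".
tss <- promoters(genes, upstream = 0, downstream = 1)

# Hypothetical unstranded peaks.
peaks <- GRanges("chr1",
                 IRanges(start = c(9900, 15000, 49900, 59950),
                         end   = c(10100, 15200, 50100, 60050)))
names(peaks) <- c("p1", "p2", "p3", "p4")

# Overlapping whole genes keeps every peak that touches a gene body ...
in_gene <- subsetByOverlaps(peaks, genes, ignore.strand = TRUE)
# ... whereas overlapping the 1-bp TSS ranges keeps only TSS-covering peaks.
at_tss <- subsetByOverlaps(peaks, tss, ignore.strand = TRUE)
names(at_tss)
```

With real data, the same `tss` idea is what the reply passes as `AnnotationData`. Note that `promoters()` on a TxDb works per transcript, so the feature names there would presumably be transcript names rather than Entrez gene IDs.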
4 weeks ago, Ou, Jianhong (United States) wrote:

Did you try setting output = "upstream"?

written 4 weeks ago by Ou, Jianhong

No. Are you suggesting this is the best way to do this? I don't understand why one would use upstream for TSS overlap, can you explain?

Thanks!

written 4 weeks ago by 94133

This will find the peaks that overlap the TSS, because we set maxgap=-1 and FeatureLocForDistance="TSS".

However, this may not answer your biological question. Are you perhaps looking for annotation of the promoter region? If so, please try output="overlapping", FeatureLocForDistance="TSS" and bindingRegion = c(-5000, 3000). Here bindingRegion means from 5 kb upstream to 3 kb downstream of the TSS.

written 4 weeks ago by Ou, Jianhong
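
The promoter-window idea can be sketched on toy data. This is only an illustration with GenomicRanges, not ChIPpeakAnno's internal implementation: `promoters()` with `upstream = 5000, downstream = 3000` builds a strand-aware window around each TSS, which approximates what `bindingRegion = c(-5000, 3000)` describes (the exact boundary handling inside `annotatePeakInBatch` may differ). The `genes` and `peaks` objects are hypothetical.

```r
library(GenomicRanges)

# Hypothetical genes on opposite strands.
genes <- GRanges("chr1",
                 IRanges(start = c(10000, 50000), end = c(20000, 60000)),
                 strand = c("+", "-"))
names(genes) <- c("geneA", "geneB")

# Strand-aware window: 5 kb upstream to 3 kb downstream of each TSS.
prom <- promoters(genes, upstream = 5000, downstream = 3000)

# Hypothetical 200-bp peaks: p1 and p3 lie upstream of a TSS,
# p2 and p4 lie deep inside a gene body.
peaks <- GRanges("chr1",
                 IRanges(start = c(6000, 15000, 62000, 52000), width = 200))
names(peaks) <- c("p1", "p2", "p3", "p4")

# Pair each peak with every promoter window it overlaps.
hits <- findOverlaps(peaks, prom, ignore.strand = TRUE)
in_promoter <- data.frame(peak = names(peaks)[queryHits(hits)],
                          gene = names(prom)[subjectHits(hits)])
in_promoter
```

Unlike a strict 1-bp TSS overlap, this also reports peaks that sit near a TSS without covering it, which is usually what a "promoter binding" question is after.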

I tried your suggestion as follows, but got an error:

ChIP_peaks_annoTSS <- annotatePeakInBatch(res1_ChIP,
                                            AnnotationData = genes(TxDb.Mmusculus.UCSC.mm10.knownGene),
                                            output = "overlapping",
                                            featureType = "TSS",
                                            select = "all",
                                            ignore.strand = TRUE,
                                            FeatureLocForDistance = "TSS",
                                            bindingRegion = c(-2000, 2000))
ChIP_peaks_annoTSS <- addGeneIDs(annotatedPeak=ChIP_peaks_annoTSS,
                                   orgAnn = "org.Mm.eg.db",
                                   feature_id_type = "entrez_id",
                                   IDs2Add = "symbol") 
ChIP_peaks_annoTSS_cond <- condenseMatrixByColnames(as.matrix(as.data.frame(ChIP_peaks_annoTSS)), "peak")

Error in data.frame(seqnames = as.factor(seqnames(x)), start = start(x),  : 
  duplicate row.names: X12, X39, X45, X52, X67, X71, X137, X144, X179, X184, X215, X228, X232, X240, X244, X246, X255, X262, X265, X284, X287, X379, X384, X391, X393, X404, X420, X451, X533, X534, X536, X553, X556, X574, X575, X60 ... ... ... 

written 4 weeks ago by 94133

Try removing the duplicated names with unname() before converting to a data frame:

ChIP_peaks_annoTSS_cond <- condenseMatrixByColnames(as.matrix(as.data.frame(unname(ChIP_peaks_annoTSS))), "peak")

written 4 weeks ago by Ou, Jianhong
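
Why `unname()` helps, as a minimal sketch on toy data (GenomicRanges only; the `gr` object is hypothetical). With `select = "all"`, a peak that matches several features presumably appears once per match, so the result can carry repeated names, and `as.data.frame()` on a GRanges tries to use those names as row names.

```r
library(GenomicRanges)

# Hypothetical annotated peaks: peak X12 matched two features.
gr <- GRanges("chr1",
              IRanges(start = c(100, 100, 500), width = 50),
              feature = c("geneA", "geneB", "geneC"))
names(gr) <- c("X12", "X12", "X39")

# Duplicated names cannot serve as data-frame row names, so this call may
# fail with "duplicate row.names" as in the thread; tryCatch captures the
# message instead of stopping the script.
res <- tryCatch(as.data.frame(gr), error = function(e) conditionMessage(e))

# Keep the peak identity in a column, then drop the names before converting.
gr$peak <- names(gr)
df <- as.data.frame(unname(gr))
df
```

Storing the names in a `peak` column first is an optional extra, not part of the fix suggested above; it simply keeps the peak identifiers available after the row names are gone.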

That works, thanks! 

written 4 weeks ago by 94133