ChIPpeakAnno problem w getAllPeakSequence()
Anna ▴ 50
@anna-5560
Last seen 7.0 years ago
Sweden
Dear list,

I've just started analyzing a ChIP-seq data set. I generated a peak list using "standard" utilities (bowtie, MACS) and loaded it into R in the ChIPpeakAnno package. I managed to annotate the peaks but when I tried to retrieve the peak sequences using the getAllPeakSequence() function I ran into a problem:

peaksequences<-getAllPeakSequence(mergedpeakannotations, upstream=100, downstream=100, genome=mart, AnnotationData=getAnnotation(mart, featureType="TSS"))
Error in getBM(c(seqType, type), filters = c(type, "upstream_flank"), :
  Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank NOT FOUND

The error message sometimes states downstream_flank instead of upstream. At one point I got it to run on a small subset of the data (first 10 rows) but later the same command failed with this same error message.

Some info about the objects:

mergedpeakannotations: RangedData object, created using
mergedpeakannotations<- annotatePeakInBatch(mergedpeak.rd, AnnotationData=getAnnotation(mart, featureType="TSS"))

mart=useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")

Anybody got a clue about why this fails? All help would be much appreciated!

Best regards!
Anna

sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Swedish_Sweden.1252  LC_CTYPE=Swedish_Sweden.1252
[3] LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
[5] LC_TIME=Swedish_Sweden.1252

attached base packages:
[1] grid stats graphics grDevices utils datasets methods base

other attached packages:
 [1] ChIPpeakAnno_2.6.0   limma_3.14.1
 [3] org.Hs.eg.db_2.8.0   GO.db_2.8.0
 [5] RSQLite_0.11.2       DBI_0.2-5
 [7] AnnotationDbi_1.20.1 BSgenome.Ecoli.NCBI.20080805_1.3.17
 [9] BSgenome_1.26.1      GenomicRanges_1.10.2
[11] Biostrings_2.26.2    IRanges_1.16.2
[13] multtest_2.14.0      Biobase_2.18.0
[15] biomaRt_2.14.0       BiocGenerics_0.4.0
[17] VennDiagram_1.5.1

loaded via a namespace (and not attached):
[1] MASS_7.3-22 parallel_2.15.1 RCurl_1.95-1.1 splines_2.15.1 stats4_2.15.1
[6] survival_2.36-14 tools_2.15.1 XML_3.95-0.1
GO annotate ChIPpeakAnno • 1.4k views
Ou, Jianhong ★ 1.3k
@ou-jianhong-4539
Last seen 1 day ago
United States
Hi Anna,

To get the peak sequences you can try BSgenome + Biostrings first. See ?getSeq, which extracts sequences from a locally installed genome and is faster than querying the remote mart.

I cannot reproduce your error. Could you please share a subset of your data with me for testing?

Yours sincerely,

Jianhong Ou
jianhong.ou at umassmed.edu

On Oct 17, 2012, at 9:14 AM, Anna Ehrlund wrote:

> Error in getBM(c(seqType, type), filters = c(type, "upstream_flank"), :
> Query ERROR: caught BioMart::Exception::Usage: Filter upstream_flank NOT FOUND
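The getSeq() route mentioned above can be sketched roughly as follows. This is only an illustration, assuming the BSgenome.Hsapiens.UCSC.hg19 package is installed and that the peaks use UCSC-style chromosome names. The coordinates and the names peaks, flanked and peak_seqs are made up for the example and do not come from the original data.

```r
library(GenomicRanges)
library(BSgenome.Hsapiens.UCSC.hg19)

# Hypothetical peaks; real coordinates would come from the MACS output.
# Sequence names must follow the UCSC convention ("chr1", not "1").
peaks <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(1000001, 2000001), width = 200)
)

# Extend each 200 bp peak by 100 bp on both sides, keeping the centre fixed.
flanked <- resize(peaks, width = width(peaks) + 200, fix = "center")

# Read the sequences from the locally installed genome; no network needed.
peak_seqs <- getSeq(Hsapiens, flanked)
names(peak_seqs) <- paste0("peak", seq_along(flanked))

peak_seqs  # a DNAStringSet with one sequence per peak
```

Because everything is read from the local genome package, this does not depend on the BioMart server being available.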
Hi Anna,

A better way to resolve your problem is to pass a BSgenome object as the genome argument of getAllPeakSequence(), instead of the mart. Try:

library(BSgenome.Hsapiens.UCSC.hg19)
peaksequences <- getAllPeakSequence(mergedpeakannotations, upstream=100, downstream=100, genome=Hsapiens)

Yours sincerely,

Jianhong Ou
jianhong.ou at umassmed.edu
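For readers who want to try the BSgenome-based call on a small example, here is a minimal sketch. It assumes a recent ChIPpeakAnno release that accepts GRanges input (the thread itself used a RangedData object), and the two peaks are hypothetical stand-ins for the annotated peak list. Writing the result with write2FASTA() is an optional extra step, not something the original post asked for.

```r
library(ChIPpeakAnno)
library(BSgenome.Hsapiens.UCSC.hg19)

# Hypothetical peaks standing in for the annotated peak list.
peaks <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(1000001, 2000001), width = 200,
                     names = c("peak1", "peak2"))
)

# Same arguments as in the reply, applied to the toy peaks.
peaksequences <- getAllPeakSequence(peaks, upstream = 100, downstream = 100,
                                    genome = Hsapiens)

# Optionally write the retrieved sequences to a FASTA file.
write2FASTA(peaksequences, file = "peaksequences.fa")
```

The file name peaksequences.fa is arbitrary; any writable path would do.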
Hi Jianhong,

Thanks so much for your help. Your suggestion to use the BSgenome object instead of the mart worked perfectly. Problem solved for me.

The BioMart-dependent code still fails on the same data set. I'm not sure why; perhaps a glitch in my BioMart connection or something similar (no other network problems here though). I'll send you a subset of my data (that fails) off-list if you want to try to reproduce the problem.

Again, thanks for your quick help!

Anna

________________________________________
From: Ou, Jianhong [Jianhong.Ou at umassmed.edu]
Sent: 17 October 2012 18:22
To: Anna Ehrlund
Cc: bioconductor at r-project.org
Subject: Re: [BioC] ChIPpeakAnno problem w getAllPeakSequence()
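Since the error names a missing filter, one way to tell a server-side problem from a data problem is to ask the mart which filters it exposes at that moment. A rough sketch, assuming the Ensembl BioMart server is reachable from your machine; needed, available and has_filter are illustrative names only.

```r
library(biomaRt)

# Connect to the same mart used in the original post.
mart <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

# listFilters() returns a data.frame with one row per filter the server offers.
available <- listFilters(mart)$name

# The two filters named in the error messages.
needed <- c("upstream_flank", "downstream_flank")

# TRUE/FALSE per filter: FALSE means the server did not advertise that filter
# at the time of the query.
has_filter <- setNames(needed %in% available, needed)
has_filter
```

If either entry comes back FALSE, the failure is more likely on the server side than in the peak list, which would be consistent with the same command working at one moment and failing at another.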