Hi,
I have downloaded a ChIP-seq .bed file from a publicly available GEO dataset (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM590111) and would like to use ChIPseeker to plot the average profile of ChIP peaks binding to the TSS regions of a specific group of genes. However, I cannot get past the annotatePeak() step: it has been running for almost 4 days (see below) and shows no sign of finishing the annotation.
What am I doing wrong?
The code I have used is this:
# Load required libraries
library(ChIPseeker)
library(org.Mm.eg.db)
require(TxDb.Mmusculus.UCSC.mm9.knownGene)

# Use readPeakFile to load the peak and store in GRanges object
sample = readPeakFile("GSM590111_E14-serum_H3K4me3-ChIP_Seq.bed.gz")

# Annotate data
txdb = TxDb.Mmusculus.UCSC.mm9.knownGene
sample_ann = annotatePeak(sample, tssRegion=c(-3000, 3000), TxDb=txdb)
These are the messages I am getting (I am running this from linux):
>> preparing features information...		 2015-07-23 19:03:40
>> identifying nearest features...		 2015-07-23 19:03:41
>> calculating distance from peak to TSS...	 2015-07-23 19:11:49
>> assigning genomic annotation...		 2015-07-23 19:11:49
As you can see, this has been running since 7pm on the 23rd of July...
Help please!
Thanks!
E.
P.S. Here is the session info:
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] TxDb.Mmusculus.UCSC.mm9.knownGene_3.1.2 GenomicFeatures_1.20.1
 [3] GenomicRanges_1.20.5                    org.Mm.eg.db_3.1.2
 [5] RSQLite_1.0.0                           DBI_0.3.1
 [7] AnnotationDbi_1.30.1                    GenomeInfoDb_1.4.1
 [9] IRanges_2.2.5                           S4Vectors_0.6.1
[11] Biobase_2.28.0                          BiocGenerics_0.14.0
[13] ChIPseeker_1.4.3
Thanks Herve. This is really remarkable. FYI, I have added you as a contributor in the author list; see https://github.com/GuangchuangYu/ChIPseeker/commit/afac661613f9a99173c296a76163980fbc1360a0
With the efficient implementation of getFirstHitIndex(), it also runs in less than 5 min on my computer.
I have committed this new implementation to both release (1.4.6) and devel (1.5.8).
Best,
Guangchuang
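The thread does not show the code of getFirstHitIndex(), so the following is only a minimal sketch of why such a change can matter. It assumes the function takes a vector of hit indices and returns the position of the first occurrence of each distinct value. The names first_hit_slow and first_hit_fast are hypothetical and are not taken from the ChIPseeker source.

```r
# Hypothetical helpers for illustration; not the ChIPseeker source code.

# Slow variant: for every distinct value, scan the whole vector again.
# The cost grows with (number of distinct values) x (vector length).
first_hit_slow <- function(x) {
    vapply(unique(x), function(v) which(x == v)[1], integer(1))
}

# Fast variant: duplicated() flags repeated values in a single pass,
# so the first occurrences are simply the non-duplicated positions.
first_hit_fast <- function(x) {
    which(!duplicated(x))
}

# Toy vector of query hit indices (several hits per peak).
query_hits <- c(1L, 1L, 2L, 3L, 3L, 3L, 5L)

# Both variants return the same positions.
identical(first_hit_slow(query_hits), first_hit_fast(query_hits))
```

On a peak file with many thousands of peaks, the per-value rescan in the slow variant is the kind of pattern that can turn minutes into days, while the single-pass variant stays roughly linear in the number of hits.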
Awesome. Thanks! H.
Thanks Herve and Guangchuang for the tests, explanations and new implementations!! You are the best! :)
How long will it take for the new updated version to become available in the "update" packages section of R?
E.
It may take 2 or 3 days.
Cool! Thanks!!
It is already available. You can use biocLite to install the latest version.
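For reference, a short sketch of the update with biocLite, assuming the R 3.2 / Bioconductor 3.1 setup from the session info above and a working network connection. The version to expect (1.4.6 or later on release) comes from the earlier post in this thread.

```r
# Load the Bioconductor installer script (the installer in use for R 3.2).
source("https://bioconductor.org/biocLite.R")

# Install or update ChIPseeker from the Bioconductor repository.
biocLite("ChIPseeker")

# Check which version is now installed; restart R first if ChIPseeker
# was already loaded in the current session.
packageVersion("ChIPseeker")
```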