Dear RIPSeeker people,
I tried to get RIPSeeker working by running it on the example data shipped with the package, but I got the error below, which seems to come from biomaRt.
I checked the host "dec2011.archive.ensembl.org", and it was not down when I ran the code.
Here is the code that I used:
extdata.dir <- system.file("extdata", package="RIPSeeker")
bamFiles <- list.files(extdata.dir, "\\.bam$", recursive=TRUE, full.names=TRUE)
bamFiles <- grep("PRC2", bamFiles, value=TRUE)
# specify control name
cNAME <- "SRR039214"
# output file directory
outDir <- file.path(getwd(), "RIPSeeker_vigenette_example_PRC2")
# Parameters setting
binSize <- NULL # set to NULL to automatically determine bin size
minBinSize <- 10000 # min bin size in automatic bin size selection
maxBinSize <- 10100
multicore <- TRUE
strandType <- "-"
biomart <- "ENSEMBL_MART_ENSEMBL"
biomaRt_dataset <- "mmusculus_gene_ensembl"
host <- "dec2011.archive.ensembl.org"
# host = "www.ensembl.org"
goAnno <- "org.Mm.eg.db"
seekOut.PRC2 <- ripSeek(bamPath = bamFiles, cNAME = cNAME,
                        reverseComplement = TRUE, genomeBuild = "mm9",
                        strandType = strandType,
                        uniqueHit = TRUE, assignMultihits = TRUE,
                        rerunWithDisambiguatedMultihits = TRUE,
                        binSize = binSize, minBinSize = minBinSize,
                        maxBinSize = maxBinSize,
                        biomart = biomart, host = host,
                        biomaRt_dataset = biomaRt_dataset, goAnno = goAnno,
                        multicore = multicore, outDir = outDir)
........ # console output is truncated here; the annotation step and the error follow.
*IV. Annotate RIP regions via online ensembl database (mmusculus_gene_ensembl):
Entity 'nbsp' not defined
attributes construct error
Couldn't find end of Start Tag img line 22
Entity 'hellip' not defined
Entity 'hellip' not defined
Entity 'nbsp' not defined
Entity 'raquo' not defined
attributes construct error
Couldn't find end of Start Tag img line 40
Entity 'hellip' not defined
Entity 'hellip' not defined
Entity 'hellip' not defined
Entity 'hellip' not defined
Entity 'hellip' not defined
Opening and ending tag mismatch: img line 64 and li
Opening and ending tag mismatch: li line 64 and ul
Opening and ending tag mismatch: ul line 63 and div
Entity 'copy' not defined
attributes construct error
Couldn't find end of Start Tag img line 225
attributes construct error
Couldn't find end of Start Tag img line 228
attributes construct error
Couldn't find end of Start Tag img line 231
attributes construct error
Couldn't find end of Start Tag img line 260
Opening and ending tag mismatch: div line 18 and body
Opening and ending tag mismatch: body line 17 and html
Premature end of data in tag html line 2
Error: 1: Entity 'nbsp' not defined
2: attributes construct error
3: Couldn't find end of Start Tag img line 22
4: Entity 'hellip' not defined
5: Entity 'hellip' not defined
6: Entity 'nbsp' not defined
7: Entity 'raquo' not defined
8: attributes construct error
9: Couldn't find end of Start Tag img line 40
10: Entity 'hellip' not defined
11: Entity 'hellip' not defined
12: Entity 'hellip' not defined
13: Entity 'hellip' not defined
14: Entity 'hellip' not defined
15: Opening and ending tag mismatch: img line 64 and li
16: Opening and ending tag mismatch: li line 64 and ul
17: Opening and ending tag mismatch: ul line 63 and div
18: Entity 'copy' not defined
19: attributes construct error
20: Couldn't find end of Start Tag img line 225
21: attributes construct error
22: Couldn't find end of Start Tag img line 228
23: attributes construct error
24: Couldn't find end of Start Tag img line 231
25: attributes construct error
26: Couldn't find end of Start Tag img line 260
27: Opening and e
Hi Yue,
Thanks for your reply! It totally makes sense. For my real data I used Ensembl build GRCh37, so I don't have this issue in my real experiment.
However, I have a couple of questions about RIPSeeker. First, how does the all-in-one function ripSeek differentiate biological replicates from technical replicates? As far as I can see, we only give it the paths to the BAM files, and the function combines the BAM files if they are replicates, but I don't know how it knows whether they are biological or technical replicates. Or does it treat them the same?
Second, I got an error when I ran ripSeek. The code I used and the session info are below. I searched a bit and found a thread that might be related to my error: "got an error with wavClusteR package; NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append)....."
Third, why does it show "do not have any alignment" for all chromosomes?
Thanks a lot!
Best,
Xiaofei
> binSize <- NULL # automatically determine bin size
> minBinSize <- 10000
> maxBinSize <- 10100
> multicore <- TRUE
> strandType <- "*"
> biomart <- "ensembl"
> biomaRt_dataset <- "hsapiens_gene_ensembl" # human dataset id name
> goAnno <- "org.Hs.eg.db" # GO annotation database
>
> seekOut.HuR <- ripSeek(bamPath = bamFiles, cNAME = ctrl,
+                        reverseComplement = FALSE, genomeBuild = "hg19",
+                        strandType = strandType,
+                        uniqueHit = TRUE, assignMultihits = TRUE,
+                        rerunWithDisambiguatedMultihits = TRUE,
+                        binSize = binSize, minBinSize = minBinSize,
+                        maxBinSize = maxBinSize,
+                        biomart = biomart, goAnno = goAnno,
+                        biomaRt_dataset = biomaRt_dataset,
+                        multicore = multicore, outDir = outDir)
*I. Collect alignment files
RIP alignment files:
/Volumes/saturn/xiaofei/Xiaoqing/tophat_out/HuR_S1_tophat.bam
/Volumes/saturn/xiaofei/Xiaoqing/tophat_out/HuR_S2_tophat.bam
Control alignment files:
/Volumes/saturn/xiaofei/Xiaoqing/tophat_out/IgG_S1_tophat.bam
/Volumes/saturn/xiaofei/Xiaoqing/tophat_out/IgG_S2_tophat.bam
*II. Analyzing RIP library:
**A. Process and combine alignment files
Processing /Volumes/saturn/xiaofei/Xiaoqing/tophat_out/HuR_S1_tophat.bam ... All hits are returned with flags.
Processing /Volumes/saturn/xiaofei/Xiaoqing/tophat_out/HuR_S2_tophat.bam ... All hits are returned with flags.
2 BAM files are combined
*** Only reads from strand * will be considered.
*** Only unique hits are used to compute read count.
*** 1 do not have any alignment.
*** 10 do not have any alignment.
*** 11 do not have any alignment.
*** 12 do not have any alignment.
*** 13 do not have any alignment.
*** 14 do not have any alignment.
*** 15 do not have any alignment.
*** 16 do not have any alignment.
*** 17 do not have any alignment.
*** 18 do not have any alignment.
*** 19 do not have any alignment.
*** 2 do not have any alignment.
*** 20 do not have any alignment.
*** 21 do not have any alignment.
*** 22 do not have any alignment.
*** 3 do not have any alignment.
*** 4 do not have any alignment.
*** 5 do not have any alignment.
*** 6 do not have any alignment.
*** 7 do not have any alignment.
*** 8 do not have any alignment.
*** 9 do not have any alignment.
*** MT do not have any alignment.
*** X do not have any alignment.
*** Y do not have any alignment.
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
subscript contains NAs or out-of-bounds indices
> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
Running under: OS X 10.8.5 (Mountain Lion)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.22.0 BiocInstaller_1.16.5 RIPSeeker_1.6.0 rtracklayer_1.26.3 GenomicAlignments_1.2.2
[6] Rsamtools_1.18.3 Biostrings_2.34.1 XVector_0.6.0 GenomicRanges_1.18.4 GenomeInfoDb_1.2.5
[11] IRanges_2.0.1 S4Vectors_0.4.0 BiocGenerics_0.12.1
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.28.2 base64enc_0.1-2 BatchJobs_1.6 BBmisc_1.9 Biobase_2.26.0 BiocParallel_1.0.3
[7] bitops_1.0-6 brew_1.0-6 checkmate_1.5.3 codetools_0.2-11 DBI_0.3.1 digest_0.6.8
[13] fail_1.2 foreach_1.4.2 iterators_1.0.7 magrittr_1.5 RCurl_1.95-4.6 RSQLite_1.0.0
[19] sendmailR_1.2-1 stringi_0.4-1 stringr_1.0.0 tools_3.1.3 XML_3.98-1.1 zlibbioc_1.12.0
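
A note on the "do not have any alignment" messages in the log above: the sequence names printed there (1, 2, ..., MT, X, Y) are Ensembl/GRCh37 style, whereas UCSC hg19 names its sequences chr1, chr2, ..., chrM. Whether this naming difference is what triggers the messages and the NSBS error inside ripSeek is an assumption, not something the output confirms. The base-R sketch below only compares the two naming styles. The helper to_ucsc_style is hypothetical (it is not part of RIPSeeker), and the names are copied by hand from the log rather than read from the BAM header.

```r
# Sequence names exactly as they appear in the ripSeek log above.
# With a real file they could instead be read from the BAM header, e.g.
#   names(Rsamtools::scanBamHeader(bamFile)[[1]]$targets)
bam_seqnames <- c(as.character(1:22), "MT", "X", "Y")

# Hypothetical helper: map Ensembl-style names to UCSC-style names.
to_ucsc_style <- function(x) {
  x <- as.character(x)
  needs_prefix <- !grepl("^chr", x)           # names that lack the "chr" prefix
  x[needs_prefix] <- paste0("chr", x[needs_prefix])
  x[x == "chrMT"] <- "chrM"                   # UCSC calls the mitochondrion chrM
  x
}

ucsc_seqnames <- to_ucsc_style(bam_seqnames)

# How many names from the BAM files already match the UCSC style?
n_matching <- sum(bam_seqnames %in% ucsc_seqnames)
cat("names already in UCSC style:", n_matching, "of", length(bam_seqnames), "\n")
```

If none of the names match, one thing to try (again, only a guess at the cause) would be to realign against a UCSC-style reference or rename the sequences in the BAM header before calling ripSeek with genomeBuild = "hg19".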