Entering edit mode
divyswar01
•
0
@divyswar01-11905
Last seen 6.0 years ago
I am using cpGDensityCalc function in repitools package to calculate the cpg density for my data. Following is the code that I am using.
library(Repitools)
library(BSgenome.Hsapiens.UCSC.hg19)
cpgd<-Repitools::cpgDensityCalc(granges(bs),organism=Hsapiens,window =
1000)
and when I look at min(cpgd)
it is giving me 0 and there are several cpgd equal to zero. Does that mean that the those CpGs are in zero density region. That doesn't make sense because how could a 1kB window have 0 CpG density and still have a CpG in it.
bs
in my code is a bsseq object with coverage and methylation for each CpG. I am confused by these results, any help would be appreciated!
This is how my granges(bs)
looks like
GRanges object with 11598181 ranges and 0 metadata columns: seqnames ranges strand <Rle> <IRanges> <Rle> [1] chr1 [10497, 10497] * [2] chr1 [10525, 10525] * [3] chr1 [10542, 10542] * [4] chr1 [10563, 10563] * [5] chr1 [10571, 10571] * ... ... ... ... [11598177] chrY [56886944, 56886944] * [11598178] chrY [56886954, 56886954] * [11598179] chrY [56887025, 56887025] * [11598180] chrY [56887089, 56887089] * [11598181] chrY [56887099, 56887099] * ------- seqinfo: 24 sequences from an unspecified genome; no seqlengths
Following is my sessionInfo()
R version 3.2.2 (2015-08-14) Platform: x86_64-redhat-linux-gnu (64-bit) Running under: Red Hat Enterprise Linux Server release 6.7 (Santiago) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] gplots_3.0.1 BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.38.0 [4] rtracklayer_1.30.4 Biostrings_2.38.4 XVector_0.10.0 [7] Repitools_1.16.0 mgcv_1.8-16 nlme_3.1-131 [10] AnnotationHub_2.2.5 snpar_1.0 MASS_7.3-45 [13] ape_4.0 bsseq_1.6.0 SummarizedExperiment_1.0.2 [16] Biobase_2.30.0 tsne_0.1-3 GenomicRanges_1.22.4 [19] GenomeInfoDb_1.6.3 IRanges_2.4.8 S4Vectors_0.8.11 [22] BiocGenerics_0.16.1 BiocInstaller_1.20.3 proxy_0.4-16 [25] reshape2_1.4.2 ggplot2_2.2.1 dplyr_0.5.0 loaded via a namespace (and not attached): [1] bitops_1.0-6 matrixStats_0.51.0 httr_1.2.1 [4] R.cache_0.12.0 tools_3.2.2 affyio_1.40.0 [7] R6_2.2.0 KernSmooth_2.23-15 DBI_0.5-1 [10] lazyeval_0.2.0 colorspace_1.3-2 DNAcopy_1.44.0 [13] gsmoothr_0.1.7 preprocessCore_1.32.0 labeling_0.3 [16] caTools_1.17.1 scales_0.4.1 affy_1.48.0 [19] genefilter_1.52.1 stringr_1.1.0 digest_0.6.12 [22] Rsamtools_1.22.0 R.utils_2.5.0 base64enc_0.1-3 [25] htmltools_0.3.5 limma_3.26.9 R.huge_0.9.0 [28] RSQLite_1.1-2 shiny_1.0.0 BiocParallel_1.4.3 [31] gtools_3.5.0 R.oo_1.21.0 RCurl_1.95-4.8 [34] magrittr_1.5 R.devices_2.15.1 futile.logger_1.4.3 [37] Matrix_1.2-8 Rcpp_0.12.9 munsell_0.4.3 [40] R.methodsS3_1.7.1 vsn_3.38.0 stringi_1.1.2 [43] edgeR_3.12.1 zlibbioc_1.16.0 plyr_1.8.4 [46] grid_3.2.2 gdata_2.17.0 listenv_0.6.0 [49] aroma.apd_0.6.0 lattice_0.20-34 splines_3.2.2 [52] annotate_1.48.0 R.filesets_2.11.0 locfit_1.5-9.1 [55] PSCBS_0.62.0 codetools_0.2-15 futile.options_1.0.0 [58] XML_3.98-1.6 lambda.r_1.1.9 data.table_1.10.4 [61] httpuv_1.3.3 gtable_0.2.0 future_1.4.0 [64] Ringo_1.34.0 assertthat_0.1 mime_0.5 [67] aroma.affymetrix_3.1.0 xtable_1.8-2 aroma.core_3.1.0 [70] Rsolnp_1.16 survival_2.40-1 truncnorm_1.0-7 [73] tibble_1.2 GenomicAlignments_1.6.3 AnnotationDbi_1.32.3 [76] memoise_1.0.0 cluster_2.0.5 globals_0.9.0 [79] interactiveDisplayBase_1.8.0 R.rsp_0.40.0
I am guessing it is a problem with the reference genome that I have been using. I will update after checking on that.
As I have though it it the wrong genomic build that lead to this spurious results.