Entering edit mode
sophialovechan
▴
10
@sophialovechan-15987
Last seen 6.4 years ago
Hi, I am using BSgenome.Hsapiens.UCSC.hg19 trying to add GC bias into my ATAC sequencing data. But I kept having a problem when loading hg19 genome. Could anyone help me with this?
> counts_GC <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg19) Error in loadFUN(x, seqname, ranges) : trying to load regions beyond the boundaries of non-circular sequence "chr17" > sessionInfo() R version 3.5.0 (2018-04-23) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS High Sierra 10.13.4 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] BiocInstaller_1.30.0 BSgenome.Hsapiens.UCSC.hg19_1.4.0 [3] BSgenome_1.48.0 rtracklayer_1.40.2 [5] Biostrings_2.48.0 XVector_0.20.0 [7] SummarizedExperiment_1.10.1 DelayedArray_0.6.0 [9] BiocParallel_1.14.1 matrixStats_0.53.1 [11] Biobase_2.40.0 GenomicRanges_1.32.3 [13] GenomeInfoDb_1.16.0 IRanges_2.14.10 [15] S4Vectors_0.18.2 BiocGenerics_0.26.0 [17] Matrix_1.2-14 motifmatchr_1.2.0 [19] chromVAR_1.2.0 loaded via a namespace (and not attached): [1] bitops_1.0-6 DirichletMultinomial_1.22.0 TFBSTools_1.18.0 [4] bit64_0.9-7 httr_1.3.1 tools_3.5.0 [7] R6_2.2.2 DT_0.4 seqLogo_1.46.0 [10] DBI_1.0.0 lazyeval_0.2.1 colorspace_1.3-2 [13] tidyselect_0.2.4 bit_1.1-14 compiler_3.5.0 [16] plotly_4.7.1 caTools_1.17.1 scales_0.5.0 [19] readr_1.1.1 stringr_1.3.1 digest_0.6.15 [22] Rsamtools_1.32.0 R.utils_2.6.0 pkgconfig_2.0.1 [25] htmltools_0.3.6 htmlwidgets_1.2 rlang_0.2.1 [28] RSQLite_2.1.1 VGAM_1.0-5 shiny_1.1.0 [31] bindr_0.1.1 jsonlite_1.5 gtools_3.5.0 [34] dplyr_0.7.5 R.oo_1.22.0 RCurl_1.95-4.10 [37] magrittr_1.5 GO.db_3.6.0 GenomeInfoDbData_1.1.0 [40] Rcpp_0.12.17 munsell_0.4.3 R.methodsS3_1.7.1 [43] stringi_1.2.2 zlibbioc_1.26.0 plyr_1.8.4 [46] grid_3.5.0 blob_1.1.1 promises_1.0.1 [49] miniUI_0.1.1.1 CNEr_1.16.0 lattice_0.20-35 [52] splines_3.5.0 annotate_1.58.0 hms_0.4.2 [55] KEGGREST_1.20.0 pillar_1.2.3 reshape2_1.4.3 [58] TFMPvalue_0.0.8 XML_3.98-1.11 glue_1.2.0 [61] data.table_1.11.4 png_0.1-7 httpuv_1.4.3 [64] tidyr_0.8.1 gtable_0.2.0 poweRlaw_0.70.1 [67] purrr_0.2.5 assertthat_0.2.0 ggplot2_2.2.1 [70] mime_0.5 xtable_1.8-2 later_0.7.2 [73] viridisLite_0.3.0 tibble_1.4.2 GenomicAlignments_1.16.0 [76] AnnotationDbi_1.42.1 memoise_1.1.0 bindrcpp_0.2.2
I tried hg38 as well but it run into the same error with chr1.
Well, you should check the rowRanges slot of your counts object, particularly for chr17, to see if there are sequences that are > 81195210. If so, you will need to figure out why.
Thank you very much. I am a biologist but not a bioinformatician so I kind of have no clues about your suggestion. But thanks anyway.
I looked at rowRanges slot of my counts objects and I didn't know where to find the information you mentioned. Here is an example of data I got from rowRanges.
Using the example data from chromVAR:
Yes. I saw the last line with sequences > 81195210
OK, so that's a problem. You need to figure out why you have counts from a region of chr17 that don't exist in the hg19 genome that you said you used. This isn't a Bioconductor question, really, as the software is doing what it is supposed to do. You just have data that don't fulfill expectations, so you need to run that down.
OK. Thanks for your help.
Or more directly
counts_GC <- addGCBias(counts, genome = BSgenome.Hsapiens.UCSC.hg38)
Error in loadFUN(x, seqname, ranges) :
trying to load regions beyond the boundaries of non-circular sequence "chr1"
I checked my alignment and I used hg19 when I run bowtie.