Note: I have also posted this on GitHub, which seems the more appropriate place, but I do not know how to remove this post.
I used ChIPseeker to annotate mouse ChIP-seq data (aligned to GRCm38). I am interested in the distribution of all the reads, not only of the peaks. For this purpose, I downsampled the alignment BAM files to 1M reads and converted them to BED format (I hope this is kosher). The TxDb was created from the Ensembl database via BioMart.
Problem: the geneChr column in the annotated output, which should reflect the chromosome of the nearest gene, makes no sense: the chromosome numbers do not exist in mouse. See below.
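library(ChIPseeker)
library(GenomicFeatures)
txdb <- makeTxDbFromBiomart(dataset="mmusculus_gene_ensembl")
file_list <- list(WT = "1.bed", CKO = "2.bed")
# Read each BED file into a GRanges object (assumed step; `files` is used below)
files <- lapply(file_list, readPeakFile)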
# Checking to see the chromosome names are ok in the files
unique(as.data.frame(files[[1]])$seqnames)
[1] 1 2 3 4 5 6 7 8 9
[10] 10 11 12 13 14 15 16 17 18
[19] 19 X Y MT GL456233.1 GL456211.1 JH584304.1 GL456379.1 GL456216.1
[28] GL456393.1 GL456366.1 GL456383.1 GL456360.1 GL456378.1 GL456389.1 GL456370.1 GL456390.1 GL456394.1
[37] GL456392.1 GL456396.1 GL456368.1
39 Levels ...
files_anno <- lapply(files, annotatePeak, TxDb=txdb, tssRegion = c(-3000,3000), verbose=TRUE)
# Why do the gene chromosomes look like this?
unique(as.data.frame(files_anno[[1]])$geneChr)
[1] 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 103 97 139
[26] 100
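A possible explanation, not confirmed against the ChIPseeker source: the geneChr values may be integer codes of a factor of chromosome names (that is, positions in the TxDb's seqlevels) rather than the names themselves. If the Ensembl-derived TxDb lists many scaffolds before the primary chromosomes, chromosome 1 could end up with a code such as 74. The base-R sketch below only illustrates that mechanism; the `seq_levels` ordering is hypothetical and is not the real ordering of the TxDb.

```r
# Hypothetical seqlevels ordering (an assumption for illustration only):
# scaffolds come first, primary chromosomes afterwards.
seq_levels <- c("GL456210.1", "GL456211.1", "1", "2", "X")

# Chromosome names of the nearest genes, stored as a factor with those levels.
gene_chr_factor <- factor(c("1", "2", "X", "1"), levels = seq_levels)

# Coercing a factor to integer yields the level positions, not the labels.
gene_chr_codes <- as.integer(gene_chr_factor)
print(gene_chr_codes)  # 3 4 5 3

# Indexing the levels by the codes recovers the chromosome names.
recovered <- seq_levels[gene_chr_codes]
print(recovered)
```

If this is what is happening, then something like `seqlevels(txdb)[as.data.frame(files_anno[[1]])$geneChr]` should return sensible chromosome names; that lookup is an untested assumption and worth checking on a few known genes first.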
sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] diffloop_1.12.0 GenomicFeatures_1.36.4 AnnotationDbi_1.46.1 Biobase_2.44.0
[5] GenomicRanges_1.36.1 GenomeInfoDb_1.20.0 IRanges_2.18.3 S4Vectors_0.22.1
[9] BiocGenerics_0.30.0 ChIPseeker_1.20.0
loaded via a namespace (and not attached):
[1] fgsea_1.10.1 colorspace_2.0-0
[3] ellipsis_0.3.1 ggridges_0.5.2
[5] qvalue_2.16.0 XVector_0.24.0
[7] base64enc_0.1-3 rstudioapi_0.13
[9] farver_2.0.3 urltools_1.7.3
[11] graphlayouts_0.7.1 ggrepel_0.8.2
[13] bit64_4.0.5 xml2_1.3.2
[15] codetools_0.2-18 splines_3.6.3
[17] GOSemSim_2.10.0 knitr_1.28
[19] polyclip_1.10-0 jsonlite_1.7.1
[21] Rsamtools_2.0.3 gridBase_0.4-7
[23] GO.db_3.8.2 ggforce_0.3.2
[25] readr_1.4.0 BiocManager_1.30.10
[27] compiler_3.6.3 httr_1.4.2
[29] rvcheck_0.1.8 Matrix_1.2-18
[31] limma_3.40.6 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[33] tweenr_1.0.1 htmltools_0.5.0
[35] prettyunits_1.1.1 tools_3.6.3
[37] igraph_1.2.6 gtable_0.3.0
[39] glue_1.4.2 GenomeInfoDbData_1.2.1
[41] reshape2_1.4.4 DO.db_2.9
[43] dplyr_1.0.2 fastmatch_1.1-0
[45] Rcpp_1.0.5 enrichplot_1.4.0
[47] vctrs_0.3.5 Biostrings_2.52.0
[49] rtracklayer_1.44.4 iterators_1.0.13
[51] ggraph_2.0.4 xfun_0.13
[53] stringr_1.4.0 lifecycle_0.2.0
[55] gtools_3.8.2 statmod_1.4.35
[57] XML_3.99-0.3 Sushi_1.22.0
[59] DOSE_3.10.2 edgeR_3.26.8
[61] zoo_1.8-8 europepmc_0.4
[63] zlibbioc_1.30.0 MASS_7.3-53
[65] scales_1.1.1 tidygraph_1.2.0
[67] hms_0.5.3 SummarizedExperiment_1.14.1
[69] RColorBrewer_1.1-2 yaml_2.2.1
[71] curl_4.3 pbapply_1.4-3
[73] memoise_1.1.0 gridExtra_2.3
[75] ggplot2_3.3.2 UpSetR_1.4.0
[77] biomaRt_2.40.5 triebeard_0.3.0
[79] stringi_1.5.3 RSQLite_2.2.1
[81] foreach_1.5.1 plotrix_3.7-8
[83] caTools_1.18.0 boot_1.3-25
[85] BiocParallel_1.18.1 rlang_0.4.8
[87] pkgconfig_2.0.3 matrixStats_0.57.0
[89] bitops_1.0-6 evaluate_0.14
[91] lattice_0.20-41 purrr_0.3.4
[93] labeling_0.4.2 GenomicAlignments_1.20.1
[95] cowplot_1.1.0 bit_4.0.4
[97] tidyselect_1.1.0 plyr_1.8.6
[99] magrittr_2.0.1 R6_2.5.0
[101] gplots_3.1.0 generics_0.1.0
[103] DelayedArray_0.10.0 DBI_1.1.0
[105] pillar_1.4.6 RCurl_1.98-1.2
[107] tibble_3.0.4 crayon_1.3.4
[109] KernSmooth_2.23-18 rmarkdown_2.1
[111] viridis_0.5.1 progress_1.2.2
[113] locfit_1.5-9.4 grid_3.6.3
[115] data.table_1.13.2 blob_1.2.1
[117] digest_0.6.27 tidyr_1.1.2
[119] gridGraphics_0.5-0 munsell_0.5.0
[121] viridisLite_0.3.0 ggplotify_0.0.5