Hello,
I ran a DESeq2 analysis on my RNA-seq data to find differentially expressed genes. I then wanted to look at gene sets, so I used the gage package. I can export the results and visualise them in pathview. Can I take the output from gage and use it to make a dot plot like the one shown here https://bioinformatics-core-shared-training.github.io/cruk-summer-school-2018/RNASeq2018/html/06_Gene_set_testing.nb.html#go-enrichment-analysis, which is drawn from goseq results?
The code for this plot, taken from that page, is:
goResults %>%
    top_n(10, wt = -over_represented_pvalue) %>%
    mutate(hitsPerc = numDEInCat * 100 / numInCat) %>%
    ggplot(aes(x = hitsPerc,
               y = term,
               colour = over_represented_pvalue,
               size = numDEInCat)) +
    geom_point() +
    expand_limits(x = 0) +
    labs(x = "Hits (%)", y = "GO term", colour = "p value", size = "Count")
Some explanations of the goseq columns:
over_represented_pvalue: p-value for over-representation of the term among the differentially expressed genes
numDEInCat: number of differentially expressed genes in this category
numInCat: number of genes in this category
term: description of the term
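To make the x-axis of the goseq plot concrete: as I understand it, hitsPerc is simply the share of genes in a category that are differentially expressed. Here is a minimal sketch with a toy table (the data frame `toy` and all of its values are made up for illustration; they are not from my data):

```r
# Toy goseq-style table; every value here is invented for illustration.
toy <- data.frame(
  term       = c("term A", "term B"),
  numDEInCat = c(12, 5),    # differentially expressed genes in the category
  numInCat   = c(48, 200)   # all genes in the category
)

# Percentage of genes in each category that are differentially expressed
toy$hitsPerc <- toy$numDEInCat * 100 / toy$numInCat
print(toy)
```

So a category with 12 DE genes out of 48 gives 25%, which is what the "Hits (%)" axis shows.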
With the gage package, I do the following:
library(gage)
library(dplyr)
library(ggplot2)

# res.fc: fold changes from the DESeq results; go.bp.gs: GO biological process gene sets
fc.go.bp.p <- gage(res.fc, gsets = go.bp.gs)

# Convert the GO results to data frames
fc.go.bp.p.up <- as.data.frame(fc.go.bp.p$greater)
fc.go.bp.p.down <- as.data.frame(fc.go.bp.p$less)
fc.go.bp.p.down$term <- row.names(fc.go.bp.p.down)

# Plot the top 10 gene sets by q-value
fc.go.bp.p.down %>%
    top_n(10, wt = -q.val) %>%
    mutate(hitsPerc = exp1 * 100) %>%
    ggplot(aes(x = hitsPerc,
               y = term,
               colour = q.val,
               size = set.size)) +
    geom_point() +
    expand_limits(x = 0) +
    labs(x = "Hits (%)", y = "GO term", colour = "p value", size = "Count")
I am not sure whether exp1 is a percentage, because that column is not explained in the R documentation. Will the plot as I have written it be analogous to the one made from goseq output? The columns in the gage output are p.geomean, stat.mean, p.val, q.val, set.size and exp1.
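For comparison, here is a possible alternative that does not rely on exp1 at all. It is only a sketch under assumptions: `toy_less` is a hypothetical stand-in with the same column names as my gage output (all values invented), and I am assuming from the gage help page that stat.mean is the mean of the individual gene set test statistics and set.size is the gene set size, so the x-axis is a test statistic rather than a percentage of hits.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for as.data.frame(fc.go.bp.p$less); all values invented.
toy_less <- data.frame(
  p.geomean = c(0.001, 0.02, 0.30),
  stat.mean = c(-4.1, -2.3, -0.5),
  p.val     = c(0.001, 0.02, 0.30),
  q.val     = c(0.01, 0.08, 0.60),
  set.size  = c(40, 120, 15),
  exp1      = NA_real_,  # placeholder: I do not know what this column holds
  row.names = c("set A", "set B", "set C")
)
toy_less$term <- row.names(toy_less)

# Keep the gene sets with the smallest q-values (2 here; 10 on real data)
top_sets <- toy_less %>%
  filter(!is.na(q.val)) %>%
  slice_min(q.val, n = 2, with_ties = FALSE)

# Dot plot: test statistic on x, coloured by q-value, sized by gene set size
p <- ggplot(top_sets, aes(x = stat.mean, y = term,
                          colour = q.val, size = set.size)) +
  geom_point() +
  labs(x = "Mean test statistic (stat.mean)", y = "GO term",
       colour = "q value", size = "Set size")
print(p)
```

Labelling the legends "q value" and "Set size" keeps the plot honest about what is drawn, since none of the gage columns I listed is obviously a count of differentially expressed genes.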
Thanks, Maria
sessionInfo( )
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] rtracklayer_1.54.0 goseq_1.46.0
[3] geneLenDataBase_1.30.0 BiasedUrn_1.07
[5] pathview_1.32.0 org.Hs.eg.db_3.13.0
[7] GO.db_3.13.0 AnnotationDbi_1.56.2
[9] gage_2.42.0 ggbeeswarm_0.6.0
[11] RColorBrewer_1.1-2 vsn_3.60.0
[13] pcaExplorer_2.18.0 UpSetR_1.4.0
[15] pheatmap_1.0.12 magrittr_2.0.2
[17] DEFormats_1.20.0 reshape2_1.4.4
[19] ggpubr_0.4.0 DESeq2_1.32.0
[21] SummarizedExperiment_1.22.0 Biobase_2.52.0
[23] MatrixGenerics_1.4.3 matrixStats_0.61.0
[25] GenomicRanges_1.44.0 GenomeInfoDb_1.28.4
[27] IRanges_2.26.0 S4Vectors_0.30.0
[29] BiocGenerics_0.38.0 forcats_0.5.1
[31] stringr_1.4.0 dplyr_1.0.8
[33] purrr_0.3.4 readr_2.1.2
[35] tidyr_1.2.0 tibble_3.1.6
[37] ggplot2_3.3.5 tidyverse_1.3.1
[39] edgeR_3.34.1 limma_3.48.3
loaded via a namespace (and not attached):
[1] utf8_1.2.2 shinydashboard_0.7.2
[3] tidyselect_1.1.2 heatmaply_1.3.0
[5] RSQLite_2.2.10 htmlwidgets_1.5.4
[7] grid_4.1.2 TSP_1.1-11
[9] BiocParallel_1.26.2 munsell_0.5.0
[11] preprocessCore_1.54.0 codetools_0.2-18
[13] DT_0.21 withr_2.4.3
[15] colorspace_2.0-2 Category_2.60.0
[17] filelock_1.0.2 knitr_1.37
[19] rstudioapi_0.13 ggsignif_0.6.3
[21] NMF_0.23.0 labeling_0.4.2
[23] KEGGgraph_1.54.0 GenomeInfoDbData_1.2.7
[25] topGO_2.46.0 farver_2.1.0
[27] bit64_4.0.5 vctrs_0.3.8
[29] generics_0.1.2 xfun_0.29
[31] BiocFileCache_2.2.1 R6_2.5.1
[33] doParallel_1.0.17 seriation_1.3.2
[35] locfit_1.5-9.4 bitops_1.0-7
[37] cachem_1.0.6 shinyAce_0.4.1
[39] DelayedArray_0.18.0 assertthat_0.2.1
[41] BiocIO_1.4.0 promises_1.2.0.1
[43] scales_1.1.1 beeswarm_0.4.0
[45] gtable_0.3.0 affy_1.70.0
[47] rlang_1.0.1 genefilter_1.74.0
[49] splines_4.1.2 rstatix_0.7.0
[51] lazyeval_0.2.2 shinyBS_0.61
[53] broom_0.7.12 checkmate_2.0.0
[55] yaml_2.3.5 BiocManager_1.30.16
[57] abind_1.4-5 modelr_0.1.8
[59] GenomicFeatures_1.46.5 threejs_0.3.3
[61] crosstalk_1.2.0 backports_1.4.1
[63] httpuv_1.6.5 RBGL_1.68.0
[65] tools_4.1.2 gridBase_0.4-7
[67] affyio_1.62.0 ellipsis_0.3.2
[69] Rcpp_1.0.8 plyr_1.8.6
[71] base64enc_0.1-3 progress_1.2.2
[73] zlibbioc_1.38.0 RCurl_1.98-1.6
[75] prettyunits_1.1.1 viridis_0.6.2
[77] haven_2.4.3 ggrepel_0.9.1
[79] cluster_2.1.2 fs_1.5.2
[81] data.table_1.14.2 SparseM_1.81
[83] reprex_2.0.1 hms_1.1.1
[85] mime_0.12 evaluate_0.15
[87] xtable_1.8-4 XML_3.99-0.8
[89] readxl_1.3.1 gridExtra_2.3
[91] compiler_4.1.2 biomaRt_2.50.3
[93] crayon_1.5.0 htmltools_0.5.2
[95] GOstats_2.60.0 mgcv_1.8-39
[97] later_1.3.0 tzdb_0.2.0
[99] geneplotter_1.72.0 lubridate_1.8.0
[101] DBI_1.1.2 dbplyr_2.1.1
[103] MASS_7.3-55 rappdirs_0.3.3
[105] Matrix_1.4-0 car_3.0-12
[107] cli_3.2.0 glmpca_0.2.0
[109] igraph_1.2.11 pkgconfig_2.0.3
[111] GenomicAlignments_1.30.0 registry_0.5-1
[113] plotly_4.10.0 xml2_1.3.3
[115] foreach_1.5.2 annotate_1.72.0
[117] vipor_0.4.5 rngtools_1.5.2
[119] pkgmaker_0.32.2 webshot_0.5.2
[121] XVector_0.32.0 AnnotationForge_1.36.0
[123] rvest_1.0.2 digest_0.6.29
[125] graph_1.70.0 Biostrings_2.60.2
[127] rmarkdown_2.11 cellranger_1.1.0
[129] dendextend_1.15.2 GSEABase_1.56.0
[131] restfulr_0.0.13 curl_4.3.2
[133] Rsamtools_2.10.0 shiny_1.7.1
[135] rjson_0.2.21 nlme_3.1-155
[137] lifecycle_1.0.1 jsonlite_1.7.3
[139] carData_3.0-5 viridisLite_0.4.0
[141] fansi_1.0.2 pillar_1.7.0
[143] lattice_0.20-45 KEGGREST_1.34.0
[145] fastmap_1.1.0 httr_1.4.2
[147] survival_3.2-13 glue_1.6.1
[149] png_0.1-7 iterators_1.0.14
[151] bit_4.0.4 Rgraphviz_2.36.0
[153] stringi_1.7.6 blob_1.2.2
[155] memoise_2.0.1