Dear all,
I'm having trouble setting a background and then running the enrichment analysis with the STRINGdb package.
Specifically, when I set the background and run the enrichment analysis, I get a data.frame with the appropriate column names but no rows.
Running the same script but without the steps setting the background gives an expected result for the enrichment analysis.
Running the STRING analysis and plotting the network seems to give expected results after setting the background.
Unfortunately I'm unable to supply the original data as it's too large to share, but the object called input in the code below is mass-spec proteomics data, formatted as input for the MSstats package.
Please excuse me if I forgot to include some details - this is my first time posting.
Many thanks, Sam
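# Load packages (dplyr supplies the %>% pipe used below)
library(STRINGdb)
library(dplyr)
# Initialise STRING
string_db <- STRINGdb$new(version = "11.5",
                          species = 9606,
                          score_threshold = 700,
                          network_type = "full",
                          input_directory = "")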
# Map whole dataset and use to set background
string_background <- string_db$map(as.data.frame(input), "ProteinName", removeUnmappedRows = TRUE) %>%
  .$STRING_id %>%
  unique()
string_db$set_background(string_background)
# New database using background
string_db <- STRINGdb$new(version = "11.5",
                          species = 9606,
                          score_threshold = 700,
                          network_type = "full",
                          input_directory = "",
                          backgroundV = string_background)
# Run enrichment
STRING_enrichment <- string_db$get_enrichment(STRING_dataset$STRING_id)
head(STRING_enrichment)
[1] category term number_of_genes
[4] number_of_genes_in_background ncbiTaxonId inputGenes
[7] preferredNames p_value fdr
[10] description
<0 rows> (or 0-length row.names)
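One check that may help narrow this down is whether every STRING ID passed to get_enrichment() is also present in the background vector. The following is only a minimal base-R sketch and is not confirmed to be the cause of the empty result. The two vectors are made-up placeholders. In the real script they would be string_background and unique(STRING_dataset$STRING_id), where STRING_dataset is the object used above whose construction is not shown.

```r
# Placeholder IDs for illustration only (not real data).
# In the real script, use:
#   background_ids <- string_background
#   hit_ids        <- unique(STRING_dataset$STRING_id)
background_ids <- c("9606.PLACEHOLDER_A", "9606.PLACEHOLDER_B", "9606.PLACEHOLDER_C")
hit_ids        <- c("9606.PLACEHOLDER_A", "9606.PLACEHOLDER_C", "9606.PLACEHOLDER_D")

# Flag which query IDs are present in the background
in_background <- hit_ids %in% background_ids

# Collect the query IDs that are absent from the background
missing_ids <- hit_ids[!in_background]

cat("Query IDs:", length(hit_ids), "\n")
cat("Found in background:", sum(in_background), "\n")
cat("Missing from background:", length(missing_ids), "\n")
print(missing_ids)
```

If any IDs are reported as missing, or if either vector contains NA values or duplicates, that would be worth including in the post alongside the sessionInfo() output.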
sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_New Zealand.utf8 LC_CTYPE=English_New Zealand.utf8 LC_MONETARY=English_New Zealand.utf8
[4] LC_NUMERIC=C LC_TIME=English_New Zealand.utf8
attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] STRINGdb_2.10.1 org.Hs.eg.db_3.16.0 AnnotationDbi_1.60.2 IRanges_2.32.0 S4Vectors_0.36.2
[6] Biobase_2.58.0 BiocGenerics_0.44.0 clusterProfiler_4.6.2 ComplexHeatmap_2.14.0 EnhancedVolcano_1.16.0
[11] ggrepel_0.9.3 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2
[16] purrr_1.0.1 readr_2.1.4 tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.2
[21] tidyverse_2.0.0 MSstats_4.6.0