Question: STRING_id mismatch with STRINGdb
Samuel Lee wrote:

Hi, I'm trying to use STRINGdb to parse the STRING PPI network in R. However, the get_neighbors() method fails with the following error: "Error in as.igraph.vs(graph, v) : Invalid vertex names". It seems none of the mapped STRING_ids are included in the graph, even though the network can be plotted.

My end goal is to be able to programmatically retrieve neighbors (and their interactions) of an arbitrary path length from specified genes.

Any insight as to how I can fix this, or if there is an alternative method I should use, would be appreciated.

library(STRINGdb)

sbd <- STRINGdb$new(
  version = "10",
  species = 9606, 
  score_threshold = 0, 
  input_directory = "A:/~~~~~~~"
  )

data(diff_exp_example1)
# example data from STRINGdb vignette 

example1_mapped <- sbd$map(diff_exp_example1, "gene", removeUnmappedRows = TRUE )

dim(example1_mapped)
# [1] 17748     4

sbd$plot_network(example1_mapped$STRING_id[1:100])
# This works, 100 vertices, 106 edges

sbd$get_neighbors(example1_mapped$STRING_id[1:100])
# Error in as.igraph.vs(graph, v) : Invalid vertex names

sgrph <- sbd$get_graph()

length(igraph::V(sgrph))
# [1] 19247

sum(example1_mapped$STRING_id %in% igraph::V(sgrph))
# [1] 0

sessionInfo()
# R version 3.5.3 (2019-03-11)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows >= 8 x64 (build 9200)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
# [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] STRINGdb_1.22.0
# 
# loaded via a namespace (and not attached):
#   [1] igraph_1.2.4       hash_2.2.6.1       Rcpp_1.0.1         magrittr_1.5       bit_1.1-14         blob_1.1.1        
# [7] plyr_1.8.4         caTools_1.17.1.2   tools_3.5.3        png_0.1-7          plotrix_3.7-4      KernSmooth_2.23-15
# [13] DBI_1.0.0          gtools_3.8.1       yaml_2.2.0         bit64_0.9-7        digest_0.6.18      RColorBrewer_1.1-2
# [19] bitops_1.0-6       RCurl_1.95-4.12    memoise_1.1.0      RSQLite_2.1.1      gsubfn_0.7         gdata_2.18.0      
# [25] compiler_3.5.3     gplots_3.0.1.1     chron_2.3-53       sqldf_0.4-11       proto_1.0.0        pkgconfig_2.0.2 
Tags: stringdb, igraph

written 4 months ago by Samuel Lee

I realised that the line

sum(example1_mapped$STRING_id %in% igraph::V(sgrph))

should be

sum(example1_mapped$STRING_id %in% get.vertex.attribute(sgrph, "name"))
# [1] 17528

which does show that the vertex names are in fact comparable... sadly it doesn't get me any closer to working out why the get_neighbors() method fails.
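
Comparing the two counts in this thread (17528 matching rows out of the 17748 mapped rows) suggests that some mapped ids have no vertex in the graph, which would be enough to trigger "Invalid vertex names". Below is a minimal sketch of one way to sidestep that with igraph directly: keep only the ids that are present as vertex names, then collect neighborhoods of a chosen path length with ego(). The toy graph and the ids "A" to "F" and "Z" are made up for illustration; with real data, g would be the object returned by get_graph() and ids the STRING_id column.

```r
library(igraph)

# Toy stand-in for the graph returned by get_graph(); vertex names are made up.
g <- graph_from_literal(A - B, B - C, C - D, E - F)

# Hypothetical query: "Z" plays the role of a mapped id with no vertex in the graph.
ids <- c("A", "E", "Z")

# Keep only ids that exist as vertex names; passing "Z" on would raise
# "Invalid vertex names".
present <- ids[ids %in% V(g)$name]

# Vertices within `order` steps of each remaining id (the seed itself is included).
hoods <- ego(g, order = 2, nodes = present)
names(hoods) <- present
neighbor_names <- lapply(hoods, function(vs) sort(vs$name))

# Interactions among everything that was reached, as an induced subgraph.
sub <- induced_subgraph(g, unique(unlist(neighbor_names)))
```

Changing `order` changes the path length, and the edges of `sub` are the interactions among the seeds and their neighbors.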

written 4 months ago by Samuel Lee

It seems that igraph throws this error when one of the requested nodes (proteins) is not connected to anything, i.e. is not part of the graph. A workaround is to call the get_neighbors() method with one node at a time and wrap each call in tryCatch().
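
A minimal sketch of that one-node-at-a-time approach follows. The helper name safe_neighbors is hypothetical and not part of STRINGdb, and mock_db is only a stand-in object so the snippet runs without downloading STRING data; the only thing assumed about the real object is that it has a get_neighbors() method, as used in the question.

```r
# Hypothetical helper (not part of STRINGdb): query one id at a time and
# skip any id whose lookup raises an error.
safe_neighbors <- function(string_db, ids) {
  out <- lapply(ids, function(id) {
    tryCatch(string_db$get_neighbors(id), error = function(e) NULL)
  })
  names(out) <- ids
  # Drop the ids that failed, keeping a named list of neighbor vectors.
  Filter(Negate(is.null), out)
}

# Mock object standing in for a STRINGdb instance (assumption for illustration).
mock_db <- list(get_neighbors = function(id) {
  if (id == "missing") stop("Invalid vertex names")
  paste0(id, c("_n1", "_n2"))
})

res <- safe_neighbors(mock_db, c("p1", "missing", "p2"))
```

With the objects from the question, the call would presumably look like safe_neighbors(sbd, example1_mapped$STRING_id[1:100]); ids that are absent from the result are the ones that failed.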

We are in the process of updating the whole package. This problem should be solved in the next release.

written 4 months ago by damian.szk (modified 4 months ago)