Question: topGO "invalid argument to unary operator"
fshodan wrote (27 days ago):
> genesList <- degs$P.Value
> names(genesList) <- degs$ENTREZID
> sampleGOdata <- - new("topGOdata",
+                       description = "Simple session", ontology = "BP",
+                       allGenes = genesList, geneSel = function(x){x<0.01},
+                       nodeSize = 10,
+                       annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
	( 462 GO terms found. )

Build GO DAG topology ..........
	( 2076 GO terms and 4544 relations. )

Annotating nodes ...............
	( 66 genes annotated to the GO terms. )
Error in -new("topGOdata", description = "Simple session", ontology = "BP",  : 
  invalid argument to unary operator

I'm not sure what exactly is wrong; it could be an older version of the package, namely 1.0.2, while R is 3.4.4. But I don't want to update, because the pipeline is fixed. Any help is much appreciated. I tried to debug, but it's a bit obscure. Below is the session information:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=uk_UA.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=uk_UA.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=uk_UA.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=uk_UA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] org.Hs.eg.db_3.5.0   topGO_2.30.1         SparseM_1.77         GO.db_3.5.0          AnnotationDbi_1.40.0
 [6] IRanges_2.12.0       S4Vectors_0.16.0     Biobase_2.38.0       graph_1.56.0         BiocGenerics_0.24.0 
[11] GOplot_1.0.2         RColorBrewer_1.1-2   gridExtra_2.3        ggdendro_0.1-20      ggplot2_3.1.0       
[16] enrichR_1.0         

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         pillar_1.3.0       compiler_3.4.4     plyr_1.8.4         bindr_0.1.1       
 [6] tools_3.4.4        digest_0.6.18      bit_1.1-14         lattice_0.20-35    memoise_1.1.0     
[11] RSQLite_2.1.1      tibble_1.4.2       gtable_0.2.0       pkgconfig_2.0.2    rlang_0.3.0.1     
[16] DBI_1.0.0          rstudioapi_0.8     curl_3.2           bindrcpp_0.2.2     withr_2.1.2       
[21] dplyr_0.7.8        httr_1.3.1         bit64_0.9-7        grid_3.4.4         tidyselect_0.2.5  
[26] glue_1.3.0         R6_2.3.0           blob_1.1.1         purrr_0.2.5        magrittr_1.5      
[31] matrixStats_0.54.0 scales_1.0.0       MASS_7.3-50        assertthat_0.2.0   colorspace_1.3-2  
[36] lazyeval_0.2.1     munsell_0.5.0      crayon_1.3.4       rjson_0.2.20      
 
 
modified 27 days ago by James W. MacDonald • written 27 days ago by fshodan

You will most likely need to give more information than that. Unless someone (probably Adrian Alexa) can reproduce the error you are seeing, it's not usually possible to debug. If your genesList isn't too massive, you could probably just paste the output from dput. More likely, though, you will need to save it as an .Rdata file and put it somewhere (say Dropbox) with a link so people can try to reproduce the problem.
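For anyone following along, sharing a small object this way might look roughly like the sketch below. The toy vector and its names are invented for illustration and are not the poster's data.

```r
# Toy stand-in for a named p-value vector (values and names are invented)
toy <- c(geneA = 0.001, geneB = 0.2, geneC = 0.03)

# dput() writes R code that recreates the object; capture it as text
# so it can be pasted into a post
txt <- paste(capture.output(dput(toy)), collapse = "\n")
cat(txt, "\n")

# Whoever reads the post can rebuild the object from the pasted text
rebuilt <- eval(parse(text = txt))

# For larger objects, save to a file and share a link to it instead
rdata_file <- file.path(tempdir(), "toy.RData")
save(toy, file = rdata_file)
```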

written 27 days ago by James W. MacDonald
> genesList <- degs$P.Value[1:5]
> names(genesList) <- degs$ENTREZID[1:5]
> genesList
      677818    107985744    102725022    105370787       260294 
3.234676e-04 1.478281e-05 9.095361e-06 2.126379e-03 5.187784e-03 
> sampleGOdata <- - new("topGOdata",
+                       description = "Simple session", ontology = "BP",
+                       allGenes = genesList, geneSel = function(x){x<0.01},
+                       nodeSize = 10,
+                       annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
	( 1 GO terms found. )

Build GO DAG topology ..........
	( 34 GO terms and 52 relations. )

Annotating nodes ...............
	( 1 genes annotated to the GO terms. )
Error in -new("topGOdata", description = "Simple session", ontology = "BP",  : 
  invalid argument to unary operator

> dput(genesList)
structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06, 0.00212637918363236, 0.0051877836422765), .Names = c("677818", "107985744", "102725022", "105370787", "260294"))

I reduced the data file; will this do for debugging? I also have an additional question: when I reduce the file even more (and obviously it cannot find any terms), it also gives an error:

Building most specific GOs .....
	( 0 GO terms found. )

Build GO DAG topology ..........
	( 0 GO terms and 0 relations. )
Error in if (is.na(index) || index < 0 || index > length(nd)) stop("vertex is not in graph: ",  : 
  missing value where TRUE/FALSE needed
modified 27 days ago • written 27 days ago by fshodan

James W. MacDonald wrote (27 days ago):

Ugh. I should have looked closer. Note the typo in the following line:

sampleGOdata <- - new("topGOdata",

The assignment operator in R is <- not <- -
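
To spell out what happens: R reads `<- -` as the assignment `<-` followed by a unary minus, so the minus is applied to whatever `new()` returns before the assignment takes place. Here is a minimal sketch using a made-up S4 class (`Demo` is purely illustrative and has nothing to do with topGO):

```r
library(methods)

# With a number, the stray minus silently negates the value
x <- - 5
print(x)

# 'Demo' is a hypothetical S4 class, used only to mimic an object
# returned by new()
setClass("Demo", representation(label = "character"))

# The object is built first, then unary minus is applied to it, which fails;
# the error message is captured instead of stopping the script
msg <- tryCatch(
  {
    y <- - new("Demo", label = "example")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

This also explains why all the progress messages are printed before the error: the object is fully constructed, and only then does the negation fail.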

written 27 days ago by James W. MacDonald

And for completeness

> genesList <- structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06, 0.00212637918363236, 0.0051877836422765), .Names = c("677818", "107985744", "102725022", "105370787", "260294"))
> genesList
      677818    107985744    102725022    105370787       260294
3.234676e-04 1.478281e-05 9.095361e-06 2.126379e-03 5.187784e-03
> sampleGOdata <- new("topGOdata",
+                     description = "Simple session", ontology = "BP",
+                     allGenes = genesList, geneSel = function(x){x<0.01},
+                     nodeSize = 10,
+                     annot = annFUN.org, mapping = "org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
Loading required package: org.Hs.eg.db

    ( 2 GO terms found. )

Build GO DAG topology ..........
    ( 34 GO terms and 51 relations. )

Annotating nodes ...............
    ( 1 genes annotated to the GO terms. )
>
written 27 days ago by James W. MacDonald

Haha, super lame mistake, and I thought I double-checked the code before writing here... Thanks a lot. However, I think the second error still remains, I mean when I reduce the number of genes even more:

> dput(genesList)
structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06, 
0.00212637918363236), .Names = c("677818", "107985744", "102725022", 
"105370787"))
> sampleGOdata <- new("topGOdata",
+                     description = "Simple session", ontology = "BP",
+                     allGenes = genesList, geneSel = function(x){x<0.01},
+                     nodeSize = 10,
+                     annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
	( 0 GO terms found. )

Build GO DAG topology ..........
	( 0 GO terms and 0 relations. )
Error in if (is.na(index) || index < 0 || index > length(nd)) stop("vertex is not in graph: ",  : 
  missing value where TRUE/FALSE needed
modified 27 days ago • written 27 days ago by fshodan

Right. If you reduce the number of genes to a ridiculously small number, you don't get any significant results. There should probably be some error checking prior to that, but in common usage (and this package has been in BioC for almost 12 years, so has been pretty extensively tested) this probably isn't something that occurs, like ever.

written 27 days ago by James W. MacDonald

You mean it's super rare that people get zero hits and therefore the error never popped up before?

written 27 days ago by fshodan

No, I mean it's super rare that people would try to do a GO test with four genes, to begin with, and further to select four genes that don't have any GO terms.

> z <- structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06,
0.00212637918363236), .Names = c("677818", "107985744", "102725022",
"105370787"))
> z
      677818    107985744    102725022    105370787
3.234676e-04 1.478281e-05 9.095361e-06 2.126379e-03

> select(org.Hs.eg.db, names(z), "GOALL")
'select()' returned 1:1 mapping between keys and columns
   ENTREZID GOALL EVIDENCEALL ONTOLOGYALL
1    677818  <NA>        <NA>          NA
2 107985744  <NA>        <NA>          NA
3 102725022  <NA>        <NA>          NA
4 105370787  <NA>        <NA>          NA

So if you select four genes that have no associated GO terms, and you try to do a GO hypergeometric test, is it surprising that it fails? Like I said, there could be some error checking in topGO that looks to make sure that at least one gene has a GO term, but this is a really particular edge case that you won't run into like 99.9999999% of the time, because who would do a GO hypergeometric test with just four genes? That makes no sense.
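
If you do want to guard against this in a fixed pipeline, one option is to count how many genes have any annotation in the chosen ontology before calling `new("topGOdata", ...)`. This is only a rough sketch: `count_annotated` is a hypothetical helper, not part of topGO, and the data frame is a hand-built stand-in for what `select(org.Hs.eg.db, names(z), "GO")` would return for the four genes above, assuming they have no direct GO terms either (which follows from GOALL being empty).

```r
# Hypothetical helper (not part of topGO): number of genes with at least one
# GO term in the given ontology, computed from a select()-style data frame
count_annotated <- function(go_map, ontology = "BP") {
  keep <- which(!is.na(go_map$GO) & go_map$ONTOLOGY == ontology)
  length(unique(go_map$ENTREZID[keep]))
}

# Hand-built stand-in for the output of
#   select(org.Hs.eg.db, names(z), "GO")
# for the four genes above, none of which has an annotation
go_map <- data.frame(
  ENTREZID = c("677818", "107985744", "102725022", "105370787"),
  GO       = NA_character_,
  EVIDENCE = NA_character_,
  ONTOLOGY = NA_character_,
  stringsAsFactors = FALSE
)

n_annotated <- count_annotated(go_map)
if (n_annotated == 0) {
  message("No genes with BP annotation; skipping the topGO step")
}
```

In a real session you would replace the stand-in with the actual `select()` result and only build the topGOdata object when the count is above zero.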

written 27 days ago by James W. MacDonald

I just thought it should be less rare, because there are many ontologies, and if a user imports something less mainstream than GO, I can imagine this situation happening with more than four genes. Thanks for the help again!

written 27 days ago by fshodan