topGO "invalid argument to unary operator"
fshodan • 0
@fshodan-14255
> genesList <- degs$P.Value
> names(genesList) <- degs$ENTREZID
> sampleGOdata <- - new("topGOdata",
+                       description = "Simple session", ontology = "BP",
+                       allGenes = genesList, geneSel = function(x){x<0.01},
+                       nodeSize = 10,
+                       annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
	( 462 GO terms found. )

Build GO DAG topology ..........
	( 2076 GO terms and 4544 relations. )

Annotating nodes ...............
	( 66 genes annotated to the GO terms. )
Error in -new("topGOdata", description = "Simple session", ontology = "BP",  : 
  invalid argument to unary operator

I'm not sure exactly what is wrong. It could be the older version of the package, namely 1.0.2, while R is 3.4.4. I don't want to update, though, because the pipeline is fixed. Any help is much appreciated. I tried to debug it, but the error is a bit obscure. The session information is below:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=uk_UA.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=uk_UA.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=uk_UA.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=uk_UA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] org.Hs.eg.db_3.5.0   topGO_2.30.1         SparseM_1.77         GO.db_3.5.0          AnnotationDbi_1.40.0
 [6] IRanges_2.12.0       S4Vectors_0.16.0     Biobase_2.38.0       graph_1.56.0         BiocGenerics_0.24.0 
[11] GOplot_1.0.2         RColorBrewer_1.1-2   gridExtra_2.3        ggdendro_0.1-20      ggplot2_3.1.0       
[16] enrichR_1.0         

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         pillar_1.3.0       compiler_3.4.4     plyr_1.8.4         bindr_0.1.1       
 [6] tools_3.4.4        digest_0.6.18      bit_1.1-14         lattice_0.20-35    memoise_1.1.0     
[11] RSQLite_2.1.1      tibble_1.4.2       gtable_0.2.0       pkgconfig_2.0.2    rlang_0.3.0.1     
[16] DBI_1.0.0          rstudioapi_0.8     curl_3.2           bindrcpp_0.2.2     withr_2.1.2       
[21] dplyr_0.7.8        httr_1.3.1         bit64_0.9-7        grid_3.4.4         tidyselect_0.2.5  
[26] glue_1.3.0         R6_2.3.0           blob_1.1.1         purrr_0.2.5        magrittr_1.5      
[31] matrixStats_0.54.0 scales_1.0.0       MASS_7.3-50        assertthat_0.2.0   colorspace_1.3-2  
[36] lazyeval_0.2.1     munsell_0.5.0      crayon_1.3.4       rjson_0.2.20      
 
 

You will most likely need to give more information than that. Unless someone (probably Adrian Alexa) can reproduce the error you are seeing, it's not usually possible to debug. If your genesList isn't too massive, you could probably just paste the output from dput. More likely, though, you will need to save it as an .Rdata file and put it somewhere (say DropBox) with a link so people can try to reproduce the error.

> genesList <- degs$P.Value[1:5]
> names(genesList) <- degs$ENTREZID[1:5]
> genesList
      677818    107985744    102725022    105370787       260294 
3.234676e-04 1.478281e-05 9.095361e-06 2.126379e-03 5.187784e-03 
> sampleGOdata <- - new("topGOdata",
+                       description = "Simple session", ontology = "BP",
+                       allGenes = genesList, geneSel = function(x){x<0.01},
+                       nodeSize = 10,
+                       annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
	( 1 GO terms found. )

Build GO DAG topology ..........
	( 34 GO terms and 52 relations. )

Annotating nodes ...............
	( 1 genes annotated to the GO terms. )
Error in -new("topGOdata", description = "Simple session", ontology = "BP",  : 
  invalid argument to unary operator

> dput(genesList)
structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06, 0.00212637918363236, 0.0051877836422765), .Names = c("677818", "107985744", "102725022", "105370787", "260294"))

I reduced the data file. Will this do for debugging? One additional question: when I reduce the file even more (so that, obviously, it cannot find any terms), it also gives an error:

Building most specific GOs .....
	( 0 GO terms found. )

Build GO DAG topology ..........
	( 0 GO terms and 0 relations. )
Error in if (is.na(index) || index < 0 || index > length(nd)) stop("vertex is not in graph: ",  : 
  missing value where TRUE/FALSE needed
@james-w-macdonald-5106

Ugh. I should have looked closer. Note the typo in the following line:

sampleGOdata <- - new("topGOdata",

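The assignment operator in R is <- not <- -. The stray - is parsed as a unary minus applied to the object that new() returns, and negation is not defined for a topGOdata object, hence "invalid argument to unary operator".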
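To illustrate, here is a minimal sketch that does not use topGO at all. The class Demo is a made-up stand-in for topGOdata; the only assumption is that, like topGOdata, it has no method for the minus operator. With a number, <- - silently assigns the negated value; with such an S4 object, R raises an error like the one in the question.

```r
library(methods)

# Hypothetical stand-in for an S4 class such as topGOdata
# (this class is an assumption made for the illustration).
setClass("Demo", representation(x = "numeric"))

# With a number, "<- -" is legal: it assigns the negated value.
num <- - 5

# With an S4 object that has no arithmetic methods, the unary minus fails.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  {
    obj <- - new("Demo", x = 1)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Without the stray minus, the assignment works as intended.
obj <- new("Demo", x = 1)

print(num)
print(msg)
```

Note that new() runs to completion before the minus is applied, which is why all the "Building most specific GOs" progress messages appear before the error.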


And for completeness

> genesList <- structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06, 0.00212637918363236, 0.0051877836422765), .Names = c("677818", "107985744", "102725022", "105370787", "260294"))
> genesList
      677818    107985744    102725022    105370787       260294
3.234676e-04 1.478281e-05 9.095361e-06 2.126379e-03 5.187784e-03
> sampleGOdata <- new("topGOdata", description = "Simple session", ontology = "BP",allGenes = genesList, geneSel = function(x){x<0.01}, nodeSize = 10, annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
Loading required package: org.Hs.eg.db

    ( 2 GO terms found. )

Build GO DAG topology ..........
    ( 34 GO terms and 51 relations. )

Annotating nodes ...............
    ( 1 genes annotated to the GO terms. )
>

Haha, super lame mistake, and I thought I had double-checked the code before posting here... Thanks a lot. However, I think the second problem still remains, that is, when I reduce the number of genes even more:

> dput(genesList)
structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06, 
0.00212637918363236), .Names = c("677818", "107985744", "102725022", 
"105370787"))
> sampleGOdata <- new("topGOdata",
+                     description = "Simple session", ontology = "BP",
+                     allGenes = genesList, geneSel = function(x){x<0.01},
+                     nodeSize = 10,
+                     annot = annFUN.org,mapping="org.Hs.eg.db", ID = "entrez")

Building most specific GOs .....
	( 0 GO terms found. )

Build GO DAG topology ..........
	( 0 GO terms and 0 relations. )
Error in if (is.na(index) || index < 0 || index > length(nd)) stop("vertex is not in graph: ",  : 
  missing value where TRUE/FALSE needed

Right. If you reduce the number of genes to a ridiculously small number, you don't get any significant results. There should probably be some error checking before that point, but in common usage (and this package has been in BioC for almost 12 years, so it has been pretty extensively tested) this probably isn't something that ever occurs.
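As a side note on what that message means in general: "missing value where TRUE/FALSE needed" is what R reports when the condition of an if() evaluates to NA. The sketch below only reproduces that generic behaviour; the variable cond is made up, and the thread does not show what value index actually has inside the failing call.

```r
# A condition that is NA rather than TRUE or FALSE
# (hypothetical value, chosen only to trigger the error).
cond <- NA

# if() cannot branch on NA, so it raises an error; capture its message.
msg <- tryCatch(
  {
    if (cond) "yes" else "no"
  },
  error = function(e) conditionMessage(e)
)

# One common guard: isTRUE() returns FALSE for NA, so the branch is well defined.
safe <- if (isTRUE(cond)) "yes" else "no"

print(msg)
print(safe)
```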


You mean it's super rare that people get zero hits and therefore the error never popped up before?


No, I mean it's super rare that people would try to do a GO test with four genes, to begin with, and further to select four genes that don't have any GO terms.

> z <- structure(c(0.000323467569697372, 1.47828072650949e-05, 9.09536111428312e-06,
0.00212637918363236), .Names = c("677818", "107985744", "102725022",
"105370787"))
> z
      677818    107985744    102725022    105370787
3.234676e-04 1.478281e-05 9.095361e-06 2.126379e-03

> select(org.Hs.eg.db, names(z), "GOALL")
'select()' returned 1:1 mapping between keys and columns
   ENTREZID GOALL EVIDENCEALL ONTOLOGYALL
1    677818  <NA>        <NA>          NA
2 107985744  <NA>        <NA>          NA
3 102725022  <NA>        <NA>          NA
4 105370787  <NA>        <NA>          NA

So if you select four genes that have no associated GO terms, and you try to do a GO hypergeometric test, is it surprising that it fails? Like I said, there could be some error checking in topGO that makes sure that at least one gene has a GO term, but this is a really particular edge case that you won't run into 99.9999999% of the time, because who would do a GO hypergeometric test with just four genes? That makes no sense.
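If you want such a check in your own pipeline, something along these lines could run before building the topGOdata object. This is only a sketch under stated assumptions: count_annotated_genes is a hypothetical helper (it is not part of topGO or AnnotationDbi), and the data frame is typed in by hand to mirror the select() output above so that the example runs without org.Hs.eg.db.

```r
# Hypothetical helper: count the genes that have at least one GO term
# in a mapping table shaped like the output of select(..., "GOALL").
count_annotated_genes <- function(mapping, id_col = "ENTREZID", go_col = "GOALL") {
  has_go <- !is.na(mapping[[go_col]])
  length(unique(mapping[[id_col]][has_go]))
}

# Hand-built copy of the mapping shown above (all four genes lack GO terms).
# In a real session this would come from something like:
#   mapping <- select(org.Hs.eg.db, keys = names(z),
#                     columns = "GOALL", keytype = "ENTREZID")
mapping <- data.frame(
  ENTREZID = c("677818", "107985744", "102725022", "105370787"),
  GOALL = NA_character_,
  stringsAsFactors = FALSE
)

n_annotated <- count_annotated_genes(mapping)
if (n_annotated == 0) {
  message("No gene has a GO term; skipping the enrichment test.")
}

# Made-up extra row (gene ID and term chosen only for illustration)
# to show the non-empty case.
mapping2 <- rbind(
  mapping,
  data.frame(ENTREZID = "geneA", GOALL = "GO:0008150", stringsAsFactors = FALSE)
)
n_annotated2 <- count_annotated_genes(mapping2)
```

Stopping or skipping when the count is zero gives a readable message instead of the error from deep inside the graph code.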


I just thought it would be less rare, because there are many ontologies, and if a user imports something less mainstream than GO, I can imagine this situation happening with more than 4 genes. Thanks again for the help!
