Errors when running the toy example and when setting toCompact=TRUE in the one.step.pigengene function of the Pigengene package
I ran into a problem while running the toy example from the PDF guide: I get the error message "Error in candlist[i, 1:2] : subscript out of bounds". I get the same error from the one.step.pigengene function when I set the parameter bnNum=10. I tried matching the bnlearn package version used in the PDF guide and also tried the latest version (4.9), but the error persists.

> library(bnlearn)
> learnt <-, bnPath=file.path(saveDir, "bn"),
+                    bnNum=10, ## In real applications, at least 100-1000.
+                    seed=1, verbose=1)
 with bnNum= 10 started at:
2023-10-30 22:35:29.907212
Error in candlist[i, 1:2] : subscript out of bounds
In addition: Warning message:
In discretize(, method = "interval", breaks = length(unique(Disease))) :
  at least one variable should be continuous

I also have a problem when setting toCompact=TRUE in the one.step.pigengene function. With the parameter set to FALSE, the function runs fine and generates the c5Trees. I was also able to run the compact.tree() function on its own.

> p1 <- one.step.pigengene(Data=d1,saveDir='pigengene', bnNum=0, verbose=1,
+                          seed=1, Labels=Labels, toCompact=TRUE, doHeat=FALSE)
Pigengene started analizing 366 samples using 1000 genes...
[1] "dataNum==1"
Pigengene plots in:
Making decision trees...
minPerLeaf: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37
AML   0   1
MDS   1   0
toCompact: TRUE
Error in get.used.features(c5Tree) : 
  The class of c5Tree argument must be 'C5.0' !

R session info:
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Pigengene_1.28.0    BiocStyle_2.30.0    graph_1.80.0        BiocGenerics_0.48.0

loaded via a namespace (and not attached):
  [1] splines_4.3.1                 later_1.3.1                   bitops_1.0-7                  ggplotify_0.1.2              
  [5] filelock_1.0.2                tibble_3.2.1                  polyclip_1.10-6               preprocessCore_1.64.0        
  [9] rpart_4.1.21                  lifecycle_1.0.3               fastcluster_1.2.3             doParallel_1.0.17            
 [13] lattice_0.21-9                MASS_7.3-60                   backports_1.4.1               magrittr_2.0.3               
 [17] Hmisc_5.1-1                   openxlsx_4.2.5.2              rmarkdown_2.25                yaml_2.3.7                   
 [21] C50_0.1.8                     httpuv_1.6.12                 zip_2.3.0                     cowplot_1.1.1                
 [25] DBI_1.1.3                     RColorBrewer_1.1-3            zlibbioc_1.48.0               purrr_1.0.2                  
 [29] ggraph_2.1.0                  RCurl_1.98-1.12               nnet_7.3-19                   yulab.utils_0.1.0            
 [33] tweenr_2.0.2                  rappdirs_0.3.3                GenomeInfoDbData_1.2.11       IRanges_2.36.0               
 [37] S4Vectors_0.40.1              enrichplot_1.22.0             ggrepel_0.9.4                 tidytree_0.4.5               
 [41] gdata_3.0.0                   reactome.db_1.86.0            pheatmap_1.0.12               codetools_0.2-19             
 [45] DOSE_3.28.0                   ggforce_0.4.1                 tidyselect_1.2.0              aplot_0.2.2                  
 [49] farver_2.1.1                  viridis_0.6.4                 base64enc_0.1-3               dynamicTreeCut_1.63-1        
 [53] matrixStats_1.0.0             stats4_4.3.1                  BiocFileCache_2.10.1          jsonlite_1.8.7               
 [57] ellipsis_0.3.2                tidygraph_1.2.3               Formula_1.2-5                 iterators_1.0.14             
 [61] survival_3.5-7                foreach_1.5.2                 tools_4.3.1                   treeio_1.26.0                
 [65] HPO.db_0.99.2                 Rcpp_1.0.11                   glue_1.6.2                    gridExtra_2.3                
 [69] xfun_0.40                     qvalue_2.34.0                 bnlearn_4.8.1                 GenomeInfoDb_1.38.0          
 [73] dplyr_1.1.3                   withr_2.5.2                   BiocManager_1.30.22           fastmap_1.1.1                
 [77] fansi_1.0.5                   digest_0.6.33                 R6_2.5.1                      mime_0.12                    
 [81] gridGraphics_0.5-1            colorspace_2.1-0              GO.db_3.18.0                  gtools_3.9.4                 
 [85] RSQLite_2.3.2                 inum_1.0-5                    utf8_1.2.4                    tidyr_1.3.0                  
 [89] generics_0.1.3                data.table_1.14.8             htmlwidgets_1.6.2             graphlayouts_1.0.1           
 [93] httr_1.4.7                    scatterpie_0.2.1              graphite_1.48.0               pkgconfig_2.0.3              
 [97] gtable_0.3.4                  blob_1.2.4                    impute_1.76.0                 XVector_0.42.0               
[101] clusterProfiler_4.10.0        shadowtext_0.1.2              htmltools_0.5.6.1             fgsea_1.28.0                 
[105] scales_1.2.1                  Biobase_2.62.0                png_0.1-8                     ggfun_0.1.3                  
[109] knitr_1.45                    rstudioapi_0.15.0             reshape2_1.4.4                checkmate_2.3.0              
[113] nlme_3.1-163                  curl_5.1.0                    cachem_1.0.8                  stringr_1.5.0                
[117] BiocVersion_3.18.0            parallel_4.3.1                HDO.db_0.99.1                 libcoin_1.0-10               
[121] foreign_0.8-85                AnnotationDbi_1.64.0          ReactomePA_1.46.0             pillar_1.9.0                 
[125] grid_4.3.1                    vctrs_0.6.4                   promises_1.2.1                dbplyr_2.4.0                 
[129] cluster_2.1.4                 xtable_1.8-4                  htmlTable_2.4.2               Rgraphviz_2.46.0             
[133] evaluate_0.22                 Cubist_0.4.2.1                mvtnorm_1.2-3                 cli_3.6.1                    
[137] compiler_4.3.1                rlang_1.1.1                   crayon_1.5.2                  plyr_1.8.9                   
[141] fs_1.6.3                      stringi_1.7.12                WGCNA_1.72-1                  viridisLite_0.4.2            
[145] BiocParallel_1.36.0           MPO.db_0.99.7                 munsell_0.5.0                 Biostrings_2.70.1            
[149] lazyeval_0.2.2                GOSemSim_2.28.0               Matrix_1.6-1.1                patchwork_1.1.3              
[153] bit64_4.0.5                   ggplot2_3.4.4                 KEGGREST_1.42.0               shiny_1.7.5.1                
[157] interactiveDisplayBase_1.40.0 AnnotationHub_3.10.0          partykit_1.2-20               igraph_1.5.1                 
[161] memoise_2.0.1                 ggtree_3.10.0                 fastmatch_1.1-4               bit_4.0.5                    
[165] ape_5.7-1                     gson_0.1.0

Thank you for your time and help!

If you want to compact a decision tree but do not know which one, set toCompact=NULL in versions earlier than 1.29.6. In newer versions, toCompact=TRUE has the same meaning as toCompact=NULL, whereas in older versions toCompact=TRUE is inappropriate and leads to the error above. Quoting the documentation of version 1.29.6:

"toCompact: An integer. The tree with this minPerLeaf value will be compacted (shrunk). Compacting in this context means reducing the number of required genes for the calculation of the relevant eigengenes and making the predictions using the tree. If TRUE or NULL (default), the (persumably) most general proper tree (corresponding to the largest value in the minPerLeaf vector for which a tree could be constructed) is compacted. Set to FALSE to turn off compacting."
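The version rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the helper name pick.to.compact is hypothetical and not part of Pigengene, and the 1.29.6 threshold is the one mentioned in this reply.

```r
# Hypothetical helper (not part of Pigengene): return a toCompact value that
# asks for the most general proper tree to be compacted.
pick.to.compact <- function(version) {
  v <- package_version(version)
  if (v < package_version("1.29.6")) {
    NULL   # older versions: TRUE is not accepted, so use NULL
  } else {
    TRUE   # 1.29.6 and later: TRUE means the same as NULL
  }
}

# Version strings taken from this thread.
print(is.null(pick.to.compact("1.28.0")))
print(isTRUE(pick.to.compact("1.29.10")))
```

With Pigengene installed, the value could be computed as pick.to.compact(as.character(packageVersion("Pigengene"))) and passed as the toCompact argument of one.step.pigengene.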

I could not reproduce the error when I copied the preceding commands from the vignette PDF into R. Are you sure you get the right pigengene object? For example, do you get exactly the same output as below?

> print(pigengene$eigengenes[1:3,1:4])

                   ME1          ME2          ME3          ME0
GSM376049 -0.006958185 -0.002890136 -0.006020223  0.004971246
GSM376050 -0.003067442 -0.017728108  0.017231152 -0.001214384
GSM376051 -0.004466249  0.008799329  0.003679651 -0.002341427 

If so, can you run the command with higher detail, e.g., with verbose=4?

Thank you, Dr. Zare, for getting back to my inquiries.

Below is my full output; I also ran the command with higher detail (verbose=4). The pigengene object should be the same as the one you showed above. I wonder whether the problem is related to paths, since I am running this on a Windows machine?

[Screenshots: Part 1 of demo, Part 2 of demo, Part 3 of demo]

Your guess is right. We have not tested on Windows. Are you running it on a virtual Ubuntu on Windows? The path shown in the line above the error message is weird. The following can help me diagnose the problem, as I do not have access to a Windows machine:


print(file.path(saveDir, "test"))
print(normalizePath(file.path(saveDir, "test")))
print(file.path(normalizePath(saveDir), "test"))

print(Pigengene:::combinedPath(saveDir, "test"))

Also, can you rerun with verbose=6?

Thank you, Dr. Zare. I have WSL2 (Windows Subsystem for Linux) installed. I ran the commands you mentioned; screenshots of the outputs are below.
[Screenshots: part 1, part 2]

Interestingly, I also tried changing the working directory in which the demo code was executed, and this time the function worked. I am not sure what changed, because previously this step failed even with a different working directory. The only difference I noticed in my session info is that the bnlearn package is now listed under "other attached packages" instead of "loaded via a namespace (and not attached)" as above. I am not sure whether this makes a difference; if it does, how do I ensure that the bnlearn package is attached when executing the code? Thank you for your time and help.
[Screenshots: working part 1, working part 2, working part 3, working part 4, working part 5, working part 6, working - print save dir, working part 7, working part 8]

The output of print(file.path(saveDir, "test")) is so unexpected that I do not know how to proceed with this mix of Windows and Linux. My suggestion is to use a real Linux system or, if one is not available, a pure Windows system. Can you do this? If not, I am still willing to help, but that would require much more work; e.g., we can start by rerunning the example with verbose=10 using Pigengene 1.29.10 or later.

The bnlearn package moved from "loaded via a namespace (and not attached)" to "other attached packages" probably because you called library(bnlearn) somewhere in your newer R session, but not in the other one.
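The difference between a namespace that is only loaded and a package that is attached can be checked with base R. The sketch below uses the base package tools as a stand-in so that it runs without bnlearn; the helper names is.attached and is.loaded.ns are hypothetical.

```r
# Hypothetical helpers for illustration; they use only base R.
# A package is attached if "package:<name>" appears on the search path.
is.attached <- function(pkg) paste0("package:", pkg) %in% search()
# A package is loaded if its namespace is registered, attached or not.
is.loaded.ns <- function(pkg) pkg %in% loadedNamespaces()

# Load the namespace of the base package 'tools' without attaching it.
invisible(requireNamespace("tools", quietly = TRUE))
print(is.loaded.ns("tools"))
print(is.attached("tools"))
```

Calling library(bnlearn) at the top of a script attaches the package, after which is.attached("bnlearn") should return TRUE.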

I think your second attempt worked because you changed your working directory to a normal Windows path (i.e., C:\\Users\\...) rather than the weird \\\\wsl.localhost\\Ubuntu... combo path. Also, there might have been some permission issues in that path.
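A quick way to spot this kind of path before running the analysis is sketched below. It is only an illustration: the helper name is.wsl.unc.path is hypothetical, the example paths are made up, and the check assumes that WSL shares appear under the \\wsl.localhost\ or \\wsl$\ prefixes.

```r
# Hypothetical helper: TRUE if 'path' looks like a WSL network (UNC) path.
is.wsl.unc.path <- function(path) {
  p <- chartr("\\", "/", path)   # treat backslashes and forward slashes alike
  grepl("^//wsl(\\.localhost|\\$)/", p, ignore.case = TRUE)
}

# Made-up example paths, written as R string literals.
print(is.wsl.unc.path("\\\\wsl.localhost\\Ubuntu\\home\\user"))
print(is.wsl.unc.path("C:\\Users\\user\\pigengene"))
```

Running is.wsl.unc.path(getwd()) and is.wsl.unc.path(saveDir) before the analysis would flag a working or output directory that sits on the WSL share.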

Thanks, Dr. Zare. I will run Pigengene 1.29.10 or later in either a pure Linux or a pure Windows environment.


