Question

coseq plotting error: no method for coercing this S4 class to a vector

0

Entering edit mode

ajwijeratne1 • 0

@ajwijeratne1-10100

Last seen 3.1 years ago

United States

I'm using Coseq to cluster my RNAseq data processed with DESeq. There is a similar error mentioned in coseq plotting error: no method for coercing this S4 class to a vector but I tried to start to finish without loading .Rdata. I am getting an error in the plot function and can't seem to find a fix for it.

set.seed(1)

DE_CUF_runLogCLR <- coseq(DEseq_DE_CUF_normalized, K=2:25, transformation="logclr",norm="none", 
                   meanFilterCutoff=10, model="kmeans",
                   nstart=1, iter.max=15)

###
****************************************
coseq analysis: kmeans approach & logclr transformation
K = 2 to 25 
Use set.seed() prior to running coseq for reproducible results.
****************************************
Running K = 2 ...
Running K = 3 ...
Running K = 4 ...
Running K = 5 ...
Running K = 6 ...
Running K = 7 ...
Running K = 8 ...
Running K = 9 ...
Running K = 10 ...
Running K = 11 ...
Running K = 12 ...
Running K = 13 ...
Running K = 14 ...
Running K = 15 ...
Running K = 16 ...
Running K = 17 ...
Running K = 18 ...
Running K = 19 ...
Running K = 20 ...
Running K = 21 ...
Running K = 22 ...
Running K = 23 ...
Running K = 24 ...
Running K = 25 ...


summary(DE_CUF_runLogCLR)

*************************************************
K-means algorithm
Transformation: logclr
*************************************************
Clusters fit: 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25
Clusters with errors: 4,6,7,8,9,10,12,13,14,15,16,17,18,19,20,21,22,23,24,25
Selected number of clusters via capushe: 10
*************************************************
Number of clusters = 10
*************************************************
Cluster sizes:
 Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5  Cluster 6  Cluster 7  Cluster 8  Cluster 9 Cluster 10 
       132       2200        107        802         28       2361        275         11         74         24

plot(DE_CUF_runLogCLR, graphs="boxplots")

Error in as.vector(x, mode = "numeric") : no method for coercing this S4 class to a vector

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] circlize_0.4.13             gridExtra_2.3               sva_3.35.2                  genefilter_1.68.0           mgcv_1.8-38                
 [6] nlme_3.1-155                DESeq2_1.26.0               ggvenn_0.1.9                GGally_2.1.2                ggrepel_0.9.1              
[11] plotly_4.10.0               corrplot_0.92               ComplexHeatmap_2.2.0        forcats_0.5.1               tidyverse_1.3.1            
[16] purrr_0.3.4                 ggplot2_3.3.5               stringr_1.4.0               readr_2.1.1                 tidyr_1.1.4                
[21] dplyr_1.0.7                 tibble_3.1.6                statmod_1.4.36              coseq_1.10.0                SummarizedExperiment_1.16.1
[26] DelayedArray_0.12.3         BiocParallel_1.20.1         matrixStats_0.61.0          Biobase_2.46.0              GenomicRanges_1.38.0       
[31] GenomeInfoDb_1.22.1         IRanges_2.20.2              S4Vectors_0.24.4            BiocGenerics_0.32.0         edgeR_3.28.1               
[36] limma_3.42.2               

loaded via a namespace (and not attached):
  [1] readxl_1.3.1           backports_1.4.1        Hmisc_4.6-0            plyr_1.8.6             HTSCluster_2.0.8       lazyeval_0.2.2        
  [7] splines_3.6.0          digest_0.6.29          htmltools_0.5.2        fansi_1.0.2            magrittr_2.0.2         checkmate_2.0.0       
 [13] memoise_2.0.1          cluster_2.1.2          tzdb_0.2.0             annotate_1.64.0        modelr_0.1.8           bayesm_3.1-4          
 [19] jpeg_0.1-9             colorspace_2.0-2       blob_1.2.2             rvest_1.0.2            haven_2.4.3            xfun_0.29             
 [25] crayon_1.4.2           RCurl_1.98-1.5         jsonlite_1.7.3         survival_3.2-13        glue_1.6.1             gtable_0.3.0          
 [31] zlibbioc_1.32.0        XVector_0.26.0         compositions_2.0-4     GetoptLong_1.0.5       shape_1.4.6            DEoptimR_1.0-10       
 [37] scales_1.1.1           DESeq_1.38.0           mvtnorm_1.1-3          DBI_1.1.2              Rcpp_1.0.8             plotrix_3.8-2         
 [43] viridisLite_0.4.0      xtable_1.8-4           htmlTable_2.4.0        clue_0.3-60            foreign_0.8-71         bit_4.0.4             
 [49] proxy_0.4-26           Formula_1.2-4          htmlwidgets_1.5.4      httr_1.4.2             RColorBrewer_1.1-2     ellipsis_0.3.2        
 [55] reshape_0.8.8          pkgconfig_2.0.3        XML_3.99-0.3           nnet_7.3-17            dbplyr_2.1.1           locfit_1.5-9.4        
 [61] utf8_1.2.2             tidyselect_1.1.1       rlang_0.4.12           AnnotationDbi_1.48.0   munsell_0.5.0          cellranger_1.1.0      
 [67] tools_3.6.0            cachem_1.0.6           cli_3.1.1              generics_0.1.1         RSQLite_2.2.9          broom_0.7.11          
 [73] evaluate_0.14          fastmap_1.1.0          yaml_2.2.2             knitr_1.37             bit64_4.0.5            fs_1.5.2              
 [79] robustbase_0.93-9      xml2_1.3.3             compiler_3.6.0         rstudioapi_0.13        png_0.1-7              e1071_1.7-9           
 [85] reprex_2.0.1           geneplotter_1.64.0     stringi_1.7.6          lattice_0.20-45        Matrix_1.4-0           tensorA_0.36.2        
 [91] vctrs_0.3.8            pillar_1.6.5           capushe_1.1.1          lifecycle_1.0.1        BiocManager_1.30.16    HTSFilter_1.26.0      
 [97] GlobalOptions_0.1.2    data.table_1.14.2      bitops_1.0-7           Rmixmod_2.1.6          R6_2.5.1               latticeExtra_0.6-29   
[103] MASS_7.3-55            assertthat_0.2.1       rjson_0.2.20           withr_2.4.3            GenomeInfoDbData_1.2.2 hms_1.1.1             
[109] rpart_4.1-15           class_7.3-20           rmarkdown_2.11         lubridate_1.8.0        base64enc_0.1-3

coseq • 1.7k views

ADD COMMENT • link updated 3.1 years ago by andrea.rau ▴ 80 • written 3.1 years ago by ajwijeratne1 • 0

0

Entering edit mode

Hi @ajwijeratne1, thanks for your question. I'll need a little bit more input before I can answer your question since I can't reproduce your error on my side.

First, I'm surprised to see so many clusters with errors -- as shown in the output of summary(DE_CUF_runLogCLR), very few clusters had no errors (it looks like only K = 2,3,5,11 completed without error). This point isn't very well documented in the vignette, but the models with different cluster numbers are reported as having an error if the ifault flag is raised in kmeans. This makes me wonder whether the input you provided to coseq (DEseq_DE_CUF_normalized) is well-adapted to using the logclr transformation.

Could you confirm what the analysis pipeline was leading up to the creation of DEseq_DE_CUF_normalized? What do the data look like? How many genes and samples? Based on the name, I am assuming you used DESeq2 to perform normalization and differential analyses -- are these normalized counts or log-normalized counts?

Also, I noticed that you're using an older version of coseq, the current version is 1.18.0. Do you have the possibility to update your version of R and use the latest Bioconductor version?

Best, AR

ADD REPLY • link 3.1 years ago andrea.rau ▴ 80

0

Entering edit mode

Thanks for your reply, Andrea! I have updated the package and the error for plotting has gone now - thank you very much! For the cluster number error, I have used EdgeR to normalized data (TMM). However, these are differentially expressed genes using DEseq2. I have 13 samples three treatments and two batches. I have used Combat_seq to remove batch effect prior to differential gene expression call. If it is easier, I can send you the dataset via email to take a look.

ADD REPLY • link 3.1 years ago ajwijeratne1 • 0

0

Entering edit mode

Great news for the plot!

Are you still getting a lot of clusters with errors even with the new version of the package? And if so, is the same thing happening with a different transformation (e.g., transformation = "clr"?) If so, it would be great if you could forward the dataset via email so that I can do some debugging and try to understand what is going on.

ADD REPLY • link 3.1 years ago andrea.rau ▴ 80

0

Entering edit mode

Yes, tried different clustering and also dropped one batch, but still the same with errors in clusters. I will send my code and dataset via email.

ADD REPLY • link 3.1 years ago ajwijeratne1 • 0

0

Entering edit mode

After an exchange by email, it seems that the K-means algorithm just needed to run for a larger number of iterations to converge (you used iter.max = 10, which I had used in the vignette simply to shorten run times in the illustration). By default, coseq caps the maximum number of iterations at iter.max = 50, and the following code runs for me with far fewer cluster errors:

DE_CUF_runLogCLR <- coseq(DEseq_DE_CUF_normalized, K=2:25, transformation="logclr",norm="none", 
                   meanFilterCutoff=10, model="kmeans")

ADD REPLY • link 3.1 years ago andrea.rau ▴ 80

score 0 · Answer 1 · 2022-03-02

The plotting error in the original question was resolved by updating coseq to a more recent version. The large number of clusters with errors indicated in the summary() appears to be due to the small value of iter.max=10 used for the K-means algorithm called by coseq (NB: this value was specified in the vignette code for a faster run time in the illustration, and should not generally be used in practice). Using the default value for iter.max (50) yields far fewer clusters with errors in this case.