I have successfully run WGCNA on an expression matrix, and WGCNA has also worked for other sets of expression data. I am now trying to run WGCNA on reconstructions of the same expression matrix from its first principal component (PC1), from PC1-2, and from PC1-3. However, I cannot get past the soft-threshold selection step (pickSoftThreshold) for the new matrices. Where am I going wrong?
My workflow is as follows (for clarity, I have also included the steps used to isolate the first component, in case I made a mistake there):
> library(WGCNA)
> allowWGCNAThreads()
Allowing multi-threading with up to 4 threads.
> options(stringsAsFactors = FALSE)
#load expression matrix HuSNexpr0 and convert it to a data frame and numeric
> head(HuSNexpr0[1:4,1:4])
   LOC100132062 LINC01128 LINC01342  TTLL10
V3      8.22364   6.02388   4.75798 5.89833
V4      8.98781   5.39059   4.96581 5.77364
V5      7.46032   5.21021   5.08326 5.95073
V6      8.17156   5.73582   4.46154 5.67360
> dim(HuSNexpr0)
[1]   100 20100
#Principal component analysis on the expression matrix
> prin_comp <- prcomp(HuSNexpr0, scale. = T)
#Isolate first principal component
> nCom=1
> First=prin_comp$x[,1:nCom] %*% t(prin_comp$rotation[,1:nCom])
> if(prin_comp$scale !=FALSE){First=scale(First, center=FALSE, scale=1/prin_comp$scale)}
Warning message:
In if (prin_comp$scale != FALSE) { :
  the condition has length > 1 and only the first element will be used
> if(prin_comp$center !=FALSE){First=scale(First, center=-1*prin_comp$center,scale=FALSE)}
Warning message:
In if (prin_comp$center != FALSE) { :
  the condition has length > 1 and only the first element will be used
> head(First[1:4,1:4])
     LOC100132062 LINC01128 LINC01342   TTLL10
[1,]     7.717886  5.767247  4.752331 5.719690
[2,]     7.702377  5.697647  4.805150 5.769796
[3,]     7.655638  5.487887  4.964335 5.920806
[4,]     7.721896  5.785243  4.738674 5.706734
#the following step may be unnecessary but I was just trying to figure the source of the error.
> row.names(First)=c(1:100)
> head(First[1:4,1:4])
  LOC100132062 LINC01128 LINC01342   TTLL10
1     7.717886  5.767247  4.752331 5.719690
2     7.702377  5.697647  4.805150 5.769796
3     7.655638  5.487887  4.964335 5.920806
4     7.721896  5.785243  4.738674 5.706734
> collectGarbage()
> class(First)
[1] "matrix"
> datExprs0=as.data.frame(First)
> gsg=goodSamplesGenes(datExprs0, verbose=3)
 Flagging genes and samples with too many missing values...
  ..step 1
> gsg$allOK
[1] TRUE
> sampleTree = hclust(dist(datExprs0), method = "average")
> clust = cutreeStatic(sampleTree, cutHeight = 50, minSize = 45)
> table(clust)
clust
 0  1
 3 97
> keepSamples = (clust==1)
> datExpr = datExprs0[keepSamples, ]
> dim(datExpr)
[1]    97 20100
> nGenes = ncol(datExpr)
> nSamples = nrow(datExpr)
> collectGarbage()
> powers=c(c(1:10),seq(from=12,to=20,by=2))
> sft=pickSoftThreshold(datExpr,powerVector=powers,verbose=5)
pickSoftThreshold: will use block size 2225.
 pickSoftThreshold: calculating connectivity for given powers...
   ..working on genes 1 through 2225 of 20100
   ..working on genes 2226 through 4450 of 20100
   ..working on genes 4451 through 6675 of 20100
   ..working on genes 6676 through 8900 of 20100
   ..working on genes 8901 through 11125 of 20100
   ..working on genes 11126 through 13350 of 20100
   ..working on genes 13351 through 15575 of 20100
   ..working on genes 15576 through 17800 of 20100
   ..working on genes 17801 through 20025 of 20100
   ..working on genes 20026 through 20100 of 20100
Error in summary(lm1)$coefficients[2, 1] : subscript out of bounds
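For reference, the reconstruction step above can be sketched on toy data as follows. This is only an illustration under stated assumptions: `toy` is a small random stand-in for HuSNexpr0, and `reconstruct()` is a hypothetical helper, not a WGCNA or base R function. It tests `prcomp`'s `scale` and `center` components with `is.numeric()` instead of `!= FALSE`, which should avoid the "condition has length > 1" warnings, and it prints the range of absolute gene-gene correlations in the rank-1 matrix, which may be worth inspecting before calling pickSoftThreshold.

```r
# Toy stand-in for the expression matrix (samples in rows, genes in columns);
# the real HuSNexpr0 is not reproduced here.
set.seed(1)
toy <- matrix(rnorm(20 * 6, mean = 6), nrow = 20,
              dimnames = list(NULL, paste0("gene", 1:6)))

pca <- prcomp(toy, scale. = TRUE)

# Hypothetical helper: rebuild the data from the first nCom components.
reconstruct <- function(pca, nCom) {
  rec <- pca$x[, 1:nCom, drop = FALSE] %*%
    t(pca$rotation[, 1:nCom, drop = FALSE])
  # prcomp stores FALSE when no scaling/centering was applied and a numeric
  # vector otherwise, so is.numeric() gives a length-one condition.
  if (is.numeric(pca$scale))  rec <- sweep(rec, 2, pca$scale,  `*`)
  if (is.numeric(pca$center)) rec <- sweep(rec, 2, pca$center, `+`)
  rec
}

first <- reconstruct(pca, 1)          # rank-1 reconstruction (PC1 only)
full  <- reconstruct(pca, ncol(toy))  # all components: recovers the input

# Sanity check: using every component should reproduce the original matrix.
max(abs(full - toy))

# In a rank-1 reconstruction every column is a linear function of the same
# score vector, so all pairwise correlations have absolute value 1.
range(abs(cor(first)))
```

With `nCom = 2` or `nCom = 3` the same helper gives the PC1-2 and PC1-3 matrices.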
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.13.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] WGCNA_1.51 fastcluster_1.1.22 dynamicTreeCut_1.63-1
loaded via a namespace (and not attached):
[1] splines_3.4.0 lattice_0.20-35 colorspace_1.3-2 htmltools_0.3.6
[5] stats4_3.4.0 base64enc_0.1-3 blob_1.1.0 survival_2.41-3
[9] foreign_0.8-68 DBI_0.7 BiocGenerics_0.24.0 bit64_0.9-7
[13] RColorBrewer_1.1-2 matrixStats_0.52.2 foreach_1.4.3 plyr_1.8.4
[17] stringr_1.2.0 munsell_0.4.3 gtable_0.2.0 htmlwidgets_0.8
[21] codetools_0.2-15 memoise_1.1.0 latticeExtra_0.6-28 Biobase_2.38.0
[25] knitr_1.15.1 IRanges_2.12.0 doParallel_1.0.10 parallel_3.4.0
[29] AnnotationDbi_1.40.0 htmlTable_1.9 preprocessCore_1.38.1 Rcpp_0.12.10
[33] acepack_1.4.1 scales_0.4.1 backports_1.1.2 checkmate_1.8.2
[37] S4Vectors_0.16.0 Hmisc_4.0-3 bit_1.1-12 gridExtra_2.2.1
[41] impute_1.50.1 ggplot2_2.2.1 digest_0.6.13 stringi_1.1.5
[45] grid_3.4.0 tools_3.4.0 magrittr_1.5 lazyeval_0.2.0
[49] tibble_1.3.0 RSQLite_2.0 Formula_1.2-1 cluster_2.0.6
[53] GO.db_3.4.1 pkgconfig_2.0.1 Matrix_1.2-10 data.table_1.10.4-3
[57] iterators_1.0.8 rpart_4.1-11 nnet_7.3-12 compiler_3.4.0
Thank you in advance!