I am trying to use the prediction method in the CMA package to predict the class (one of 4 classes) of each of 32 ‘unknown’ samples. I have no problems when I use the ‘multiclass’ scheme for gene selection, but I encounter the problem illustrated below when I use either the ‘pairwise’ or ‘one-vs-all’ scheme during gene selection. When the GeneSelection method is run on multiple classes with either the ‘pairwise’ or ‘one-vs-all’ scheme, the class genesel object returned will by definition contain more than one variable selection, because it contains a list of genes for each binary comparison. The ‘prediction method’ summarised in the 25.11.19 version of the CMA manual, and the documentation returned by ?prediction, suggest that class genesel objects created with ‘scheme=pairwise’ or ‘scheme=one-vs-all’ as an argument to the GeneSelection method should be acceptable as an argument to the prediction method, but the error message I obtain below suggests that this may not be the case.
library(CMA)
Objects created to hold the 4-class training gene expression data, the training set sample class information, and the test gene expression data:
class(trainpcrdata)
[1] "matrix"
class(fclasses)
[1] "factor"
class(testpcrdata)
[1] "matrix"
Create a class learningsets object with only 1 iteration:
TempLS261119<-GenerateLearningsets(y=fclasses, method = "MCCV", niter = 1, ntrain = floor(1*length(fclasses)))
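For reference, `ntrain = floor(1*length(fclasses))` with `niter = 1` asks for a single learning set whose training part is as large as the whole training data. The following is a minimal base-R sketch of that idea, assuming the usual meaning of Monte Carlo cross-validation (random subsampling without replacement). It is not CMA's own code, and `demo_y` is a made-up stand-in for `fclasses`:

```r
# Hypothetical stand-in for fclasses: 4 classes, 3 samples each (labels invented)
demo_y <- factor(rep(c("A", "B", "C", "D"), each = 3))

n <- length(demo_y)
ntrain <- floor(1 * n)  # same expression as in the call above

# One Monte Carlo draw without replacement; with ntrain == n this is
# simply a permutation of all sample indices
set.seed(1)
train_idx <- sample(seq_len(n), size = ntrain)

# TRUE when every sample ends up in the training part
all_used <- setequal(train_idx, seq_len(n))
cat("ntrain:", ntrain, "of", n, "samples; all samples used:", all_used, "\n")
```

In other words, under this assumption, gene selection and tuning here are done on all of the training samples at once.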
Create a class genesel object using the ‘1 iteration’ class learningsets object and the required GeneSelection method and required scheme for non-binary classification:
rfeGS261119p<-GeneSelection(X=trainpcrdata, y=fclasses, learningsets=TempLS261119, method="rfe", scheme="pairwise", trace=TRUE)
GeneSelection: iteration 1
GeneSelection: iteration 1
GeneSelection: iteration 1
GeneSelection: iteration 1
GeneSelection: iteration 1
GeneSelection: iteration 1
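The six ‘GeneSelection: iteration 1’ messages in the trace above are consistent with one gene selection per pair of classes, since 4 classes give choose(4, 2) = 6 pairwise comparisons. A small base-R sketch of that count follows. The factor `demo_classes` and its labels are invented for illustration, because the actual levels of `fclasses` are not shown here:

```r
# Hypothetical stand-in for fclasses: 4 classes, labels invented for illustration
demo_classes <- factor(rep(c("A", "B", "C", "D"), each = 3))

# All unordered pairs of class labels: one column per pairwise comparison
class_pairs <- combn(levels(demo_classes), 2)
n_pairwise <- ncol(class_pairs)

# A one-vs-all scheme instead needs one binary comparison per class
n_one_vs_all <- nlevels(demo_classes)

print(class_pairs)
cat("pairwise comparisons:", n_pairwise, "\n")
cat("one-vs-all comparisons:", n_one_vs_all, "\n")
```

So a ‘pairwise’ genesel object for this data would be expected to carry six separate selections, and a ‘one-vs-all’ object four, rather than the single selection produced by the ‘multiclass’ scheme.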
Create a class tuneres object for the required GeneSelection method, the required number of genes, and the required classification method:
compBrfe10Tp<-tune(X=trainpcrdata, y=fclasses, learningsets=TempLS261119, genesel=rfeGS261119p, nbgene=10, classifier=compBoostCMA)
Predict the class of the 32 ‘unknown’ samples in the matrix testpcrdata:
compBrfe10predp<-prediction(X.tr=trainpcrdata, y.tr=fclasses, X.new=testpcrdata, classifier=compBoostCMA, genesel=rfeGS261119p, nbgene=10, tuneres=compBrfe10Tp)
Error in prediction(X.tr = trainpcrdata, y.tr = fclasses, X.new = testpcrdata, :
  GeneSelection object contains more than one variable selection.
sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] CMA_1.42.0 Biobase_2.44.0 BiocGenerics_0.30.0
[4] e1071_1.7-2
loaded via a namespace (and not attached):
[1] compiler_3.6.0 class_7.3-15
