Question: problem with ropls library (OPLS-DA)
3.1 years ago, dominik.guggisberg wrote:

May I ask you a question concerning an error message that I don't understand?

We have a dataset (no missing values) similar to the one described in your manual (data(sacurine)).

When I run a PCA on my matrix (data3.pca <- opls(x)), everything looks very nice.

But when I try a PLS-DA, I only receive an error message:

data3.pls <- opls(x, genderFc)

Error: No model was built because the first predictive component was already not significant; Select a number of predictive components of 1 if you want the algorithm to compute a model despite this.

What could be the reason for this error? Is there any easy way to proceed?

Answer: problem with ropls library (OPLS-DA)
3.1 years ago, etienne.thevenot (France) wrote:

Dear Dominik,

By default, ropls automatically selects the optimal number of predictive (PLS) or orthogonal (OPLS) components. To do this, the algorithm checks whether adding a further component improves the predictions. Here the message indicates that even the first predictive component was not significant, suggesting that the algorithm cannot build a significant PLS model on your dataset. To check this, you can force the algorithm to compute the first components:


data3.pls <- opls(x, genderFc, predI = 2)

You should then observe on the diagnostic plot that the Q2Y value is not significant (i.e. when the response values are randomly permuted, the performance of the resulting models is equal to or greater than that of the true model, meaning that there is overfitting).
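
In case it helps, here is a minimal sketch of that check. It is not your data: the sacurine dataset stands in for x, and the gender labels are shuffled on purpose to mimic a response that carries no signal. The object names (shuffledFc, forced.pls, forced.summary), the seed, and the choice of permI = 100 are assumptions made for the illustration, and getSummaryDF is assumed to be available, as it is in recent versions of ropls.

```r
# Sketch only: `sacurine` stands in for your own data, and the shuffled
# labels are an artificial way to obtain a response with no real signal.
if (requireNamespace("ropls", quietly = TRUE)) {
  library(ropls)
  data(sacurine)

  x <- sacurine$dataMatrix
  genderFc <- sacurine$sampleMetadata[, "gender"]

  set.seed(123)
  shuffledFc <- sample(genderFc)  # break the link between x and the response

  # Force one predictive component and use more permutations than the default
  forced.pls <- opls(x, shuffledFc, predI = 1, permI = 100)

  # One-row data frame of model metrics; pR2Y and pQ2 are the
  # permutation-based p-values for R2Y and Q2Y
  forced.summary <- getSummaryDF(forced.pls)
  print(forced.summary[, c("R2Y(cum)", "Q2(cum)", "pR2Y", "pQ2")])
}
```

A large pQ2 (for instance above 0.05) means that models built on permuted responses reach a Q2Y as good as the one obtained with the true response, which is the overfitting situation described above.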

The reason is that the meaningful information in your dataset, if any, is too scarce to allow a model to be built (this can happen when the number of samples is too low compared with the number of variables). Did you check the number of significant features by univariate testing followed by multiple testing correction? You can try adding a feature selection step on your training dataset before building the model (we have developed the biosigner package on Bioconductor to perform feature selection).
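
The univariate check mentioned above can be sketched in base R as follows. The data are simulated here (x and genderFc are hypothetical stand-ins for your matrix and factor, with an artificial shift added to five variables), and the Welch t-test with Benjamini-Hochberg correction is only one reasonable choice among several; a non-parametric test may suit your data better.

```r
# Simulated stand-ins for the real data: `x` is a samples-by-variables
# matrix and `genderFc` a two-level factor; both are hypothetical here.
set.seed(42)
n.samples <- 40
n.vars <- 200
x <- matrix(rnorm(n.samples * n.vars), nrow = n.samples,
            dimnames = list(NULL, paste0("v", seq_len(n.vars))))
genderFc <- factor(rep(c("F", "M"), each = n.samples / 2))

# Shift the first five variables in one group so that some signal exists
x[genderFc == "M", 1:5] <- x[genderFc == "M", 1:5] + 3

# Welch t-test for each variable (column) between the two groups
pvalVn <- apply(x, 2, function(v) t.test(v ~ genderFc)$p.value)

# Benjamini-Hochberg correction for multiple testing
padjVn <- p.adjust(pvalVn, method = "BH")

# Names of the variables significant at a 5% false discovery rate
sigVc <- names(padjVn)[padjVn < 0.05]
length(sigVc)
```

If very few (or no) variables survive the correction on your own data, it is not surprising that the multivariate model is not significant either.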

Best wishes,

Etienne.

Note: The algorithms in ropls can cope with a (moderate) amount of missing values.
