I would like to use the SigCheck package to check the effect of a gene signature on the phenotypic data of a dataset of mine. However, I am not sure I understand how to indicate which genes in my signature are over-expressed and which are under-expressed. Could you please elaborate a little on this topic? I apologize in advance if this is already specified in the manual.
You can control how the gene signature is related to the expression values using the scoreMethod parameter when calling sigCheck().
By default this is set to scoreMethod="PCA", which scores each sample by its value on the first principal component computed from the expression of the signature genes. You can also set scoreMethod="High", which takes the mean expression value across all the signature genes for each sample.
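To make the difference between the two built-in options concrete, here is a rough base-R sketch of the idea behind each score on a toy matrix. This is only an illustration of the concepts described above, not SigCheck's internal code; the matrix and the variable names (expr, highScore, pcaScore) are made up for the example.

```r
# Toy expression matrix (made-up data): rows = signature genes, columns = samples.
set.seed(1)
expr <- matrix(rnorm(5 * 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("sample", 1:8)))

# "High"-style score: mean expression of the signature genes for each sample.
highScore <- colMeans(expr)

# "PCA"-style score: each sample's coordinate on the first principal component.
# prcomp() expects observations (samples) in rows, hence the transpose.
pcaScore <- prcomp(t(expr))$x[, 1]
```

Note that the sign of a principal component is arbitrary, so a PCA-based score does not by itself tell you which direction corresponds to "signature high".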
If you want to derive a more complex score where you expect some of the genes in the signature to be upregulated and some downregulated, you can write your own function and specify it as the value for the scoreMethod parameter. This function should accept an ExpressionSet with rows corresponding to each feature in the signature and a column for each sample, and should return a vector of scores. For survival analysis purposes, the scores are used to divide the samples into patient groups using the threshold parameter.
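As a minimal sketch of such a function, the following scores each sample as the mean expression of the genes you expect to be up minus the mean expression of the genes you expect to be down. The gene identifiers and the names upMinusDown and signedScore are hypothetical, and the sketch assumes that the row names of the ExpressionSet passed to your function match the identifiers you list.

```r
# Hypothetical identifiers; replace with the feature names used in your signature.
upGenes   <- c("geneA", "geneB")
downGenes <- c("geneC", "geneD")

# Core logic on a plain matrix (rows = genes, columns = samples):
# mean of the "up" genes minus mean of the "down" genes, per sample.
upMinusDown <- function(mat, up, down) {
  # Keep only identifiers actually present in the matrix.
  up   <- intersect(up, rownames(mat))
  down <- intersect(down, rownames(mat))
  upMean   <- if (length(up) > 0) colMeans(mat[up, , drop = FALSE]) else 0
  downMean <- if (length(down) > 0) colMeans(mat[down, , drop = FALSE]) else 0
  upMean - downMean
}

# Wrapper in the form described above: accepts an ExpressionSet and
# returns a vector with one score per sample.
signedScore <- function(eset) {
  upMinusDown(Biobase::exprs(eset), upGenes, downGenes)
}
```

You would then pass scoreMethod=signedScore in your call to sigCheck(). Samples with a high score show the expected pattern (up genes high, down genes low), and the threshold parameter splits them into groups as usual.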
This is discussed in Section 3.3 of the vignette (Scoring methods for dividing sample into survival groups) and in the manual page for the sigCheck() function.