I am working with microarray expression data. Could anyone advise me on how I can group my samples based on three particular gene sets?
Hi, sorry, but this isn't much of a question, as you already have the answer there: you would cluster the samples using just the gene sets you're interested in. That typically means subsetting the rows so that they contain only the genes that are members of those gene sets.
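To make the subsetting step concrete, here is a minimal, dependency-free sketch. The gene names, sample names, expression values and gene sets are all invented for illustration, and the small average-linkage routine is only a stand-in for whatever clustering method you actually use (for example `hclust` in R or `scipy.cluster.hierarchy` in Python).

```python
import math

# Toy expression matrix: gene -> one value per sample (all values invented).
samples = ["S1", "S2", "S3", "S4"]
expr = {
    "GENE_A": [8.0, 7.9, 2.1, 2.0],
    "GENE_B": [7.5, 7.7, 1.9, 2.2],
    "GENE_C": [1.0, 1.2, 6.8, 7.1],
    "GENE_D": [5.0, 5.1, 5.0, 4.9],  # in no gene set, so it is dropped below
}

# Hypothetical gene sets of interest.
gene_sets = {"set1": ["GENE_A", "GENE_B"], "set2": ["GENE_C"]}

# 1. Subset the rows to genes that belong to at least one gene set.
wanted = {g for genes in gene_sets.values() for g in genes}
subset = {g: vals for g, vals in expr.items() if g in wanted}

# 2. Build one expression profile per sample from the retained genes only.
profiles = {s: [subset[g][i] for g in sorted(subset)] for i, s in enumerate(samples)}

def avg_linkage(c1, c2):
    """Mean Euclidean distance over all sample pairs drawn from two clusters."""
    dists = [math.dist(profiles[a], profiles[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)

def cluster_samples(k):
    """Repeatedly merge the two closest clusters until only k remain."""
    clusters = [[s] for s in samples]
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: avg_linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

groups = cluster_samples(2)
print(groups)
```

Note that the clustering here never sees which gene set a gene came from; set membership is only used to decide which rows are kept.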
Hi Chris, thank you for your response. I think I could have phrased my question better. I was wondering if I could perform supervised clustering, where I combine all three gene sets (i.e., the rows contain genes from all three gene sets) and then cluster. My issue is how to ensure that the specified genes stay within their respective gene sets when I perform the clustering, since they are grouped not by expression but by molecular hallmarks (or developmental stage). I would like to see how the samples cluster with all three gene sets in the same data frame; I am just wondering whether this is statistically possible.
You could just aggregate the genes in each gene set by their mean in each sample, if that makes sense. Then the rows you cluster on would be gene sets instead of individual genes.
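As a rough sketch of that aggregation, assuming the expression data is held as a gene-to-values mapping: the function name `gene_set_scores`, the gene identifiers and all the numbers below are made up for illustration. Each gene set is collapsed to one row holding the per-sample mean of its member genes.

```python
from statistics import mean

# Toy data: gene -> one expression value per sample (all values invented).
samples = ["S1", "S2", "S3"]
expr = {
    "G1": [2.0, 4.0, 6.0],
    "G2": [4.0, 6.0, 8.0],
    "G3": [1.0, 1.0, 1.0],
    "G4": [3.0, 5.0, 1.0],
    "G5": [10.0, 0.0, 2.0],
}

# Hypothetical gene sets; "G_missing" stands for a gene absent from the array.
gene_sets = {
    "set_a": ["G1", "G2"],
    "set_b": ["G3", "G4"],
    "set_c": ["G5", "G_missing"],
}

def gene_set_scores(expr, gene_sets, n_samples):
    """Collapse each gene set to a single row: the per-sample mean of its genes."""
    scores = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in expr]  # ignore genes not measured
        if not present:
            continue  # nothing to average for this set
        scores[name] = [mean(expr[g][i] for g in present) for i in range(n_samples)]
    return scores

scores = gene_set_scores(expr, gene_sets, len(samples))
for name, row in scores.items():
    print(name, row)
```

The result has one row per gene set and one column per sample, so the samples can then be clustered on three values each. If the genes sit on very different expression scales, you may want to standardise each gene across samples before averaging so that a few highly expressed genes do not dominate the set mean.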