My name is Mahes Muniandy and I am a doctoral student working on twin data. I have previously used limma for gene expression and methylation data, and would now like to use it on my metabolomics data to detect within-twin-pair differences. I have two questions:
1) Since metabolite data are highly multicollinear, can I safely use my metabolite data as is, or should I reduce the data to principal components (PCA) and then use the component scores to represent my metabolites in my limma model?
Here is my design matrix:
design <- model.matrix(~Pair+Smoking+subcutaneousfat+bmi)
fit <- lmFit(metabolites, design)
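For context, the setup above can be sketched in a self-contained form roughly as follows. This is only an illustration on simulated data: the `targets` data frame, the number of pairs, and all values are assumptions and not the real study data. The variable names are taken from the design formula above.

```r
# Simulated example only: 6 twin pairs, 2 co-twins each (values are made up).
set.seed(1)
targets <- data.frame(
  Pair            = factor(rep(1:6, each = 2)),   # pair ID as a factor (blocking term)
  Smoking         = factor(c("no", "yes", "no", "no", "yes", "yes",
                             "no", "yes", "yes", "no", "no", "no")),
  subcutaneousfat = rnorm(12, mean = 20, sd = 5),
  bmi             = rnorm(12, mean = 25, sd = 3)
)

# Simulated data matrix: 10 metabolites (rows) x 12 samples (columns),
# with columns in the same order as the rows of 'targets'.
metabolites <- matrix(rnorm(10 * 12), nrow = 10,
                      dimnames = list(paste0("met", 1:10), NULL))

# Same formula as above, with the variables looked up in 'targets'.
design <- model.matrix(~ Pair + Smoking + subcutaneousfat + bmi, data = targets)

# Fit only if limma is installed; eBayes() and topTable() are the usual next steps.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(metabolites, design)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "subcutaneousfat"))
}
```

Here Pair enters as a factor, so each pair gets its own baseline and the subcutaneousfat coefficient is estimated from within-pair variation.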
2) In my paired-samples design, I am using a continuous variable (subcutaneousfat) which may or may not be discordant within the twin pairs. I think this is not a problem, but should my data be sorted so that the twin with the higher subcutaneousfat is always compared against the twin with the lower value, or does the order within a pair not matter to the model?
Finnish Institute of Molecular Medicine,