I am analysing an RRBS experiment where my conditions are a mixture of three compounds. These compounds sum to one, so A + B + C = 1, and my conditions are various percentages of A, B and C.
The statisticians who usually analyze clinical data with this kind of design have recommended that I use models from the mixexp package, specifically the linear and quadratic models.
This means I want to compare three models:
    model_null <- ~ 1
    model_linear <- ~ 0 + A + B + C
    model_quadratic <- ~ 0 + A + B + C + A:B + A:C + B:C
I suppose that to compare the quadratic model with the linear one, I can simply test whether the interaction terms are significant. Comparing the linear model with the null model, however, is not possible by testing whether A, B and C are significant, since the linear model has no intercept.
As I understand it, this model handles the covariance between the three compounds better than a regular ~A + B model would.
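For reference, here is a minimal sketch of these comparisons for a single site, using base R only. The design points, the simulated response and all object names (`design`, `fit_scheffe`, `fit_slack`, `fit_quad`) are made up for illustration and are not from my data. Because A + B + C = 1, the columns of ~0 + A + B + C span the same space as those of ~A + B, so the two fits should give identical fitted values, and the intercept-only model is nested in both.

```r
set.seed(1)

# Hypothetical design: 12 samples whose mixture proportions sum to one
design <- data.frame(
  A = c(1, 0, 0, 0.5, 0.5, 0, 1/3, 2/3, 1/6, 1/6, 1, 0),
  B = c(0, 1, 0, 0.5, 0, 0.5, 1/3, 1/6, 2/3, 1/6, 0, 1)
)
design$C <- 1 - design$A - design$B

# Simulated response for one site (illustrative values only)
design$y <- 0.2 * design$A + 0.5 * design$B + 0.8 * design$C +
  rnorm(nrow(design), sd = 0.05)

fit_null    <- lm(y ~ 1, data = design)
fit_scheffe <- lm(y ~ 0 + A + B + C, data = design)   # no-intercept linear mixture model
fit_slack   <- lm(y ~ A + B, data = design)           # intercept plus two components
fit_quad    <- lm(y ~ 0 + A + B + C + A:B + A:C + B:C, data = design)

# Same column space, so the fitted values agree up to numerical tolerance
print(all.equal(unname(fitted(fit_scheffe)), unname(fitted(fit_slack))))

# Null vs linear: overall F-test on 2 degrees of freedom
print(anova(fit_null, fit_slack))

# Linear vs quadratic: joint F-test of the three interaction terms
print(anova(fit_scheffe, fit_quad))
```

If this reasoning holds, the null-versus-linear comparison is just the joint test of the A and B coefficients in the ~A + B parameterisation, even though the individual coefficients in the two parameterisations mean different things.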
My goal is to select sites where the mixture affects the methylation level, and then afterwards work with the estimated coefficients, so I don't need to be able to test the significance of individual compounds. I really only want to detect if the site is affected at all.
Is there some way to select significant sites using such a model? Maybe by fitting a model of the form ~A + B, selecting the sites that are significant, and then estimating the coefficients using a ~0 + A + B + C model?
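To make the two-step idea concrete, here is a rough sketch on simulated data, again in base R. Everything in it is an assumption for illustration: the matrix `meth` stands in for some continuous per-site methylation measure (for example logit-transformed proportions), the first 20 sites are simulated to respond to the mixture, and ordinary `lm` fits are used where a real RRBS analysis would more likely use a method that models the read counts.

```r
set.seed(42)

# Hypothetical mixture design: 12 samples, proportions sum to one
A <- c(1, 0, 0, 0.5, 0.5, 0, 1/3, 2/3, 1/6, 1/6, 1, 0)
B <- c(0, 1, 0, 0.5, 0, 0.5, 1/3, 1/6, 2/3, 1/6, 0, 1)
C <- 1 - A - B

# Simulated site-by-sample matrix: 200 sites, the first 20 respond to the mixture
n_sites <- 200
meth <- matrix(rnorm(n_sites * length(A), sd = 0.3), nrow = n_sites)
effect <- 1.5 * A - 1.5 * B
meth[1:20, ] <- meth[1:20, ] +
  matrix(effect, nrow = 20, ncol = length(A), byrow = TRUE)

# Step 1: per-site overall F-test (intercept-only vs ~A + B), then BH correction
p_overall <- apply(meth, 1, function(y) {
  anova(lm(y ~ 1), lm(y ~ A + B))[2, "Pr(>F)"]
})
selected <- which(p.adjust(p_overall, method = "BH") < 0.05)

# Step 2: re-estimate the selected sites in the no-intercept parameterisation
coefs <- t(apply(meth[selected, , drop = FALSE], 1, function(y) {
  coef(lm(y ~ 0 + A + B + C))
}))

print(length(selected))
print(head(coefs))
```

Since both parameterisations describe the same fit, step 2 only re-expresses the coefficients and does not change which sites are selected.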
I realize that there might not be a single best model for all sites, but I was hoping to select one model that fits most sites and use that for all of them.