Dear Bioconductor community,
I have a question regarding the use of age as a covariate. As has been proposed multiple times, I tried to categorize the age covariate in order to account for it. However, because I have a rather small sample size (3 groups: n=8, n=5, n=6), it is hard to find the right cut points for the ages. I initially tried 4 categories, which ended up very unbalanced across the experimental conditions, so I then tried cutting with 3 breaks. You can find the resulting frequencies below:
As you can see there is always a pretty severe imbalance between the age categories and the experimental conditions.
So now I really do not know what to do. There are multiple options: use age as a categorical covariate (I still don't know how many breaks would be reasonable), use age as a continuous covariate (this is not suggested), don't account for age at all (might be OK, since we are investigating a late-onset disease and all individuals are over the critical age), or don't account for age and use SVA instead (I am not sure about that one: if I do that, I get a significant surrogate variable that correlates with age with a coefficient of -0.45...).
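To make the balance check concrete, here is a minimal sketch of the kind of cross-tabulation I mean. It is written in Python purely for illustration (in R the same idea would use `cut()` and `table()`). The ages are made-up placeholders, NOT my real data; only the group sizes (8, 5, 6) match my design. The group labels `A`/`B`/`C` and the helper names `tertile_cutpoints` and `age_bin` are hypothetical.

```python
from collections import Counter

# Hypothetical placeholder ages (NOT the real data).
# Only the group sizes (n=8, n=5, n=6) match the actual design.
ages = {
    "A": [62, 65, 68, 70, 71, 74, 77, 80],
    "B": [66, 69, 73, 78, 82],
    "C": [61, 64, 72, 75, 79, 83],
}

def tertile_cutpoints(values):
    """Return two cut points splitting the sorted values into three roughly equal bins."""
    s = sorted(values)
    n = len(s)
    return s[n // 3], s[(2 * n) // 3]

def age_bin(age, cuts):
    """Assign an age to bin 0, 1 or 2 given the two cut points."""
    low, high = cuts
    if age < low:
        return 0
    if age < high:
        return 1
    return 2

# Cut points are computed on the pooled ages, ignoring the condition.
all_ages = [a for group in ages.values() for a in group]
cuts = tertile_cutpoints(all_ages)

# Cross-tabulate: one row per condition, one column per age bin.
table = {g: Counter(age_bin(a, cuts) for a in vals) for g, vals in ages.items()}

for g in ages:
    counts = [table[g][b] for b in range(3)]
    print(g, counts)
```

With 19 samples spread over 3 conditions and 3 age bins, each cell holds only about two samples on average, so even a mild shift in the cut points changes how balanced the table looks.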
Below you can find the distribution of ages (or rather birth years) across the experimental conditions (y-axis).
I would really appreciate your help.
Thanks a lot