Question: Age covariate continuous vs. categorical
17 months ago, L_K wrote:

Dear Bioconductor community,

I have a question about using age as a covariate. As has been proposed multiple times, I tried to categorize the age covariate in order to account for it. However, since I have a rather small sample size (3 groups; n=8, n=5, n=6), it is pretty hard to find the right cut points for the ages. I initially tried 4 categories, which ended up being very unbalanced between the experimental conditions, so I then tried cutting with 3 breaks. You can find the resulting frequencies below:

3 breaks:

4 breaks:

As you can see, there is always a pretty severe imbalance of the age categories across the experimental conditions.
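A quick way to compare candidate cut points is to cross-tabulate age category against condition. The sketch below is only an illustration: the ages, the condition labels A/B/C, and the cut points are all made up (only the group sizes 8, 5, 6 come from the post), and `age_bin` and `crosstab` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical ages, for illustration only; group sizes follow the post (8, 5, 6).
samples = [
    ("A", 61), ("A", 64), ("A", 66), ("A", 70), ("A", 72), ("A", 75), ("A", 78), ("A", 81),
    ("B", 58), ("B", 63), ("B", 69), ("B", 74), ("B", 80),
    ("C", 60), ("C", 65), ("C", 67), ("C", 73), ("C", 77), ("C", 83),
]

def age_bin(age, breaks):
    """Return the index i of the half-open interval [breaks[i], breaks[i+1]) containing age."""
    for i in range(len(breaks) - 1):
        if breaks[i] <= age < breaks[i + 1]:
            return i
    raise ValueError(f"age {age} is outside the breaks")

def crosstab(samples, breaks):
    """Count samples per (condition, age bin) cell."""
    return Counter((cond, age_bin(age, breaks)) for cond, age in samples)

breaks = [55, 65, 75, 85]  # assumed cut points, giving three age categories
table = crosstab(samples, breaks)

# One row per condition, one column per age category.
for cond in sorted({c for c, _ in samples}):
    print(cond, [table[(cond, b)] for b in range(len(breaks) - 1)])
```

With only 19 samples, any choice of cut points leaves just a handful of samples per cell, so it is worth printing the table for each candidate set of breaks before settling on one.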

So now I really do not know what to do. There are multiple options: use age as a categorical covariate (I still don't know how many breaks would be reasonable); use age as a continuous covariate (this is not suggested); don't account for age (might be OK, since we are investigating a late-onset disease and all individuals are over the critical age); or don't account for age and use SVA (I'm not sure about that one: if I do that, I get a significant surrogate variable that correlates with age, with a coefficient of -0.45).

Below you can find the distribution of ages (or rather birth years) across the experimental conditions (y axis).

I would really appreciate your help.


Thanks a lot

-Matt

Answer: Age covariate continuous vs. categorical
17 months ago, Michael Love (United States) wrote:

You can add age as a continuous covariate, but keep in mind that a design such as ~age + ... implies a multiplicative change in gene expression with each unit increase in age.

By the way, I'd recommend putting the actual age in the model rather than the birth year. It's much more interpretable this way, and it doesn't lead to weird changes to the intercept caused by one of the covariates having a range of, e.g., 1985-2000.
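To see why the birth-year coding makes the intercept hard to read, here is a small arithmetic sketch. All numbers (`ref_year`, `beta0`, `beta_age`) are made-up assumptions for illustration and do not come from any fitted model. Both codings describe exactly the same line; only the meaning of the intercept changes.

```python
# Hypothetical values for illustration; none of them come from the thread.
ref_year = 2016   # assumed year of sampling, so age = ref_year - birth_year
beta0 = 5.0       # assumed log2 expression extrapolated to age 0
beta_age = 0.02   # assumed change in log2 expression per year of age

def log2_expr_from_age(age):
    """Linear predictor on the log2 scale, using age as the covariate."""
    return beta0 + beta_age * age

# Substituting age = ref_year - birth_year gives the same line in terms of birth year:
beta0_by = beta0 + beta_age * ref_year  # intercept is now the extrapolation to birth year 0
beta_by = -beta_age                     # slope keeps its size but flips sign

def log2_expr_from_birth_year(birth_year):
    """The same linear predictor, using birth year as the covariate."""
    return beta0_by + beta_by * birth_year

# Both parameterizations give identical predictions for every sample.
for birth_year in (1985, 1990, 2000):
    age = ref_year - birth_year
    assert abs(log2_expr_from_age(age) - log2_expr_from_birth_year(birth_year)) < 1e-9

print("intercept with age:", beta0)
print("intercept with birth year:", round(beta0_by, 2))
```

The birth-year intercept refers to a hypothetical person born in year 0, which is far outside the data, so its value depends heavily on the slope and says little about the samples at hand.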


Thank you very much for your valuable answer. I'll fix the age/Year of Birth thing.

However, my limited statistical knowledge doesn't allow me to understand your remark regarding the multiplicative effect of a continuous age covariate. Would it be possible to quickly elaborate on this?  Thank you.

written 17 months ago by L_K

Check the vignette section on the statistical model of DESeq2 (it's also in the first section of the Results of the DESeq2 paper).

Suppose you have a column of x that gives the age, and a coefficient beta that you multiply with the age column. Adding this product to the other columns of x multiplied by their respective betas gives you the log2 of expression. This implies that you have multiplicative increases in expression with increases in age.
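Written out, the argument looks roughly as follows. The notation follows the DESeq2 paper (gene i, sample j, size factor s_j); the subscript "age" for the age column and the example value of beta are assumptions added here for illustration.

```latex
% DESeq2 model for the count K_{ij} of gene i in sample j
K_{ij} \sim \mathrm{NB}(\mu_{ij}, \alpha_i), \qquad
\mu_{ij} = s_j \, q_{ij}, \qquad
\log_2 q_{ij} = \sum_r x_{jr} \, \beta_{ir}

% Increase the age column by one unit, holding all other columns of x fixed:
\log_2 q_{ij}' - \log_2 q_{ij} = \beta_{i,\mathrm{age}}
\quad\Longrightarrow\quad
\frac{q_{ij}'}{q_{ij}} = 2^{\beta_{i,\mathrm{age}}}

% Over an age difference of \Delta units:
\frac{q_{ij}(\mathrm{age} + \Delta)}{q_{ij}(\mathrm{age})} = 2^{\beta_{i,\mathrm{age}} \, \Delta}
```

For example, a hypothetical beta of 0.1 for age would mean roughly a 7% increase in expression per year (2^0.1 is about 1.07), and a doubling over 10 years, regardless of the starting age.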

written 17 months ago by Michael Love

Ok, got it. Thanks a lot for your time!

written 17 months ago by L_K