Hi,
I am trying to figure out how to identify which genes are associated with the change in blood concentration of a metabolite of interest following a diet. I have paired samples (before and after the diet), and I need to adjust for age, sex and possibly body mass index (BMI).
Here is an example of how my data are organized:
Sample  ID  Timepoint  Metabolite  change_metabolite  Sex  Age   BMI
1a      1   1          10,3000     0,0000             1    56,5  38,1
1b      1   2          20,1000     9,8000             1    57,7  27,2
2a      2   1          11,0000     0,0000             2    21    44
2b      2   2          28,7000     17,7000            2    22,2  25,8
3a      3   1          12,4000     0,0000             1    30    33,1
3b      3   2          65,8000     53,4000            1    31,3  30
4a      4   1          112,0000    0,0000             1    67    31,5
4b      4   2          100,7000    -11,3000           1    68    29,7
5a      5   1          36,2000     0,0000             1    53,5  36,8
5b      5   2          89,1000     52,9000            1    54,5  32,9
6a      6   1          12,9000     0,0000             2    25,7  40,4
6b      6   2          29,0000     16,1000            2    26,7  37,6
7a      7   1          15,1000     0,0000             2    44,8  35,7
7b      7   2          98,2000     83,1000            2    45,9  23,1
8a      8   1          8,0000      0,0000             1    25,4  29,9
8b      8   2          11,5600     3,5600             1    26,6  24,8
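To make the change_metabolite column explicit: it is 0 at timepoint 1 and equals the timepoint 2 value minus the timepoint 1 value for the same ID at timepoint 2. A minimal sketch of that calculation for the first three individuals, assuming the sample table is held in a data frame called `targets` (a hypothetical name) and the decimal commas have been converted to decimal points:

```r
# Hypothetical data frame holding the first three individuals from the table above
targets <- data.frame(
  ID         = rep(1:3, each = 2),
  Timepoint  = rep(1:2, times = 3),
  Metabolite = c(10.3, 20.1, 11.0, 28.7, 12.4, 65.8)
)

# Baseline (timepoint 1) rows, used to look up each individual's starting value
base <- targets[targets$Timepoint == 1, ]

# Change from baseline: 0 for baseline rows, follow-up minus baseline otherwise
targets$change_metabolite <- targets$Metabolite -
  base$Metabolite[match(targets$ID, base$ID)]

print(targets)
```
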
On top of that, my data come from 3 different health centers (health_center), so I need to adjust for that too.
How can I do this with edgeR? Does the following make sense?
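library(edgeR)
# expr: matrix of raw counts (genes x samples), columns in the same order as the sample table
y <- DGEList(counts=expr)
# Covariates are assumed to be available as vectors in the workspace, one value per sample
design <- model.matrix(~ ID + Timepoint + Sex + Age + BMI + health_center + change_metabolite)
# keep: genes to retain; filterByExpr is assumed here as the filtering rule
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
# Pass the design matrix directly rather than converting it to a data frame
yf <- estimateDisp(y, design, robust=TRUE)
fit <- glmQLFit(yf, design)
# Test whether expression is associated with the change in metabolite level
qlf <- glmQLFTest(fit, coef="change_metabolite")
topTags(qlf)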
Thanks for your suggestions!
Thanks a lot Gordon!