Differential gene expression associated with change in blood parameter following diet
@annacotannacot-20795
United States

Hi,

I am trying to figure out how to answer the following question: which genes are associated with the change in blood concentration of a metabolite of interest following a diet? I have paired samples (before and after the diet), and I need to adjust for age, sex and possibly body mass index.

Here is an example of how my data are organized:

Sample  ID  Timepoint  Metabolite  change_metabolite  Sex  Age   BMI
1a      1   1          10,3000     0,0000             1    56,5  38,1
1b      1   2          20,1000     9,8000             1    57,7  27,2
2a      2   1          11,0000     0,0000             2    21    44
2b      2   2          28,7000     17,7000            2    22,2  25,8
3a      3   1          12,4000     0,0000             1    30    33,1
3b      3   2          65,8000     53,4000            1    31,3  30
4a      4   1          112,0000    0,0000             1    67    31,5
4b      4   2          100,7000    -11,3000           1    68    29,7
5a      5   1          36,2000     0,0000             1    53,5  36,8
5b      5   2          89,1000     52,9000            1    54,5  32,9
6a      6   1          12,9000     0,0000             2    25,7  40,4
6b      6   2          29,0000     16,1000            2    26,7  37,6
7a      7   1          15,1000     0,0000             2    44,8  35,7
7b      7   2          98,2000     83,1000            2    45,9  23,1
8a      8   1          8,0000      0,0000             1    25,4  29,9
8b      8   2          11,5600     3,5600             1    26,6  24,8

On top of that, my data come from 3 different health centers (health_center), so I need to adjust for that too.

How can I do this with edgeR? Does the following make sense?

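library(edgeR)
y <- DGEList(counts=expr)
# 'keep' is a logical vector of genes to retain, computed beforehand (e.g. with filterByExpr)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
# 'targets' stands for the sample table shown above (the name is assumed here)
design <- model.matrix(~ ID + Timepoint + Sex + Age + BMI + health_center + change_metabolite, data=targets)
# estimateDisp expects the design matrix itself, not a data frame
yf <- estimateDisp(y, design, robust=TRUE)
fit <- glmQLFit(yf, design)
qlf <- glmQLFTest(fit, coef="change_metabolite")
topTags(qlf)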

Thanks for your suggestions!

@gordon-smyth
WEHI, Melbourne, Australia

You say you have paired samples (before and after diet). The whole purpose of pairing is to control for factors such as age, sex, BMI and health center, so you do not need to do a paired analysis and add all those factors to the model as well. That would just be doubling up.

If the paired samples have been constructed properly, then you just need:

design <- model.matrix( ~ ID + Timepoint)

So it is all much simpler than what you're currently doing. This is the standard format for paired experiments.
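
As a rough sketch of how the paired design above might be used end to end: the code below builds a hypothetical sample table with the same layout as in the question (8 patients, 2 timepoints), codes ID and Timepoint as factors, and, if edgeR is installed, runs the QL pipeline on simulated counts. The `targets` table, the simulated `expr` matrix and the coefficient name `Timepoint2` are assumptions made for illustration, not part of the original post.

```r
# Hypothetical sample table mirroring the layout in the question
targets <- data.frame(
  ID        = factor(rep(1:8, each = 2)),       # patient, coded as a factor
  Timepoint = factor(rep(c(1, 2), times = 8))   # 1 = before diet, 2 = after diet
)

# Paired design: one baseline per patient plus a single diet effect
design <- model.matrix(~ ID + Timepoint, data = targets)
print(colnames(design))

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  set.seed(1)
  # Simulated counts stand in for the real expression matrix
  expr <- matrix(rnbinom(1000 * 16, mu = 50, size = 5), nrow = 1000)
  y <- DGEList(counts = expr)
  keep <- filterByExpr(y, design)
  y <- y[keep, , keep.lib.sizes = FALSE]
  y <- calcNormFactors(y)
  y <- estimateDisp(y, design, robust = TRUE)
  fit <- glmQLFit(y, design)
  # Test the within-patient diet effect (after vs before)
  qlf <- glmQLFTest(fit, coef = "Timepoint2")
  print(topTags(qlf))
}
```

Coding ID as a factor matters here: if it is left numeric, model.matrix treats it as a single continuous covariate rather than as one block per patient.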

You cannot include change_metabolite in the model. Including patient-specific variables in the model is incompatible with a paired analysis.
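
One way to see the "doubling up" for a covariate that is constant within each patient is to check the rank of the design matrix. The sketch below uses only base R and a hypothetical `targets` table that copies the Sex values from the question; it assumes sex does not change between the two timepoints.

```r
# Hypothetical sample table; Sex values follow the table in the question
targets <- data.frame(
  ID        = factor(rep(1:8, each = 2)),
  Timepoint = factor(rep(1:2, times = 8)),
  Sex       = factor(rep(c(1, 2, 1, 1, 1, 2, 2, 1), each = 2))
)

paired  <- model.matrix(~ ID + Timepoint, data = targets)
doubled <- model.matrix(~ ID + Timepoint + Sex, data = targets)

# Compare the number of columns with the numerical rank of each design
paired_rank  <- qr(paired)$rank
doubled_rank <- qr(doubled)$rank
cat("paired:  columns =", ncol(paired),  " rank =", paired_rank,  "\n")
cat("doubled: columns =", ncol(doubled), " rank =", doubled_rank, "\n")
```

The Sex column is a sum of patient indicator columns, so adding it gives an extra column without increasing the rank, and its coefficient cannot be estimated separately from the patient effects.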

Even if you included Metabolite in a completely different non-paired analysis, the Metabolite concentration would need to be on a log scale. Taking differences of unlogged Metabolite concentrations (as you have done to get the change variable) is not a meaningful thing to do.
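
To illustrate the log-scale point only (this is not a suggested analysis), the sketch below takes the before and after concentrations from the table in the question and compares the raw difference with the difference of log2 values, which equals the log2 fold change. The variable names are made up for the example.

```r
# Metabolite concentrations from the question's table (timepoint 1 and 2)
before <- c(10.3, 11.0, 12.4, 112.0, 36.2, 12.9, 15.1, 8.0)
after  <- c(20.1, 28.7, 65.8, 100.7, 89.1, 29.0, 98.2, 11.56)

# Raw difference, as in the change_metabolite column
raw_change <- after - before

# Difference on the log2 scale, i.e. the log2 fold change after/before
log_change <- log2(after) - log2(before)

print(data.frame(ID = 1:8, before, after, raw_change, log_change))
```

On the raw scale the same absolute change means very different things at low and high baseline concentrations, whereas the log-scale difference expresses each patient's change relative to their own baseline.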

We have not tested edgeR on metabolic data but the QL pipeline will probably work ok.


Thanks a lot Gordon!
