Different results from DGE using different subgroups in model.matrix
Fabian • 0
Austria

Hi, I am examining whether gene expression differs between patients with a biomarker below versus above the median. Let's call that variable "group". I want to examine the DGE between patients below and above the median within different subgroups: Sex, Diabetes status, and Age (below and above 60 years).

However, when investigating Diabetes status, for example, I get different results from my DGE analysis with the following two design matrices:

model.matrix( ~ 0 + group : Diabetes)
model.matrix( ~ 0 + group : Diabetes + Sex + Agecutat60)

I wanted to use the second model.matrix because then I wouldn't have to estimate the dispersion separately for every single subgroup with this code:

y <- DGEList(GenewiseCounts, 
             group = group, 
             genes = GenewiseCounts[, 1, drop=FALSE]
             )
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
y <- estimateGLMTagwiseDisp(y, design)

However, I don't necessarily want to "adjust" for those other variables; rather, I want to keep them in the design matrix so that I can change the contrasts later on. I am unsure what to make of the differing results from the two design matrices. Is it wrong to keep them in my model.matrix?

@gordon-smyth
WEHI, Melbourne, Australia

It is hard to tell what you are trying to do. What does your group variable represent? How does it relate to Sex, Diabetes and Age? You refer to subgroups defined by Sex, Diabetes and Age, but your analysis does not divide patients into subgroups. Your model formula includes an interaction term but without the corresponding main effects.

Your post seems to show some misunderstandings about how an edgeR analysis is conducted. The design matrix must include all the relevant factors and there is no need to estimate dispersions for subgroups separately. You cannot "adjust" the dispersion estimates for covariates but not do the same for the DE analysis. I suggest you go back to the edgeR User's Guide and try to follow a standard analysis.

If you defined your subgroups and scientific questions more clearly, I think the analysis would be more straightforward than you are currently finding it.
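
To illustrate the point about putting all relevant factors into one design matrix, here is a minimal base-R sketch. It is not taken from the thread: the sample annotation `targets` and the combined factor `GD` are hypothetical, and the design assumes the question is "above versus below the median within diabetic patients". Only `model.matrix` is actually run; the edgeR calls are shown as comments because they need a real `DGEList`.

```r
# Hypothetical sample annotation: 16 patients, one per combination of the
# four two-level factors. None of these values come from the original post.
targets <- expand.grid(
  group      = c("below", "above"),
  Diabetes   = c("no", "yes"),
  Sex        = c("F", "M"),
  Agecutat60 = c("under60", "over60")
)

# One factor level per group-by-Diabetes combination (hypothetical name GD).
targets$GD <- factor(paste(targets$group, targets$Diabetes, sep = "."))

# A single design matrix holding every factor of interest.
design <- model.matrix(~ 0 + GD + Sex + Agecutat60, data = targets)

# Contrast: above vs below the biomarker median, within diabetic patients.
contrast <- setNames(numeric(ncol(design)), colnames(design))
contrast[c("GDabove.yes", "GDbelow.yes")] <- c(1, -1)

print(colnames(design))
print(contrast)

# With edgeR loaded and a DGEList `y`, the same design would typically be
# used for both dispersion estimation and testing (not run here):
#   y   <- estimateDisp(y, design)
#   fit <- glmQLFit(y, design)
#   res <- glmQLFTest(fit, contrast = contrast)
```

Because the Sex and Age columns stay in the fitted model, the contrast is estimated after accounting for them, so results from this design are expected to differ from those of a design that omits them.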
