How to correctly adjust for gender in edgeR
Guillermo • 0
@860296bd
Spain

Hi, I'm interested in performing a DE analysis using the edgeR package. For this purpose, I've been following the manual and some older related posts made here ("design matrix for 4 groups in edgeR", "edgeR effects of design on testing main effects and interactions", "Different results in edgeR using simple vs GLM").

For my experimental design, I have samples distributed in three groups: Control, Risk and Disease samples. I also have information about the gender of the samples.

My question is, how should I adjust for gender effect on the gene expression?

Option 1 - Additive linear model

My first thought was trying a model that would use a design matrix like this:

mod_matrix1 <- model.matrix(~0+group+gender)

Option 2 - One-way layout

But then I thought that maybe a one-way layout model could be more useful:

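group_gender <- factor(paste(group, gender, sep='_'))
mod_matrix2 <- model.matrix(~0+group_gender)
# Drop the "group_gender" prefix so the columns are named Control_Male, Control_Female, ...
colnames(mod_matrix2) <- levels(group_gender)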

For this second approach, I was thinking of testing for the average, using something like this (an example):

makeContrasts( (Control_Male + Control_Female)/2 - (Risk_Male + Risk_Female)/2, 
               levels=mod_matrix2)

Option 3 - Interaction full model

Also, another option I was originally thinking of using was:

mod_matrix3 <- model.matrix(~0+group*gender)

Option 4 - Interaction model

And also this one:

mod_matrix4 <- model.matrix(~0+gender + gender:group)

Question

My biggest concern is that I'm not sure which way is better for adjusting for gender. As I understand it (and I may well be wrong), Option 1 lets me adjust for gender assuming its effect on gene expression is the same in all the groups (Control, Risk and Disease), which might in fact not be true. I think Option 2 may be more accurate, since gender might affect each group in a different way. As for Options 3 and 4, most posts state that they are the most difficult to interpret and are useful only in specific cases.

Can you help me understand which way is better for adjusting my model for gender-specific effects?

Also, I was trying to fit a more complex model, with two factors (Treatment, 4 levels, and GlucemicControl, 2 levels) and multiple "covariates" (Gender, 2 levels; Age; BMI). For this case, I was thinking of using a one-way layout merging the two factors, but depending on the answers to the previous question it may be better to use a one-way layout merging Treatment, GlucemicControl and Gender.

Thank you in advance!

@gordon-smyth
WEHI, Melbourne, Australia

Options 2 to 4 are all equivalent. They are just different ways of parametrizing the same model, depending on what hypotheses are most of interest. They all give exactly the same results. The only difference between them is the ease with which particular contrasts are extracted.
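The equivalence can be checked numerically. The following is a minimal sketch in base R, assuming a hypothetical balanced layout with two samples per group/gender cell (the toy `group` and `gender` vectors and the `hat_matrix` helper are made up for illustration). It confirms that the three design matrices have the same rank and span the same column space, so they produce the same fitted values.

```r
# Hypothetical toy layout: 3 groups x 2 genders, 2 samples per cell
group  <- factor(rep(c("Control", "Risk", "Disease"), each = 4),
                 levels = c("Control", "Risk", "Disease"))
gender <- factor(rep(c("Female", "Male"), times = 6))
group_gender <- factor(paste(group, gender, sep = "_"))

mod_matrix2 <- model.matrix(~0 + group_gender)           # Option 2
mod_matrix3 <- model.matrix(~0 + group * gender)         # Option 3
mod_matrix4 <- model.matrix(~0 + gender + gender:group)  # Option 4

# Projection matrix onto the column space of a full-rank design matrix
hat_matrix <- function(X) {
  Q <- qr.Q(qr(X))
  Q %*% t(Q)
}

# All three designs have 6 independent columns
ranks <- sapply(list(mod_matrix2, mod_matrix3, mod_matrix4),
                function(X) qr(X)$rank)

# Identical projections mean identical fitted values for any response
same_space_23 <- isTRUE(all.equal(hat_matrix(mod_matrix2), hat_matrix(mod_matrix3)))
same_space_24 <- isTRUE(all.equal(hat_matrix(mod_matrix2), hat_matrix(mod_matrix4)))
```

Only the meaning of the individual coefficients differs between the three parametrizations, which is why some contrasts are easier to write in one than in another.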

You already know the difference between options 1 and 2. Option 1 adjusts for baseline differences between males and females but assumes the relative disease vs control effects are the same for both genders. Option 2 allows you to check for gender-specific disease effects.
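To make the difference concrete, here is a small base R sketch, again assuming hypothetical toy `group` and `gender` factors. It shows that the additive model of Option 1 carries a single gender coefficient shared by all groups, while the interaction model needs two extra parameters to let the gender effect vary by group.

```r
# Hypothetical toy layout: 3 groups x 2 genders, 2 samples per cell
group  <- factor(rep(c("Control", "Risk", "Disease"), each = 4),
                 levels = c("Control", "Risk", "Disease"))
gender <- factor(rep(c("Female", "Male"), times = 6))

# Option 1: one coefficient per group plus one shared Male-vs-Female effect
mod_matrix1 <- model.matrix(~0 + group + gender)
colnames(mod_matrix1)
# "groupControl" "groupRisk" "groupDisease" "genderMale"

# Risk vs Control, adjusted for gender, written as a contrast vector
# over the columns of mod_matrix1
risk_vs_control <- c(groupControl = -1, groupRisk = 1,
                     groupDisease = 0, genderMale = 0)

# The interaction model spends two more parameters on group-specific gender effects
extra_params <- ncol(model.matrix(~0 + group * gender)) - ncol(mod_matrix1)
```

In Option 1 the `genderMale` column absorbs the baseline male/female difference, so the group coefficients can be compared directly.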

Which model is right for you depends on what hypotheses you want to test and the scientific background to your experiment. You ask "which way is better for adjusting my model for gender specific effect?" Obviously that would be option 2 or, even more directly, option 4.

You propose a contrast using averages, but I don't see the point of that if you're interested in gender-specific effects.


Hi, Gordon!

Thanks for your impressively fast response.

When I say "which way is better for adjusting my model for gender specific effect?", I mean "correcting" instead of "adjusting". My bad here, English is not my first language.

We are not interested in searching for gender-specific effects. We want to study differentially expressed genes between the groups of interest (Control, Risk, Disease), regardless of gender. We want to correct for the effect of gender on gene expression, as if gender were a batch effect.


Approach 1 then.
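
For readers who want to see what that looks like in practice, below is a hedged sketch of one possible edgeR quasi-likelihood workflow using the additive design. The count matrix is simulated and the sample layout, gene names, sample names and contrast names are all assumptions made for illustration; it is not the poster's actual data or necessarily the exact pipeline intended in this thread.

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

set.seed(1)
# Hypothetical layout: 3 groups x 2 genders, 2 samples per cell
group  <- factor(rep(c("Control", "Risk", "Disease"), each = 4),
                 levels = c("Control", "Risk", "Disease"))
gender <- factor(rep(c("Female", "Male"), times = 6))

# Simulated counts with no true differential expression (illustration only)
counts <- matrix(rnbinom(1000 * 12, mu = 100, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))

# Option 1: additive model, gender treated like a batch effect
design <- model.matrix(~0 + group + gender)
colnames(design) <- sub("^group", "", colnames(design))  # Control, Risk, Disease, genderMale

y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y, design)        # drop genes with too few counts
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)                # TMM normalization
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# Group comparisons, each adjusted for gender
contr <- makeContrasts(RiskVsControl    = Risk - Control,
                       DiseaseVsControl = Disease - Control,
                       DiseaseVsRisk    = Disease - Risk,
                       levels = design)

res <- glmQLFTest(fit, contrast = contr[, "RiskVsControl"])
topTags(res)
```

The other two comparisons are tested the same way by passing a different column of `contr` to `glmQLFTest()`.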
