Search
Question: limma: how to use pairing and adjust for covariates
1
gravatar for gregory.l.stone
7 months ago by
gregory.l.stone10 wrote:

I am interested in looking at the differences in the effect of a condition on each sex. The contrast I am interested in is (Female.post - Female.pre) - (Male.post - Male.pre). I am uncertain as to how best incorporate pairing into my study and adjust for covariates.

One way I have tried to incorporate pairing is to randomize the patient effect by using duplicateCorrelation(). My design is therefore:

f = paste(dge$samples$sex, dge$samples$condition, sep = ".")
f = factor(f, levels = c("Male.pre", "Male.post", "Female.pre", "Female.post"))

d = model.matrix(~ 0 + f + age + dm + axc, colData)

cont = makeContrasts(M.PvP="fMale.post-fMale.pre", F.PvP="fFemale.post-fFemale.pre", sexDiff="(fFemale.post-fFemale.pre) - (fMale.post-fMale.pre)", levels = d)

where colData is my list of covariates, dge&samples is the colData object in my dge object containing all sample an count info, M.PvP is the effect of condition on males, F.PvP is the effect of condition of females, and sexDiff is the contrast of interest (the difference in the effect of condition on each sex). I get around 274 genes for the contrast of interest.

However, I am concerned that using duplicateCorrelation() in addition to adjusting for covariates in the design matrix is over-adjusting and assumes a linear relationship between my outcome and my covariates.

So another strategy I have tried is to block for the pairing effect in the design matrix as such:

design <- model.matrix(~ patient + sex:condition, colData)
design <- design[,!grepl("pre", colnames(design))]
colnames(design) <- sub(":", "_", colnames(design))
con <- makeContrasts(sexDiff = "sexFemale_conditionpost - sexMale_conditionpost", levels=design)

However, when I go to do my fit, I get the error "In contrasts.fit(fit, con) : row names of contrasts don't match col names of coefficients". Additionally, I only get 6 genes in my contrast of interest.

 

What strategy should I be going with? Thank you so much for taking the time to help!

ADD COMMENTlink modified 7 months ago by Aaron Lun17k • written 7 months ago by gregory.l.stone10

If you got an error in your second approach, how did you still manage to test for DE genes with that contrast? In any case, provide a minimum working example that can recapitulate the error.

ADD REPLYlink modified 7 months ago • written 7 months ago by Aaron Lun17k

When I make my contrast matrix (following con <- makeContrasts(sexDiff = "sexFemale_conditionpost - sexMale_conditionpost", levels=design)), I get the following message: Warning message: In makeContrasts(sexDiff = "sexFemale_conditionpost - sexMale_conditionpost",  : Renaming (Intercept) to Intercept

When I make my fit based on my contrast matrix (fit = lmfit(voom, design) fit2 = contrasts.fit(fit, con)) I get the following:

Warning message:
In contrasts.fit(fit, con) :
  row names of contrasts don't match col names of coefficients

 

Limma still allows me to create these objects

 

ADD REPLYlink written 7 months ago by gregory.l.stone10

That's a warning, not an error.

ADD REPLYlink written 7 months ago by Aaron Lun17k
1
gravatar for Aaron Lun
7 months ago by
Aaron Lun17k
Cambridge, United Kingdom
Aaron Lun17k wrote:

I would go with your second approach, i.e., block on patient in the design matrix. duplicateCorrelation is somewhat anticonservative so the greater number of DE genes in your first approach is not a surprise. There are also a number of assumptions (e.g., homogeneity of patient effects) and compromises (e.g., using the same consensus correlation for all genes) involved in using duplicateCorrelation, in addition to the linearity issues you've already mentioned. All of this is avoided with the second approach, albeit with some loss of power as information is spent on estimating the patient effects. Fewer DE genes, perhaps, but better safe than sorry.

ADD COMMENTlink modified 7 months ago • written 7 months ago by Aaron Lun17k

When I block on patient in the design I now find I get the following message when normalizing with voom:

> v = voomWithQualityWeights(dge, design, plot=FALSE)
Coefficients not estimable: sexMale.conditionpost sexFemale.conditionpost

The same message prints repeatedly. I have samples for each condition, though. How can I correct this?

ADD REPLYlink modified 7 months ago • written 7 months ago by gregory.l.stone10

It seems like you haven't removed the pre* coefficients in the second design matrix. If that's not the case, please post a minimum working example that reproduces the warning.

ADD REPLYlink written 7 months ago by Aaron Lun17k

You're right! I apologize for my carelessness. However, if I am not specifying an intercept, don't I need the pre terms in my design so that I can make a contrast like the following: con <- makeContrasts(sexDiff = "(sexFemale_conditionpost - sexFemale_conditionpre) - (sexMale_conditionpost - sexMale_conditionpre)", levels=design_noInt)?

ADD REPLYlink written 7 months ago by gregory.l.stone10

No. Once you've pruned the design matrix, the interaction terms are the post/pre log-fold changes.

ADD REPLYlink modified 7 months ago • written 7 months ago by Aaron Lun17k

So in the case when my design matrix is ~ patient + sex:condition, I should remove the pre terms from the design and then construct my contrast matrix: makeContrasts(sexDiff = "sexFemale_conditionpost - sexMale_conditionpost", levels=design)

Now, in the case when my design matrix is ~ 0 + patient + sex:condition, I also should remove the pre terms and construct the same contrast matrix.

 

I'm now confused as to what the difference is between the designs ~ patient + sex:condition and ~ 0 + patient + sex:condition. I'm sure I have misunderstood something you've said along the way. I apologize for my confusion, and thank you so much for being so generous with your time!

ADD REPLYlink written 7 months ago by gstone20

The only difference (in the pruned designed matrices) is that, if you have an intercept, it represents the average log-expression in the first patient, and all other patient terms represent the log-fold change relative to the first patient. Without an intercept, each patient term directly represents the average log-expression in that patient. In both cases, the interpretation of the (interesting) interaction terms is unchanged.

ADD REPLYlink modified 7 months ago • written 7 months ago by Aaron Lun17k

Ok thank you, that makes a lot of sense. So just that I am clear, in both cases, sexMale_conditionpost represents the log-fold change from pre to post in males?

ADD REPLYlink modified 7 months ago • written 7 months ago by gstone20

I would assume so. Check the design matrix to be sure; you should see entries of 1 for post-treatment male samples, and zero for everything else.

ADD REPLYlink written 7 months ago by Aaron Lun17k

Ok great. Thank you very much for all of your help!

ADD REPLYlink written 7 months ago by gstone20

I am implementing the design you've recommended and am concerned that I am not getting the contrast I am interested in. The columns Male.post and Female.post in my design matrix have a 1 for all male and female samples, respectively, taken post condition. When I contrast these, Female.post-Male.post, I initially thought that the contrast I was doing was (Female.post-Female.pre)-(Male.post-Male.pre), but the interaction between sex and pre samples were removed to make the design full-rank, making me believe that I am getting the change between the sexes at time point post, not the change between the sexes between time points pre->post, which is what I'm interested in. Am I interpreting this incorrectly? My design is ~ patient + sex:condition

Is this a situation where I would use a nested factorial design? If so how would I go about doing that?

ADD REPLYlink modified 7 months ago • written 7 months ago by gstone20

Male.post represents the log-fold change of post over pre in males. Female.post represents the log-fold change of post over pre in females. The contrast Female.post - Male.post tests the null hypothesis that the post/pre log-fold changes are the same between males and females. If this is not clear, I can only suggest that you consult a local statistician to help with your understanding.

ADD REPLYlink modified 7 months ago • written 7 months ago by Aaron Lun17k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 384 users visited in the last hour