How does contrasts.fit do in limma to extract comparisons with a model having interaction terms?
1
0
Entering edit mode
Durian • 0
@55e5fbf8
Last seen 9 hours ago
Japan

For example, geneA ~ family + cur_env*pre_env.

  • cur_env and pre_env are two categorical variables and both include CT and HD two levels.
    • I want to compare the fold change between CTCT (CT at pre_env and CT at cur_env) and HDHD (HD at pre_env and HD at cur_env).

I am interested in understanding how it does be compared behind the limma code and manually extract information of CTCT and HDHD and compute the fold change of their mu coefficients.

At the moment, I am using GAMLSS to replicate limma results by model interaction terms. My problem are

  • How am I able to run similar contrast and obtain the coefficients and p value for CTCT and HDHD?
  • Is it possible to construct reduced model to know whether combinations of two variables (CTCT, HDHD, CTHD, HDCT) have effects on response variables by taking interactions into account?
# design matrix
cur_env <- c(rep("CT", 4), rep("HD", 4))
pre_env <- c(rep("CT", 2), rep("HD", 2), rep("CT", 2), rep("HD", 2))
family <- c(rep("1", 3), rep("2", 2), rep("3", 3))

cur_env <- factor(cur_env, levels=c("CT","HD"))
pre_env <- factor(pre_env, levels=c("CT","HD"))
family  <- factor(family)

design <- model.matrix(~ family + cur_env*pre_env)

# make up coefficients
# (Intercept) family1 family2 family3  cur_envHD pre_envHD cur_envHD:pre_envHD 
 3.2       2.4       2.6      0.09      -0.14      0.09      -0.11

# design
  (Intercept) family2 family3 cur_envHD pre_envHD cur_envHD:pre_envHD
1           1       0       0         0         0                   0
2           1       0       0         0         0                   0
3           1       0       0         0         1                   0
4           1       1       0         0         1                   0
5           1       1       0         1         0                   0
6           1       0       1         1         0                   0
7           1       0       1         1         1                   1
8           1       0       1         1         1                   1
attr(,"assign")
[1] 0 1 1 2 3 4
attr(,"contrasts")
attr(,"contrasts")$family
[1] "contr.treatment"

attr(,"contrasts")$cur_env
[1] "contr.treatment"

attr(,"contrasts")$pre_env
[1] "contr.treatment"
limma • 103 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 10 minutes ago
WEHI, Melbourne, Australia

The limma Users Guide explains how to test for interactions or to test for contrasts very easily. limma can work with any design matrix, but the easiest way by far is to combine pre_env and cur_env into one factor, and then use contrasts.fit to form the comparisons you want explicitly.

There is no need to form "reduced models" in limma. It is much simpler and more direct, you just form the comparisons you need directly.

At the moment, you don't seem to have tested any hypotheses in either package, limma or GAMLSS, so I don't understand what you mean by replicating results. You don't show any code from either package. GAMLSS is intended for joint modelling of the mean and scale parameter, but you don't mention why you think that might be relevant here. The design matrix you have formed doesn't seem to me to relate to your questions of interest at all closely.

ADD COMMENT
0
Entering edit mode

Thanks for the reply Gordon. I am also working on the project and have been tasked to reformulate and add some details.

We're interested in the effect of (1) the environment the organism was selected for (effectively a genotype) and (2) the environment was reared in on the mean and variance of gene expression. The specific hypothesis above is that there is a significant difference in mean and variances between flies selected and reared on the control environment compared to those selected and reared in a different (HD) environment. We are also interested in sub-hypotheses (such as how different in means and variances are the two genotypes within the HD environment).

We have figured out a limma implementation of just the differential mean testing using the approach (e.g., split into four coefficients each represent one of the four levels of the factor) you mentioned. However, to answer the variance question we want to use the GAMLSS. Additionally, we want to run a differential mean expression test with the GAMLSS as a 'replication' of the results we got from limma.

The trick is this. We understand that GAMLSS models work by comparing a model with the parameter of interest (e.g., genotype mu) with a model without. For one factor with two levels, we just omit the single mu or sigma coefficient in the nested model. However, our factor has four levels, and would like help on how to approach this to test the above hypotheses.

We thought that since limma accomplishes this for mean testing using contrasts.fit(), a similar underlying logic would apply. But we are bit unclear on the greater details of contrasts.fit() and how we could translate that logic into the GAMLSS framework.

Hope the help request is a bit clearer.

ADD REPLY
0
Entering edit mode

I refered you to the DiffVar package 11 months ago. Did you consider it?

Your clarification makes it clear that your issue is not with limma, which has already worked fine for you, but with gamlss, because you can't figure out how to test the same hypotheses with gamlss. I agree that trying to test the CT and HT specific fold-changes using a reduced model approach would be tricky, but I can't help you with the gamlss package. I suggest you ask the authors of that package. I suppose that you could probably limma's contrastAsCoef() function to make a reduced model for gamlss, but you have to keep in mind that limma takes advantage of empirical Bayes methods, meaning that you cannot replicate limma significance results in any univariate package such as gamlss.

Nowadays, since the release of edgeR v4, if I wanted to test for variance differences between genotypes with RNA-seq data, I would extract the adjusted unit deviances and adjusted unit df from glmQLFit, and then fit an ordinary gamma glm to the genewise unit deviances for each gene with the unit df as weights.

Alternatively, you could run voomLmFit with var.group=genotype, which would tell you whether there is a overall variance difference averaged over genes.

ADD REPLY

Login before adding your answer.

Traffic: 1083 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6