Question: interaction formulas without replicates
2.2 years ago, rea wrote:

I refer to section 3.3.2 on nested interaction formulas in the edgeR user guide.

I wonder if the coefficient TreatDrug.Time.1h provides the logFC and associated FDR corresponding to (drug.1h-drug.0h)-(placebo.1h-placebo.0h). Is this interpretation correct?

Does it make sense to use nested interaction formulas when I have only one replicate for each combination of Time and Treatment? In your example, placebo.0h has two samples.

My study has one sample for each combination of the levels of the two factors. When I applied nested interaction formulas, every gene was reported with FDR = 1. I wonder if this result is due to the absence of replicates.

Could you suggest other ways to test the influence of multiple factors on differential expression when there is just one replicate for each combination of their levels?



You need to add the edgeR tag, otherwise the maintainers don't get notified.

written 2.2 years ago by Aaron Lun
2.2 years ago, Aaron Lun (Cambridge, United Kingdom) wrote:

The answers to your questions are:

1) No, the TreatDrug:Time1h coefficient represents the log-fold change between drug.1h and drug.0h groups. Have a look at the design matrix if you're uncertain. (I rarely trust the column names provided by model.matrix, because the meaning of those names will depend on the design formula, even for designs that are mathematically equivalent. It's safer to just check the design matrix directly.)

2) If you have only one replicate for each combination, then using a nested interaction design will not give you any residual d.f. for dispersion estimation. Consider using a simpler model in order to free up residual d.f. for estimation. For example, you could use an additive model where you assume that the drug/time effects are independent, or a model with time as a real-valued covariate if you have enough time points.

3) A lack of replicates reduces the power to detect DE between conditions. You won't have much evidence to reject the null hypothesis if you only have one observation for each condition. I also assume you manually supplied a dispersion value, which may or may not be appropriate; if it's too large, the tests will be even more conservative.

4) Read the relevant section (2.11) of the edgeR user's guide on what to do without replicates.
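
To make point 1 concrete, here is a minimal sketch in plain Python (not edgeR itself) that writes out the design matrix rows for a nested formula like ~Treat + Treat:Time by hand. The column names and their order are assumed to match what model.matrix prints when Placebo is the reference level, and the coefficient values are made up for illustration. It shows that TreatDrug:Time1h is just drug.1h minus drug.0h, while the interaction the question asks about is a difference of two coefficients.

```python
# Column names assumed to mirror model.matrix(~Treat + Treat:Time),
# with Placebo as the reference level of Treat and 0h as the reference Time.
COLUMNS = [
    "(Intercept)", "TreatDrug",
    "TreatPlacebo:Time1h", "TreatDrug:Time1h",
    "TreatPlacebo:Time2h", "TreatDrug:Time2h",
]

def design_row(treat, time):
    """Hand-built design matrix row for one sample in the given group."""
    return [
        1,
        int(treat == "Drug"),
        int(treat == "Placebo" and time == "1h"),
        int(treat == "Drug" and time == "1h"),
        int(treat == "Placebo" and time == "2h"),
        int(treat == "Drug" and time == "2h"),
    ]

# Hypothetical log2-scale coefficients for a single gene (illustration only).
beta = dict(zip(COLUMNS, [5.0, 0.5, 0.25, 1.5, 0.75, 2.0]))

def fitted(treat, time):
    """Fitted log-expression of a group: design row times coefficients."""
    row = design_row(treat, time)
    return sum(x * beta[name] for x, name in zip(row, COLUMNS))

# The 1h-vs-0h effect within each treatment is a single coefficient.
drug_1h_vs_0h = fitted("Drug", "1h") - fitted("Drug", "0h")
placebo_1h_vs_0h = fitted("Placebo", "1h") - fitted("Placebo", "0h")

# The interaction (drug.1h-drug.0h)-(placebo.1h-placebo.0h) is a contrast
# between two coefficients, not a single column of the design.
interaction_1h = drug_1h_vs_0h - placebo_1h_vs_0h

print(drug_1h_vs_0h == beta["TreatDrug:Time1h"])
print(interaction_1h == beta["TreatDrug:Time1h"] - beta["TreatPlacebo:Time1h"])
```

Under this parameterization, the quantity in the question would be tested as a contrast of TreatDrug:Time1h against TreatPlacebo:Time1h rather than by dropping a single coefficient.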
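
For point 2, the residual d.f. is the number of samples minus the rank of the design matrix. The sketch below is plain Python, not edgeR. It assumes a hypothetical layout of 2 treatments by 3 time points with one sample per combination, and counts the residual d.f. for the nested interaction design, an additive design, and a design with time as a numeric covariate.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

# Assumed layout: one sample per Treat x Time combination (6 samples).
samples = [(treat, hours) for treat in ("Placebo", "Drug") for hours in (0, 1, 2)]

def nested_row(treat, hours):
    # ~Treat + Treat:Time, one coefficient per group
    return [1, int(treat == "Drug"),
            int(treat == "Placebo" and hours == 1), int(treat == "Drug" and hours == 1),
            int(treat == "Placebo" and hours == 2), int(treat == "Drug" and hours == 2)]

def additive_row(treat, hours):
    # ~Treat + Time, with time as a factor
    return [1, int(treat == "Drug"), int(hours == 1), int(hours == 2)]

def covariate_row(treat, hours):
    # ~Treat + time, with time as a real-valued covariate
    return [1, int(treat == "Drug"), hours]

def residual_df(row_fn):
    """Number of samples minus the rank of the design matrix."""
    design = [row_fn(treat, hours) for treat, hours in samples]
    return len(design) - matrix_rank(design)

df_nested = residual_df(nested_row)
df_additive = residual_df(additive_row)
df_covariate = residual_df(covariate_row)
print(df_nested, df_additive, df_covariate)
```

With this layout the nested design uses up every sample and leaves nothing for dispersion estimation, while the two simpler designs each leave a few residual d.f.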
