Nested Interaction in edgeR and composition bias


My question is: when using edgeR's GLM method to test a nested interaction between two explanatory variables, one a continuous variable (unrelated to genotype) and the other genotype, is the interaction affected by compositional biases?

My reasoning is as follows: any potential compositional biases (such as differing sequences, slightly different gene lengths, and normalisation) should not be affected differently by the continuous variable.

Is this assumption correct?

Experimental design

Row   Condition   Species
1     0           0
2     0           1
3     0           0
4     0           1
10    1.5         1
11    1.5         0
12    1.5         1
edger rna-seq • 434 views
WEHI, Melbourne, Australia

I assume the technical biases you are asking about are differences in GC content or gene length for the same gene between the two species.

Yes, the technical biases should, in principle, cancel out of any interaction term. That would be so whether the interaction is species x factor or species x covariate.
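To make the cancellation concrete, here is a minimal numerical sketch in Python with numpy. It is not edgeR and not a negative binomial fit: it fits ordinary least squares to noiseless log-means for a single gene, using the species and condition values from the design in the question. The coefficient values and the size of the species bias are made-up numbers chosen only for illustration. A constant species-specific bias on the log scale (for example from gene length or GC content) shifts the species main effect but leaves the interaction coefficient unchanged.

```python
import numpy as np

# Species and condition values taken from the design shown in the question.
species = np.array([0, 1, 0, 1, 1, 0, 1], dtype=float)
condition = np.array([0, 0, 0, 0, 1.5, 1.5, 1.5])

# Design matrix: intercept, species, condition, species x condition.
X = np.column_stack([np.ones_like(species), species, condition, species * condition])

# Hypothetical log-scale coefficients for one gene (illustrative only).
true_beta = np.array([5.0, 0.3, 0.4, -0.2])

def fit(log_mu):
    """Least-squares fit of the log-means to the design matrix."""
    coef, *_ = np.linalg.lstsq(X, log_mu, rcond=None)
    return coef

# Log-means without any technical bias.
log_mu_clean = X @ true_beta

# Hypothetical technical bias: a constant log-scale shift that applies
# to every sample of species 1, regardless of condition.
species_bias = 0.7
log_mu_biased = log_mu_clean + species_bias * species

coef_clean = fit(log_mu_clean)
coef_biased = fit(log_mu_biased)

print("species main effect:", coef_clean[1], "->", coef_biased[1])
print("interaction term:   ", coef_clean[3], "->", coef_biased[3])
```

The bias is absorbed entirely by the species main effect because it does not depend on the condition. If the bias itself changed with the condition, this argument would no longer hold.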

Yunshun Chen ▴ 790

I am not sure what your question is.

If you are concerned about composition biases between samples, then the edgeR scale normalization (e.g. TMM) would take care of it. It has nothing to do with the interaction between your explanatory variables.

If your question is about how to incorporate a continuous variable into the design matrix, then it depends on the number of time points you have in your data. If your data only has two time points, as shown in your design, then you can simply treat it as a two-level factor and proceed with the standard edgeR DE analysis pipeline.
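As a rough illustration of why the two-level case can be treated as a factor, the following Python/numpy sketch (again not edgeR, and using made-up coefficient values) builds the design matrix both ways from the values shown in the question. With only two distinct condition values, the covariate coding and the factor coding span the same column space, so they describe the same fitted model; only the scale of the condition-related coefficients differs.

```python
import numpy as np

# Species and condition values taken from the design shown in the question.
species = np.array([0, 1, 0, 1, 1, 0, 1], dtype=float)
condition = np.array([0, 0, 0, 0, 1.5, 1.5, 1.5])

def design(cond_col):
    """Intercept, species, condition, species x condition."""
    return np.column_stack(
        [np.ones_like(species), species, cond_col, species * cond_col]
    )

# Condition as a continuous covariate (values 0 and 1.5).
X_cov = design(condition)
# Condition as a two-level factor (indicator: 0 = baseline, 1 = other level).
X_fac = design((condition > 0).astype(float))

# If stacking both designs side by side does not raise the rank,
# the two codings span the same column space.
rank_cov = np.linalg.matrix_rank(X_cov)
rank_fac = np.linalg.matrix_rank(X_fac)
rank_both = np.linalg.matrix_rank(np.hstack([X_cov, X_fac]))

# Hypothetical log-means for one gene, generated from the covariate coding.
log_mu = X_cov @ np.array([5.0, 0.3, 0.4, -0.2])
slope_interaction = np.linalg.lstsq(X_cov, log_mu, rcond=None)[0][3]
factor_interaction = np.linalg.lstsq(X_fac, log_mu, rcond=None)[0][3]

print("ranks:", rank_cov, rank_fac, rank_both)
print("interaction per unit of condition:", slope_interaction)
print("interaction between the two levels:", factor_interaction)
```

Here the factor-coded interaction equals 1.5 times the per-unit slope, since 1.5 is the gap between the two condition values. With three or more distinct condition values the two codings are no longer equivalent, and the choice between a linear trend and separate levels becomes a modelling decision.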


Thank you for your speedy answer.

To clarify, my question is more about edgeR's assumptions being broken in this situation. From my understanding, edgeR assumes that properties such as GC content and gene length are equal for a given gene across samples (my use of "compositional biases" was indeed wrong). In this situation, different species are being compared, so these assumptions are somewhat broken.

However, in the case of the analysis of an interaction (as described in my initial question), my reasoning is that such biases do not influence the outcome: since GC content and gene length do not vary with one of the explanatory variables, the influence of these biases should not change with that explanatory variable. By this logic, the analysis of the interaction is not affected by those biases.

P.S. The design matrix provided is indeed incomplete, as there are three values for the condition. However, I don't think this has any bearing on the answer.

Thank you.

