Question

Nested Interaction in edgeR and composition bias


chapdelainev • 0

@chapdelainev-22259

Last seen 4.4 years ago

Hello,

My question is: when using edgeR's GLM method to test an interaction between two explanatory variables, one a continuous variable (unrelated to genotype) and the other the genotype, is the interaction affected by compositional biases?

My reasoning is as follows: any potential compositional biases (such as differing sequences, slightly different gene lengths, and normalisation) should not vary with the continuous variable.

Is this assumption correct?

Experimental design:

row    Condition    Species
1      0            0
2      0            1
3      0            0
4      0            1
...
10     1.5          1
11     1.5          0
12     1.5          1

edger rna-seq • 670 views

ADD COMMENT • link updated 4.5 years ago by Gordon Smyth 50k • written 4.5 years ago by chapdelainev • 0


Yunshun Chen ▴ 840

@yunshun-chen-5451

Last seen 5 weeks ago

Australia

I am not sure what your question is.

If you are concerned about composition biases between samples, then edgeR's scale normalization (e.g. TMM) will take care of them. That has nothing to do with the interaction between your explanatory variables.

If your question is about how to incorporate a continuous variable into the design matrix, then it depends on the number of time points you have in your data. If your data only has two time points, as shown in your design, then you can simply treat it as a two-level factor and proceed with the standard edgeR DE analysis pipeline.

ADD COMMENT • link 4.5 years ago Yunshun Chen ▴ 840


Thank you for your speedy answer.

As clarification, my question is more about whether edgeR's assumptions are violated in this situation. From my understanding, edgeR assumes that properties such as GC content and gene length are the same for a given gene across samples (my use of "compositional biases" was indeed wrong). Here, different species are being compared, so these assumptions are somewhat violated.

However, for the analysis of an interaction (as described in my initial question), my reasoning is that such biases do not influence the outcome: since GC content and gene length do not vary with one of the explanatory variables, the influence of these biases should not change with that variable. By this logic, the analysis of the interaction is not affected by those biases.

P.S. The design matrix provided is indeed incomplete, as there are three values for the condition. However, I don't think this has any bearing on the answer.
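The reasoning above can be checked with a small noise-free toy calculation. This is only a sketch, not edgeR itself: it assumes a log-linear model, a gene-level bias that is a constant multiplicative factor for one species, and hypothetical values throughout (the condition values, the slopes, and the 1.8-fold bias are all made up for illustration).

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def interaction(xs, log_expr_a, log_expr_b):
    """Species x covariate interaction: difference between the two slopes."""
    return fit_slope(xs, log_expr_b) - fit_slope(xs, log_expr_a)


# Hypothetical condition values (three levels; the middle one is invented).
conditions = [0.0, 0.5, 1.5]

# Hypothetical true log2 expression of one gene in each species.
true_a = [5.0 + 0.4 * c for c in conditions]
true_b = [5.0 + 1.0 * c for c in conditions]

# Hypothetical technical bias: the species B ortholog yields 1.8x more reads
# (e.g. through length or GC content), regardless of the condition.
bias_b = math.log2(1.8)
observed_b = [y + bias_b for y in true_b]

unbiased = interaction(conditions, true_a, true_b)
biased = interaction(conditions, true_a, observed_b)

# The bias shifts the apparent species difference at condition 0 ...
main_effect_shift = observed_b[0] - true_b[0]

# ... but leaves the slope difference (the interaction) unchanged.
print(f"interaction without bias: {unbiased:.6f}")
print(f"interaction with bias:    {biased:.6f}")
print(f"shift in species effect:  {main_effect_shift:.6f}")
```

In this toy setting the bias is absorbed entirely by the species main effect. The cancellation relies on the bias being the same at every value of the condition; a bias that changed with the condition would leak into the interaction.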

Thank you.

ADD REPLY • link 4.5 years ago chapdelainev • 0

score 2 · Accepted Answer · 2019-11-04


Gordon Smyth 50k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

I assume the technical biases you are asking about are differences in GC content or gene length for the same gene between the two species.

Yes, the technical biases should, in principle, cancel out of any interaction term. That would be so whether the interaction is species x factor or species x covariate.
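One way to see why, as a sketch rather than a statement of edgeR's exact parameterization: assume the bias for gene g is a constant multiplicative factor within a species, so it is additive on the log scale. All symbols below are illustrative and do not come from the thread: mu_gi is the expected count for gene g in sample i, N_i the effective library size, s_i a 0/1 species indicator, x_i the covariate, and b_g the log-scale bias.

```latex
\log \mu_{gi}
  = \log N_i
  + \beta_{g0}
  + (\beta_{g1} + b_g)\, s_i
  + \beta_{g2}\, x_i
  + \beta_{g3}\, s_i x_i
```

Under this assumption b_g is confounded with the species main effect \beta_{g1}, so a species comparison at a fixed covariate value cannot separate biology from bias. The interaction coefficient \beta_{g3} contains no b_g term, so a test of \beta_{g3} = 0 is unaffected. The same argument applies when x_i is replaced by indicator columns for a factor.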

ADD COMMENT • link 4.5 years ago Gordon Smyth 50k