Question: How does the function model.matrix (to define experimental design) really work?
18 months ago by
Aurora10
Aurora10 wrote:

Good morning,

I am working with edgeR to perform differential expression analysis on RNA-seq data.

I have a design with two variables: type (obese/normal) and treatment.

type is a vector with two levels (obese/normal).

treatment is a vector with 5 levels (5 different treatments).

But when I run this line of code: design_matrix = model.matrix(~0+type+treatment)

the resulting design_matrix has only 6 columns: typeLean, typeObese, and only 4 columns for the treatments (the treatment that is the first level of the treatment vector is not in my design_matrix!).

Does anyone know why? How could I get all the treatments in my design_matrix?

Thank you,

Have a good day

Answer: How does the function model.matrix (to define experimental design) really work?
18 months ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson7.4k wrote:

It is important to note that a factor with K levels adds only K-1 coefficients to the design matrix, because there are only K-1 independent differences between K groups. With an intercept, your design matrix would therefore have 1 + (5-1) + (2-1) = 6 coefficients; your no-intercept model has the same total, 2 + (5-1) = 6. For more information on how factors are encoded into a design matrix, have a look here. You are most likely using "dummy coding" (treatment contrasts), since that is the default.
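To make the column count concrete, here is a minimal sketch in R. The sample layout (one sample per type/treatment combination) and the treatment names T1 to T5 are assumptions for illustration, not the poster's actual data. It builds the default-coded design and then shows that stacking every indicator column for both factors adds a column without adding any independent information.

```r
# Hypothetical annotation: 2 types x 5 treatments, one sample per combination.
type <- factor(rep(c("Lean", "Obese"), each = 5))
treatment <- factor(rep(paste0("T", 1:5), times = 2))

# Default treatment ("dummy") coding with an intercept.
design <- model.matrix(~ type + treatment)
colnames(design)
ncol(design)       # 1 + (2 - 1) + (5 - 1) = 6 columns
qr(design)$rank    # 6: every column is linearly independent

# Forcing all indicator columns for both factors gives 7 columns ...
full <- cbind(model.matrix(~ 0 + type), model.matrix(~ 0 + treatment))
ncol(full)         # 7
# ... but still only 6 independent ones, so one coefficient is not estimable.
qr(full)$rank      # 6
```

The rank check is the reason the "missing" column cannot simply be added back: the two type columns sum to a column of ones, and so do the five treatment columns, so one of the 7 columns is redundant.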

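You can sidestep the problem of factor coding for one of your factors by using a model with no intercept (i.e. ~0), but the rest of the factors must still be coded as normal. In ~0+type+treatment the factor that escapes the coding is type, because it is the first term in the formula; that is why both typeLean and typeObese appear while treatment loses its first level.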
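As a sketch of the no-intercept behaviour, the example below compares the two term orders. It again uses hypothetical data (treatment names T1 to T5 and one sample per combination are assumptions). Putting treatment first is one way to get a column for every treatment, at the cost of type dropping to a single column.

```r
# Hypothetical annotation: 2 types x 5 treatments, one sample per combination.
type <- factor(rep(c("Lean", "Obese"), each = 5))
treatment <- factor(rep(paste0("T", 1:5), times = 2))

# type listed first: both type columns are kept, treatment loses its first level.
design_type_first <- model.matrix(~ 0 + type + treatment)
colnames(design_type_first)
# "typeLean" "typeObese" "treatmentT2" "treatmentT3" "treatmentT4" "treatmentT5"

# treatment listed first: all five treatment columns are kept, type loses its first level.
design_treatment_first <- model.matrix(~ 0 + treatment + type)
colnames(design_treatment_first)
# "treatmentT1" "treatmentT2" "treatmentT3" "treatmentT4" "treatmentT5" "typeObese"
```

Both designs have 6 columns and describe the same additive model; only the meaning of the individual coefficients changes, so any contrasts must be written against the column names of the design actually used.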