Question on creating the right design matrix in edgeR
Abdul
@c205f17d
United States

Hi,

I am working with a multi-factor RNA-Seq experiment in edgeR to find differentially expressed genes. The samples were processed in different batches (see below). My main goal is to compare Treatment B vs. Treatment A in paired samples. I am also interested in comparing the two groups, Group_1 and Group_2. My question is how to create the right design matrix for these comparisons, for use in estimateDisp and glmQLFit.

    g.Treatments <- factor(Samples$Treatments)
    g.Groups <- factor(Samples$Groups)
    g.Batch <- factor(Samples$Batch)
    y <- DGEList(counts = data_counts, group = g.Treatments, remove.zeros = TRUE)


To compare the two treatments and find differentially expressed genes:

    des_treat.1 <- model.matrix(~Batch+Individuals+Treatments)

(OR)

    des_treat.2 <- model.matrix(~Individuals+Batch+Treatments)


To compare the two groups and find differentially expressed genes:

    des_group.1 <- model.matrix(~Batch+Groups)

(OR)

    des_group.2 <- model.matrix(~Groups)


Or is it possible to cover all comparisons with a single design matrix?

    des_treat.group.1 <- model.matrix(~Batch+Individuals+Treatments+Groups)
    des_treat.group.2 <- model.matrix(~Individuals+Batch+Treatments+Groups)

Samples

#>       Individuals Treatments  Groups Batch
#> ID_1            1          A Group_1     1
#> ID_2            1          B Group_1     1
#> ID_3            2          A Group_2     2
#> ID_4            2          B Group_2     2
#> ID_5            3          A Group_1     1
#> ID_6            3          B Group_1     1
#> ID_7            4          A Group_2     2
#> ID_8            4          B Group_2     2
#> ID_9            5          A Group_1     1
#> ID_10           5          B Group_1     1
#> ID_11           6          A Group_2     2
#> ID_12           6          B Group_2     2


Thank you,

Abdul

R designmatrix edgeR limma model.matrix
@gordon-smyth
WEHI, Melbourne, Australia

See Section 3.5 "Comparisons both between and within subjects" in the edgeR User's Guide.
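As a rough sketch of how that section's pattern might map onto the sample table in the question: individuals play the role of patients, and one treatment column is added per group. The `Samples` data frame below is rebuilt from the posted table, and the column names `Group_1.B` and `Group_2.B` are illustrative choices, not names from the User's Guide. The edgeR calls are left as comments because they need the poster's own `DGEList` object `y`.

```r
# Sample table rebuilt from the question (individuals 1-6, each with A and B).
Samples <- data.frame(
  Individuals = factor(rep(1:6, each = 2)),
  Treatments  = factor(rep(c("A", "B"), times = 6)),
  Groups      = factor(rep(rep(c("Group_1", "Group_2"), each = 2), times = 3)),
  row.names   = paste0("ID_", 1:12)
)

# Individual effects absorb all baseline differences between subjects.
design <- model.matrix(~Individuals, data = Samples)

# One treatment-effect column per group: B relative to A within that group.
Group_1.B <- Samples$Groups == "Group_1" & Samples$Treatments == "B"
Group_2.B <- Samples$Groups == "Group_2" & Samples$Treatments == "B"
design <- cbind(design, Group_1.B, Group_2.B)

# The design should be of full column rank before it is passed to edgeR.
stopifnot(qr(design)$rank == ncol(design))

# With the DGEList `y` from the question (not run here):
# y   <- estimateDisp(y, design)
# fit <- glmQLFit(y, design)
# glmQLFTest(fit, coef = "Group_1.B")                # B vs A within Group_1
# glmQLFTest(fit, coef = "Group_2.B")                # B vs A within Group_2
# glmQLFTest(fit, contrast = c(rep(0, 6), -1, 1))    # does the B vs A effect differ between groups?
# glmQLFTest(fit, contrast = c(rep(0, 6), 0.5, 0.5)) # average B vs A effect over both groups
```

The contrast vectors assume the column order produced above: six individual columns (intercept included) followed by the two treatment columns.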


Gordon Smyth, perfect, this is very helpful. The only issue is the batch factor in my data, as the samples were processed in two batches.

#### Section 3.5 "Comparisons both between and within subjects"

    design <- model.matrix(~Patient)
    design <- cbind(design, Healthy.Hormone, Disease1.Hormone, Disease2.Hormone)

#### Inclusion of batch factor

Assuming there is a batch effect in the Section 3.5 "Comparisons both between and within subjects" example, does the formula below look fine?

    design <- model.matrix(~Batch+Patient)
    design <- cbind(design, Healthy.Hormone, Disease1.Hormone, Disease2.Hormone)

@gordon-smyth

No, you cannot include a batch effect because your batches are completely confounded with the two groups of individuals. The analysis described in the edgeR Users Guide does not make baseline comparisons between the batches and hence is unaffected by any batch effect. You need to follow the edgeR analysis as it is.
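One way to see the confounding, sketched with base R only: rebuild the sample table from the question and compare the number of columns of each candidate design matrix with its rank. The object names `with_batch` and `without_batch` are illustrative.

```r
# Sample table rebuilt from the question; Batch follows the posted pattern 1,1,2,2,...
Samples <- data.frame(
  Individuals = factor(rep(1:6, each = 2)),
  Treatments  = factor(rep(c("A", "B"), times = 6)),
  Batch       = factor(rep(rep(1:2, each = 2), times = 3))
)

with_batch    <- model.matrix(~Batch + Individuals + Treatments, data = Samples)
without_batch <- model.matrix(~Individuals + Treatments, data = Samples)

# A design is estimable only if its rank equals its number of columns.
c(columns = ncol(with_batch),    rank = qr(with_batch)$rank)     # 8 columns, rank 7
c(columns = ncol(without_batch), rank = qr(without_batch)$rank)  # 7 columns, rank 7

# Each individual appears in exactly one batch, so the batch column is
# a sum of individual columns and adds no new information.
table(Samples$Individuals, Samples$Batch)
```

Because the batch column can be written as a combination of the individual columns, a model with both has one coefficient that cannot be estimated.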


Gordon Smyth, noted. This is just pilot-scale data that I was given to work with. We expect to receive more data in the next couple of weeks, with a balanced sample design that is not confounded. In that case, could I add Batch to model.matrix?


No, your experiment is a paired-comparison with A vs B for each individual. Paired comparisons do not need to be corrected for batches. As I already said, "The analysis described in the edgeR Users Guide does not make baseline comparisons between the batches and hence is unaffected by any batch effect."
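The point can be illustrated with a small simulation. This is only a sketch: it uses an ordinary linear model on simulated log-expression values for one hypothetical gene, not edgeR's negative binomial GLM, and the effect sizes are made up. The structural argument is the same, though: a batch shift moves both samples of an individual by the same amount, so it is absorbed by the individual terms and the within-individual B vs A estimate does not change.

```r
set.seed(1)

# Layout from the question: 6 individuals, each measured under A and B,
# with each individual processed entirely within one batch.
Individuals <- factor(rep(1:6, each = 2))
Treatments  <- factor(rep(c("A", "B"), times = 6))
Batch       <- rep(rep(1:2, each = 2), times = 3)

# Simulated log-expression: individual baseline + treatment effect + noise.
baseline <- rnorm(6)
logexpr  <- baseline[as.integer(Individuals)] +
  1.5 * (Treatments == "B") +
  rnorm(12, sd = 0.1)

# Add an arbitrary shift to every sample processed in batch 2.
shifted <- logexpr + 3 * (Batch == 2)

# Paired model with no batch term, fitted to both versions of the data.
fit_raw     <- lm(logexpr ~ Individuals + Treatments)
fit_shifted <- lm(shifted ~ Individuals + Treatments)

# The B vs A estimate agrees up to floating-point error.
all.equal(unname(coef(fit_raw)["TreatmentsB"]),
          unname(coef(fit_shifted)["TreatmentsB"]))
```

Only the individual coefficients differ between the two fits, which is why the paired analysis needs no separate batch term.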


Thank you.