Question

Extracting specific comparisons after fitting a model in edgeR

mohammedtoufiq91 ▴ 10

United States

Hi,

I am analyzing an mRNA-Seq dataset using the edgeR package, and I have a question about extracting coefficient comparisons after fitting the model (example below). In limma, I use a simple design <- model.matrix(~0+Culture_Types) with corfit (as below); however, this is my first time doing this in edgeR.

corfit <- duplicateCorrelation(df, design, block=sample_ann$Pt) 
fit <- lmFit(df,design,block=sample_ann$Pt,correlation=corfit$consensus)

Here is the example of the sample annotations and design:

dput(xx)
structure(list(
  Samples = c("Sample_1", "Sample_2", "Sample_3", "Sample_4", "Sample_5", "Sample_6",
              "Sample_7", "Sample_8", "Sample_9", "Sample_10", "Sample_11", "Sample_12",
              "Sample_13", "Sample_14", "Sample_15", "Sample_16", "Sample_17", "Sample_18"),
  Patient_ID_v1 = c("S13", "S13", "S13", "S18", "S18", "S18", "S21", "S21", "S21",
                    "S47", "S47", "S47", "S61", "S61", "S61", "S70", "S70", "S70"),
  Culture_Types = c("CDxx", "CDvv", "CDzz", "CDxx", "CDvv", "CDzz", "CDxx", "CDvv", "CDzz",
                    "CDxx", "CDvv", "CDzz", "CDxx", "CDvv", "CDzz", "CDxx", "CDvv", "CDzz"),
  Cohorts = c("A", "A", "A", "A", "A", "A", "B", "B", "B",
              "B", "B", "B", "A", "A", "A", "B", "B", "B")),
  class = "data.frame", row.names = c(NA, -18L))
#>      Samples Patient_ID_v1 Culture_Types Cohorts
#> 1   Sample_1           S13          CDxx       A
#> 2   Sample_2           S13          CDvv       A
#> 3   Sample_3           S13          CDzz       A
#> 4   Sample_4           S18          CDxx       A
#> 5   Sample_5           S18          CDvv       A
#> 6   Sample_6           S18          CDzz       A
#> 7   Sample_7           S21          CDxx       B
#> 8   Sample_8           S21          CDvv       B
#> 9   Sample_9           S21          CDzz       B
#> 10 Sample_10           S47          CDxx       B
#> 11 Sample_11           S47          CDvv       B
#> 12 Sample_12           S47          CDzz       B
#> 13 Sample_13           S61          CDxx       A
#> 14 Sample_14           S61          CDvv       A
#> 15 Sample_15           S61          CDzz       A
#> 16 Sample_16           S70          CDxx       B
#> 17 Sample_17           S70          CDvv       B
#> 18 Sample_18           S70          CDzz       B

Patient_ID_v1 <- factor(xx$Patient_ID_v1)
Culture_Types <- factor(xx$Culture_Types)
design <- model.matrix(~0+Patient_ID_v1+Culture_Types)
## dispersion estimated:
y <- estimateDisp(y,design)
fit <- glmQLFit(y, design)

## Which coefficients should be chosen here for `coef = `?

1. To detect genes that are differentially expressed in CDxx vs CDzz:

qlf <- glmQLFTest(fit, coef=????)

2. To detect genes that are differentially expressed in CDxx vs CDvv:

qlf <- glmQLFTest(fit, coef=????)

3. To detect genes that are differentially expressed in CDzz vs CDvv:

qlf <- glmQLFTest(fit, coef=????)

OR,

Maybe a single table for all comparisons of interest, like we do with `contrast.matrix`:

con <- makeContrasts(
  CDxx_vs_CDzz = CDxx-CDzz,
  CDxx_vs_CDvv = CDxx-CDvv,
  CDzz_vs_CDvv = CDzz-CDvv, levels=design)

Thank you,

Mohammed

edgeR limma design RNASeq model.matrix • 1.3k views

ADD COMMENT • link 3.1 years ago mohammedtoufiq91 ▴ 10

score 1 · Answer 1 · 2022-12-05

Gordon Smyth 53k

WEHI, Melbourne, Australia

Specifying coefficients and contrasts is almost exactly the same in edgeR as in limma so, if you know limma, then use the same rules in edgeR.
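As a rough illustration of those rules applied to the design in the question, the sketch below rebuilds the design matrix in base R and inspects its column names, since `coef=` refers to columns of the design. The reduced data frame and the helper variable `coef_index` are assumptions made for this example; the factor values are copied from the sample table in the question.

```r
# Sample annotation reduced to the two factors used in the design
# (values copied from the table in the question).
xx <- data.frame(
  Patient_ID_v1 = rep(c("S13", "S18", "S21", "S47", "S61", "S70"), each = 3),
  Culture_Types = rep(c("CDxx", "CDvv", "CDzz"), times = 6)
)

Patient_ID_v1 <- factor(xx$Patient_ID_v1)
Culture_Types <- factor(xx$Culture_Types)   # default levels are alphabetical

design <- model.matrix(~0 + Patient_ID_v1 + Culture_Types)

levels(Culture_Types)   # the first level is the reference and gets no column
colnames(design)        # the position in this vector is what coef= refers to

# Look up a coefficient's position by name instead of counting columns
# (coef_index is a hypothetical helper name for this sketch).
coef_index <- which(colnames(design) == "Culture_TypesCDxx")
```

With the default alphabetical levels, CDvv is the reference, the six patient columns come first, and columns 7 and 8 are `Culture_TypesCDxx` and `Culture_TypesCDzz`. Each of those two coefficients is a comparison against CDvv; a comparison between CDxx and CDzz needs a contrast between the two columns.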

ADD COMMENT • link 3.1 years ago Gordon Smyth 53k

Gordon Smyth, thank you. I followed the edgeR user guide section "3.4 Additive models and blocking", which has sub-sections on paired samples and blocking.

Assuming the Culture_Types levels are in the order "CDvv", "CDzz", "CDxx" (CDvv as the reference):

1. To detect genes that are differentially expressed in CDxx vs CDzz:

qlf <- glmQLFTest(fit, contrast=c(0,0,0,0,0,0,-1,1))

2. To detect genes that are differentially expressed in CDxx vs CDvv:

qlf <- glmQLFTest(fit, coef=8)

3. To detect genes that are differentially expressed in CDzz vs CDvv:

qlf <- glmQLFTest(fit, coef=7)
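The numeric indices above only mean what is intended if the design columns are in the order CDvv (reference), CDzz, CDxx. A plain `factor()` call orders levels alphabetically (CDvv, CDxx, CDzz), which would swap columns 7 and 8. One way to avoid depending on column position is to set the levels explicitly and write the contrasts by column name with limma's `makeContrasts()`. The following is a minimal sketch under those assumptions, not a definitive recipe; the contrast names are illustrative, and `fit` in the comments stands for the result of `glmQLFit(y, design)` from the question.

```r
library(limma)  # provides makeContrasts(); edgeR depends on limma

# Rebuild the design from the question, with the level order set explicitly
# so that CDvv is the reference level regardless of defaults.
Patient_ID_v1 <- factor(rep(c("S13", "S18", "S21", "S47", "S61", "S70"), each = 3))
Culture_Types <- factor(rep(c("CDxx", "CDvv", "CDzz"), times = 6),
                        levels = c("CDvv", "CDzz", "CDxx"))
design <- model.matrix(~0 + Patient_ID_v1 + Culture_Types)

# Contrasts are written in terms of the design column names.
# CDvv has no column, so "X vs CDvv" is just the coefficient for X.
con <- makeContrasts(
  CDxx_vs_CDzz = Culture_TypesCDxx - Culture_TypesCDzz,
  CDxx_vs_CDvv = Culture_TypesCDxx,
  CDzz_vs_CDvv = Culture_TypesCDzz,
  levels = design
)

# With `fit` from glmQLFit(y, design), each comparison would then be tested as:
# qlf <- glmQLFTest(fit, contrast = con[, "CDxx_vs_CDzz"])
# topTags(qlf)
```

With this level order, the `CDxx_vs_CDzz` column of `con` is the same vector as the hand-written `c(0,0,0,0,0,0,-1,1)`, and the other two columns pick out coefficients 8 and 7 respectively.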

ADD REPLY • link 3.1 years ago mohammedtoufiq91 ▴ 10
