Question

edgeR design matrix

0

R ▴ 40

Germany

I have a tumor vs. normal experiment with 48 tumor samples and 48 normal samples, one matched pair per patient. The patients are of both sexes, male and female.

How do I include sex as a parameter in the analysis?

This is my current setup:

group <- colnames(df.iso.mm2)
group[grep("\\dR", colnames(df.iso.mm2))] <- "Normal"
group[grep("\\dG", colnames(df.iso.mm2))] <- "Tumor"
group <- factor(group)

> group
[1] Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor
[18] Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal
[35] Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor
[52] Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal
[69] Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor
[86] Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal Tumor Normal
Levels: Normal Tumor

library(stringr)  # provides str_split()
patient <- factor(sapply(str_split(colnames(df.iso.mm2), 'R|G'), function(x) x[[1]]))

> patient
[1] 100 100 106 106 122 122 124 124 126 126 134 134 141 141 167 167 185 185 192 192 235 235 239 239 240 240 243 243 246
[30] 246 261 261 267 267 26 26 270 270 279 279 299 299 301 301 305 305 335 335 342 342 350 350 356 356 35 35 361 361
[59] 366 366 367 367 377 377 379 379 388 388 400 400 402 402 46 46 48 48 55 55 57 57 60 60 68 68 70 70 73
[88] 73 77 77 82 82 93 93 94 94
48 Levels: 100 106 122 124 126 134 141 167 185 192 235 239 240 243 246 26 261 267 270 279 299 301 305 335 342 35 ... 94

des <- model.matrix(~patient+group)
fit <- glmFit(y2, des)
lrt <- edgeR::glmLRT(fit)
tab <- topTags(lrt, n=Inf)@.Data[[1]]

Sex <- colnames(df.iso.mm2)
Sex[grep("35|48|57|68|70|77|122|126|134|185|235|243|246|270|267|301|342|356|366|367|377|379|388",colnames(df.iso.mm2))] <- "M"
Sex[grep("26|46|55|60|73|82|93|94|100|106|124|141|167|192|239|240|261|279|299|305|335|350|361|400|402",colnames(df.iso.mm2))] <-"F"
Sex <- factor(Sex)

> Sex
[1] F F F F M M F F F F M M F F F F M M F F M M F F F F M M F F F F F F F F M M F F F F M M F F F F M M F F M M M M F F M
[60] M M M M M M M M M F F F F F F M M F F M M F F M M M M F F M M F F F F F F
Levels: F M
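
A side note on the Sex assignment above: the grep patterns are not anchored, so a short ID such as "26" or "46" also matches inside longer IDs such as "126" or "246". In the printed output, patients 126 and 246 (both in the "M" pattern) come out as F for that reason. A minimal sketch of exact matching on the patient ID follows; the sample names and ID vectors are hypothetical and chosen only to show the effect, not taken from the real data.

```r
# Hypothetical column names in the "<patient><R|G>" style used in the question.
sample_names <- c("126R", "126G", "26R", "26G", "246R", "246G", "46R", "46G")

# Strip the tissue suffix so that only the patient ID remains.
patient_id <- sub("[RG].*$", "", sample_names)

# Hypothetical ID lists; in practice these would hold the real patient IDs.
male_ids   <- c("126", "246")
female_ids <- c("26", "46")

# %in% compares whole strings, so "26" can no longer match "126".
sex <- rep(NA_character_, length(sample_names))
sex[patient_id %in% male_ids]   <- "M"
sex[patient_id %in% female_ids] <- "F"
sex <- factor(sex)

# Fail loudly if any sample was left unassigned.
stopifnot(!anyNA(sex))
```

Whether this changes the results depends on the design: with patient already in the model, the Sex labels only matter once sex itself enters the design.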

edger limma • 1.3k views

ADD COMMENT • link updated 9.2 years ago by svlachavas ▴ 840 • written 9.2 years ago by R ▴ 40

score 2 · Answer 1 · 2015-09-25

2

Ryan C. Thompson ★ 7.9k

Icahn School of Medicine at Mount Sinai…

If your only goal is to test for differential expression between normal and tumor while controlling for effects specific to patient and sex as batch effects, then your existing design already does that. Since patient is nested within sex, controlling for patient already controls for sex as well. The only time you would need to use a different strategy would be if you wanted to test for DE between males and females (in which case you would use a design of ~group + sex and use duplicateCorrelation on patient).
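
The nesting argument can be checked numerically with base R. The sketch below uses a hypothetical toy layout (four patients, two of each sex, one normal and one tumor sample each) and shows that adding sex to a design that already contains patient adds a column but no rank, i.e. the sex effect is already spanned by the patient columns.

```r
# Toy layout (hypothetical): 4 patients, each with a Normal and a Tumor sample.
patient <- factor(rep(c("p1", "p2", "p3", "p4"), each = 2))
group   <- factor(rep(c("Normal", "Tumor"), times = 4))
sex     <- factor(rep(c("F", "F", "M", "M"), each = 2))  # constant within patient

# Paired design, as in the question.
des_paired <- model.matrix(~ patient + group)

# Same design with sex added explicitly.
des_with_sex <- model.matrix(~ patient + sex + group)

# Compare the number of columns with the numerical rank of each matrix.
rank_paired   <- qr(des_paired)$rank
rank_with_sex <- qr(des_with_sex)$rank

cat("paired design:   ", ncol(des_paired), "columns, rank", rank_paired, "\n")
cat("with sex added:  ", ncol(des_with_sex), "columns, rank", rank_with_sex, "\n")
```

The second design has one more column than its rank, which is why sex cannot be estimated separately once patient is in the model.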

Also, you don't have to do @.Data[[1]] to get the data table from the topTags result. Just use as.data.frame.
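
For the male vs. female alternative mentioned above, duplicateCorrelation is a limma function, so the counts would go through voom first. What follows is only a rough sketch on simulated counts, assuming the edgeR and limma packages are installed; the object names, sample layout and simulation settings are all hypothetical.

```r
library(edgeR)
library(limma)

set.seed(1)

# Hypothetical layout: 6 patients, each with a Normal and a Tumor sample.
patient <- factor(rep(paste0("p", 1:6), each = 2))
group   <- factor(rep(c("Normal", "Tumor"), times = 6))
sex     <- factor(rep(c("F", "M"), each = 6))  # patients p1-p3 female, p4-p6 male

# Simulated counts with no true differential expression.
counts <- matrix(rnbinom(500 * 12, mu = 100, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), NULL))

y <- DGEList(counts)
y <- calcNormFactors(y)

# Patient is left out of the design and handled as a blocking factor instead.
design <- model.matrix(~ group + sex)

v      <- voom(y, design)
corfit <- duplicateCorrelation(v, design, block = patient)

fit <- lmFit(v, design, block = patient,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

# Male vs. female comparison, adjusted for tumor/normal status.
res <- topTable(fit, coef = "sexM", number = Inf)
```

Treating patient as a random effect in this way is what makes the sex coefficient estimable; with patient as a fixed effect it would be confounded.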

ADD COMMENT • link 9.2 years ago Ryan C. Thompson ★ 7.9k

0

If you want to go the duplicateCorrelation route, you'd have to shoot this stuff through voom first, no?

ADD REPLY • link 9.2 years ago Steve Lianoglou ★ 13k

1

Yes.

Well, the forum won't let me post such a short answer, so I'll add this; right now, the OP's model has a single fold-change for all patients. If you wanted to incorporate sex in some way, you could set:

design <- model.matrix(~patient + group:sex)
design <- design[, !grepl("groupN", colnames(design))] # drop the groupNormal columns to restore full rank

With the factor levels shown in the question (F before M), the second-last and last coefficients represent the tumour-normal log-fold change in females and males, respectively. In this manner, you can test for sex-specific DE between tumour and normal samples.
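
Since the column order follows the factor levels, it is worth printing the column names before choosing coefficients. A small base R sketch on a hypothetical toy layout (four patients, two of each sex) shows which columns remain after the filtering step and confirms that the result has full rank.

```r
# Toy layout (hypothetical): 4 patients, each with a Normal and a Tumor sample.
patient <- factor(rep(c("p1", "p2", "p3", "p4"), each = 2))
group   <- factor(rep(c("Normal", "Tumor"), times = 4))
sex     <- factor(rep(c("F", "F", "M", "M"), each = 2))

design <- model.matrix(~ patient + group:sex)

# The group:sex term yields one column per group/sex combination; the
# groupNormal columns are redundant with the patient columns, so drop them.
design <- design[, !grepl("groupN", colnames(design))]

# The two remaining interaction columns are the sex-specific tumor effects.
sex_specific <- tail(colnames(design), 2)
print(sex_specific)

# Full rank means every remaining coefficient is estimable.
stopifnot(qr(design)$rank == ncol(design))
```

In edgeR, each of these columns could then be passed by name or index as the coef argument of glmLRT to test tumour vs. normal within one sex.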

ADD REPLY • link 9.2 years ago Aaron Lun ★ 28k

score 0 · Answer 2 · 2015-09-25

Dear R,

I haven't used edgeR, only the limma package. However, I strongly suspect (and perhaps one of the specialists here can give a more authoritative answer) that you don't need to include sex as a factor in your analysis: the analysis you want to perform is paired, so any differences due to sex will probably be absorbed by the pairing factor (your "patient" factor). This may hold only for limma and not for edgeR, though, so perhaps someone can give a more detailed opinion on this matter.

Best,

Efstathios