edgeR design matrix
R ▴ 40
@r-5604
Last seen 3.1 years ago
Germany

I have a tumor vs normal experiment with 96 samples: 48 tumor and 48 normal, one matched pair per patient. The patients are a mix of male and female.

How do I include sex as a parameter in the analysis?

This is my current setup:


group <- colnames(df.iso.mm2)
group[grep("\\dR", colnames(df.iso.mm2))] <- "Normal"
group[grep("\\dG", colnames(df.iso.mm2))] <- "Tumor"
group <- factor(group)

 

> group
 [1] Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor
[18] Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal
[35] Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor
[52] Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal
[69] Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor
[86] Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal Tumor  Normal
Levels: Normal Tumor

library(stringr)  # provides str_split()
patient <- factor(sapply(str_split(colnames(df.iso.mm2), 'R|G'), function(x) x[[1]]))

 

> patient
 [1] 100 100 106 106 122 122 124 124 126 126 134 134 141 141 167 167 185 185 192 192 235 235 239 239 240 240 243 243 246
[30] 246 261 261 267 267 26  26  270 270 279 279 299 299 301 301 305 305 335 335 342 342 350 350 356 356 35  35  361 361
[59] 366 366 367 367 377 377 379 379 388 388 400 400 402 402 46  46  48  48  55  55  57  57  60  60  68  68  70  70  73
[88] 73  77  77  82  82  93  93  94  94
48 Levels: 100 106 122 124 126 134 141 167 185 192 235 239 240 243 246 26 261 267 270 279 299 301 305 335 342 35 ... 94

 

 

des <- model.matrix(~patient+group)
fit <- glmFit(y2, des)
lrt <- edgeR::glmLRT(fit)
tab <- topTags(lrt, n=Inf)@.Data[[1]]

 

 

Sex <- colnames(df.iso.mm2)
Sex[grep("35|48|57|68|70|77|122|126|134|185|235|243|246|270|267|301|342|356|366|367|377|379|388",colnames(df.iso.mm2))] <- "M"
Sex[grep("26|46|55|60|73|82|93|94|100|106|124|141|167|192|239|240|261|279|299|305|335|350|361|400|402",colnames(df.iso.mm2))] <-"F"
Sex <- factor(Sex)

 


> Sex
 [1] F F F F M M F F F F M M F F F F M M F F M M F F F F M M F F F F F F F F M M F F F F M M F F F F M M F F M M M M F F M
[60] M M M M M M M M M F F F F F F M M F F M M F F M M M M F F M M F F F F F F
Levels: F M
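A note on the sex assignment above: in the printed output, patients 126, 246 and 267 come out as F even though they are listed among the male IDs. This looks like a side effect of grep() matching substrings ("26" matches "126" and "267", "46" matches "246"), with the later "F" assignment overwriting the earlier "M". A minimal sketch of exact matching on the patient ID, assuming sample names of the form <patient ID><R or G> as in the post (the short sample_names vector is a hypothetical stand-in for colnames(df.iso.mm2)):

```r
# Hypothetical stand-in for colnames(df.iso.mm2)
sample_names <- c("100G", "100R", "126G", "126R", "26G", "26R", "246G", "246R")

# ID lists copied from the original post
male_ids <- c("35", "48", "57", "68", "70", "77", "122", "126", "134", "185",
              "235", "243", "246", "270", "267", "301", "342", "356", "366",
              "367", "377", "379", "388")
female_ids <- c("26", "46", "55", "60", "73", "82", "93", "94", "100", "106",
                "124", "141", "167", "192", "239", "240", "261", "279", "299",
                "305", "335", "350", "361", "400", "402")

# Strip the R/G suffix, then compare whole IDs instead of substrings
patient_id <- sub("[RG].*$", "", sample_names)
sex_chr <- rep(NA_character_, length(patient_id))
sex_chr[patient_id %in% male_ids] <- "M"
sex_chr[patient_id %in% female_ids] <- "F"
Sex <- factor(sex_chr, levels = c("F", "M"))

# Any sample left as NA has an ID that is in neither list
table(Sex, useNA = "ifany")
```

This does not change the ~patient+group analysis, which never uses Sex, but it would matter for any design that does.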

 

edger limma • 1.1k views
@ryan-c-thompson-5618
Last seen 8 months ago
Scripps Research, La Jolla, CA

If your only goal is to test for differential expression between normal and tumor while controlling for effects specific to patient and sex as batch effects, then your existing design already does that. Since patient is nested within sex, controlling for patient already controls for sex as well. The only time you would need to use a different strategy would be if you wanted to test for DE between males and females (in which case you would use a design of ~group + sex and use duplicateCorrelation on patient).
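The nesting argument can be checked on a toy layout. This is only a sketch with four hypothetical patients (two female, two male), not the real data: adding sex to the paired design adds a column but no rank, because the sex column is a sum of patient columns.

```r
# Toy layout: 4 patients, each with one Normal and one Tumor sample
patient <- factor(rep(c("p1", "p2", "p3", "p4"), each = 2))
group <- factor(rep(c("Normal", "Tumor"), times = 4))
sex <- factor(rep(c("F", "F", "M", "M"), each = 2))

des_paired <- model.matrix(~ patient + group)
des_with_sex <- model.matrix(~ patient + sex + group)

# Compare the number of columns with the numerical rank of each design
rank_paired <- qr(des_paired)$rank
rank_with_sex <- qr(des_with_sex)$rank
c(ncol(des_paired), rank_paired)      # columns and rank agree
c(ncol(des_with_sex), rank_with_sex)  # one more column, same rank

# The sexM column equals the sum of the male patients' indicator columns
all(des_with_sex[, "sexM"] ==
      des_with_sex[, "patientp3"] + des_with_sex[, "patientp4"])
```

So in the paired design a sex coefficient is not estimable separately from the patient effects, which is why sex is already accounted for.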

Also, you don't have to do @.Data[[1]] to get the data table from the topTags result. Just use as.data.frame.
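As a rough illustration of the as.data.frame suggestion, here is a self-contained sketch on simulated counts. The data, sizes and object names are made up for the example; only the last line is the point.

```r
library(edgeR)

set.seed(1)
# Simulated paired layout: 4 hypothetical patients, Tumor and Normal each
patient <- factor(rep(1:4, each = 2))
group <- factor(rep(c("Tumor", "Normal"), times = 4), levels = c("Normal", "Tumor"))
counts <- matrix(rnbinom(200 * 8, mu = 50, size = 10), nrow = 200)
rownames(counts) <- paste0("gene", seq_len(200))

y <- calcNormFactors(DGEList(counts))
des <- model.matrix(~ patient + group)
y <- estimateDisp(y, des)

fit <- glmFit(y, des)
lrt <- glmLRT(fit, coef = "groupTumor")  # naming the coefficient explicitly

# Plain data frame of results, no slot access needed
tab <- as.data.frame(topTags(lrt, n = Inf))
head(tab)
```

Naming the coefficient in glmLRT is optional here, since the default is the last column of the design, but it guards against testing the wrong term if the design changes.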


If you want to go the duplicateCorrelation route, you'd have to shoot this stuff through voom first, no?
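For reference, the voom plus duplicateCorrelation route could look roughly like the sketch below. It runs on simulated counts with hypothetical sizes and names, and it repeats voom and duplicateCorrelation once, which is a common pattern rather than something stated in this thread.

```r
library(edgeR)
library(limma)

set.seed(2)
# Simulated layout: 6 hypothetical patients (3 F, 3 M), Tumor and Normal each
n_patients <- 6
patient <- factor(rep(seq_len(n_patients), each = 2))
group <- factor(rep(c("Tumor", "Normal"), times = n_patients),
                levels = c("Normal", "Tumor"))
sex <- factor(rep(c("F", "M"), each = n_patients), levels = c("F", "M"))
counts <- matrix(rnbinom(300 * 12, mu = 100, size = 5), nrow = 300)
rownames(counts) <- paste0("gene", seq_len(300))

y <- calcNormFactors(DGEList(counts))
design <- model.matrix(~ group + sex)

# First pass: voom weights, then the within-patient correlation
v <- voom(y, design)
corfit <- duplicateCorrelation(v, design, block = patient)

# Second pass: redo both with the estimated correlation taken into account
v <- voom(y, design, block = patient, correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(v, design, block = patient)

fit <- lmFit(v, design, block = patient,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

# Male vs female, with patient treated as a random effect
sex_tab <- topTable(fit, coef = "sexM", number = Inf)
```

Here patient is handled as a blocking factor instead of a fixed effect, which is what leaves the sex coefficient estimable.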


Yes.

Well, the forum won't let me post such a short answer, so I'll add this; right now, the OP's model has a single fold-change for all patients. If you wanted to incorporate sex in some way, you could set:

design <- model.matrix(~patient + group:sex)
design <- design[, !grepl("groupN", colnames(design))] # drop the Normal-by-sex columns so the design has full column rank

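The second-last and last coefficients then represent the tumour-normal log-fold change within each sex. Their order follows the levels of the sex factor: with levels F and M as in the original post, the second-last coefficient (groupTumor:sexF) is the female effect and the last (groupTumor:sexM) is the male effect, so check colnames(design) before testing. In this manner, you can define sex-specific DE between tumour and normal samples.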
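A small check of this design on a toy layout (four hypothetical patients, two per sex) shows which columns survive and that the result has full column rank:

```r
# Toy layout: 4 patients, each with one Normal and one Tumor sample
patient <- factor(rep(c("p1", "p2", "p3", "p4"), each = 2))
group <- factor(rep(c("Normal", "Tumor"), times = 4))
sex <- factor(rep(c("F", "F", "M", "M"), each = 2))

design <- model.matrix(~ patient + group:sex)
# Drop the Normal-by-sex columns, which are redundant with the patient effects
design <- design[, !grepl("groupN", colnames(design))]

# The last two columns are the sex-specific Tumor effects
last_two <- tail(colnames(design), 2)
last_two

# Full column rank: rank equals the number of columns
is_full_rank <- qr(design)$rank == ncol(design)

# Contrast vector for "male Tumor effect minus female Tumor effect"
sex_diff <- c(rep(0, ncol(design) - 2), -1, 1)
```

With a fitted model, each effect could be tested by passing the column name as coef to glmLRT, and the difference between sexes by passing sex_diff as contrast.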

svlachavas ▴ 830
@svlachavas-7225
Last seen 6 months ago
Germany/Heidelberg/German Cancer Resear…

 

Dear R,  

I haven't used edgeR, only the limma package. However, I strongly suspect (and maybe some of the specialists in the group will provide a more authoritative answer) that you don't have to include sex as a factor in your analysis: the analysis you want to perform is paired, so any differences due to sex will probably be absorbed by the pairing factor (your factor "patient"). But this may be the case for limma and not for edgeR, so perhaps someone can give a more detailed opinion on this matter.

Best,

Efstathios

 

 

 
