Help with constructing design matrix for edgeR/limma

Hi all,

I am uncertain about how to construct a design matrix for the study below.

[Image: study design]

Our comparison would be

contr.matrix <- makeContrasts(Condition = Infertile - Healthy, levels = colnames(design))

We are interested in the average differences driven by human fertility, ideally taking the infertile subpopulations into account (4 levels: RIF, RM, RFL and RIF_RM). PCA shows that Lib_batch accounts for the highest proportion of variance in the data (followed by Pathology), so it will also be accounted for.

However, none of the healthy donors has an associated pathology, so

design <- model.matrix(~0 + Condition + Lib_prep + Pathology, data = dgeList$samples)

would give columns that are linearly dependent, i.e. a design matrix that is not of full rank. We saw no significant differential expression when Pathology was not accounted for, and we understand that the sample size of each subpopulation is too small for the subpopulations to be assessed independently. Could you offer some advice on constructing an appropriate design matrix for this study?
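
The rank deficiency described above can be reproduced on a toy sample sheet. This is only a minimal sketch in base R: the group sizes are invented for illustration, Lib_prep is left out to isolate the Condition/Pathology dependence, and healthy donors are assumed to be coded with their own Pathology level ("Healthy").

```r
# Hypothetical sample sheet: 4 healthy donors, 2 donors per infertile subgroup.
samples <- data.frame(
  Condition = factor(rep(c("Healthy", "Infertile"), times = c(4, 8))),
  Pathology = factor(
    c(rep("Healthy", 4), rep(c("RIF", "RM", "RFL", "RIF_RM"), each = 2)),
    levels = c("Healthy", "RIF", "RM", "RFL", "RIF_RM")
  )
)

# Condition and Pathology together: ConditionInfertile equals the sum of the
# four infertile Pathology columns, so one column is redundant.
design_full <- model.matrix(~0 + Condition + Pathology, data = samples)
rank_full <- qr(design_full)$rank

# Pathology alone carries the same group information and is of full rank.
design_path <- model.matrix(~Pathology, data = samples)
rank_path <- qr(design_path)$rank

cat("Condition + Pathology:", ncol(design_full), "columns, rank", rank_full, "\n")
cat("Pathology only:       ", ncol(design_path), "columns, rank", rank_path, "\n")
```

A rank smaller than the number of columns means at least one coefficient cannot be estimated, which is the symptom reported in the question.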

Thanks very much!!

Cheers, A

limma design.matrix model.matrix edgeR
WEHI, Melbourne, Australia

If the infertile subpopulations have different expression profiles then you need

design <- model.matrix(~Pathology)

and should leave Condition out of the model, since it is redundant with Pathology. You can then form a contrast between the average of the infertile groups and the healthy group.
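
One way such a contrast might be written with limma's makeContrasts() is sketched below. For readability it assumes the group-means parameterisation ~0 + Pathology rather than ~Pathology; the two fit the same model. The sample layout and the contrast name InfertileVsHealthy are hypothetical.

```r
library(limma)

# Hypothetical layout: 4 healthy donors, 2 donors per infertile subgroup.
pathology <- factor(
  c(rep("Healthy", 4), rep(c("RIF", "RM", "RFL", "RIF_RM"), each = 2)),
  levels = c("Healthy", "RIF", "RM", "RFL", "RIF_RM")
)

# One column per group, so each coefficient is a group mean.
design <- model.matrix(~0 + pathology)
colnames(design) <- levels(pathology)

# Average of the four infertile group means minus the healthy group mean.
contr.matrix <- makeContrasts(
  InfertileVsHealthy = (RIF + RM + RFL + RIF_RM) / 4 - Healthy,
  levels = design
)
print(contr.matrix)
```

With ~Pathology and Healthy as the reference level, the non-intercept coefficients are already differences from Healthy, so the equivalent contrast is simply the average of those four coefficients. Either way, the contrast would then be passed to contrasts.fit() after lmFit() in limma, or to the contrast argument of glmQLFTest() in edgeR.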

I would also compute array quality weights in limma, to adjust for poorer quality samples.
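
As a rough illustration of the quality-weights suggestion, the sketch below runs limma's arrayWeights() on a simulated log-expression matrix and passes the result to lmFit(). The data, the sample layout and the deliberately noisy first sample are all made up. For count data, limma also provides voomWithQualityWeights(), which combines voom with sample weights; it is not shown here.

```r
library(limma)

set.seed(1)

# Hypothetical layout: 4 healthy donors, 2 donors per infertile subgroup.
pathology <- factor(
  c(rep("Healthy", 4), rep(c("RIF", "RM", "RFL", "RIF_RM"), each = 2)),
  levels = c("Healthy", "RIF", "RM", "RFL", "RIF_RM")
)
design <- model.matrix(~0 + pathology)
colnames(design) <- levels(pathology)

# Simulated log-expression values: 200 genes x 12 samples (not real data).
y <- matrix(rnorm(200 * 12), nrow = 200, ncol = 12)

# Inflate the noise of the first sample so there is something to down-weight.
y[, 1] <- y[, 1] * 3

# Estimate one relative quality weight per sample, then use the weights in the fit.
aw <- arrayWeights(y, design)
fit <- lmFit(y, design, weights = aw)

print(round(aw, 2))
```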

However, if you have a substantial batch effect associated with Lib_batch, then you are in trouble. As a factor, Lib_batch is highly confounded with Pathology so the two effects can only be separated to a small extent. This problem may be unsolvable.
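
A quick way to gauge this kind of confounding is to cross-tabulate the two factors and compare the rank of the combined design with its number of columns. The sketch below uses base R and an invented layout, since the real sample sheet is not shown in this thread; the batch labels B1 to B3 are hypothetical.

```r
# Hypothetical layout in which batch B3 contains only RFL and RIF_RM samples.
samples <- data.frame(
  Pathology = factor(
    c(rep("Healthy", 4), rep(c("RIF", "RM", "RFL", "RIF_RM"), each = 2)),
    levels = c("Healthy", "RIF", "RM", "RFL", "RIF_RM")
  ),
  Lib_batch = factor(c("B1", "B1", "B1", "B2",   # Healthy
                       "B1", "B2",               # RIF
                       "B2", "B2",               # RM
                       "B3", "B3",               # RFL
                       "B3", "B3"))              # RIF_RM
)

# Many empty cells mean the two factors are hard to separate.
xtab <- table(samples$Pathology, samples$Lib_batch)
print(xtab)

# In this toy layout the B3 column equals the sum of the RFL and RIF_RM
# columns, so the combined design loses one rank.
design <- model.matrix(~Pathology + Lib_batch, data = samples)
rank_design <- qr(design)$rank
cat(ncol(design), "columns, rank", rank_design, "\n")
```

In this toy layout the B2 effect can still be estimated, because B2 is mixed with B1 inside the Healthy and RIF groups, whereas the B3 effect cannot be told apart from the RFL and RIF_RM effects. Real data may be more or less confounded than this.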

In the end, your question here is more of a research question than a software question, so you need to consult with your principal investigator and/or with a senior bioinformatician at your institution.


Thanks Gordon, that actually helps! I included Pathology in the design matrix and formed a contrast to find the difference between the average of 4 pathology groups and healthy and we saw some interesting findings!

