Question

Limma fixed and random effects

jbatsx • 0

I have read around and cannot find an example similar to this particular case, and I'm still not clear, so I apologise if this is a trivial question.

I have data from an experiment with two disease groups (C1, C2) and a control group. Each disease group consists of treated and untreated samples (e.g. C1_treat, C1_ntreat). The treated and untreated samples are paired: the treat sample comes from treated tissue and the ntreat sample comes from untreated tissue from the same patient. There are also two control samples from each control patient, and I should account for this too.

An example of the sample data looks like this:

Group	Gender	Hospital	Patient
C2_treat	F	1	5
C2_ntreat	F	1	5
C2_treat	M	2	6
C2_ntreat	M	2	6
CTRL	M	1	1
CTRL	M	1	1
CTRL	F	2	2
CTRL	F	2	2
C1_treat	M	1	3
C1_ntreat	M	1	3
C1_treat	M	2	4
C1_ntreat	M	2	4

I want to perform the following contrasts: C1_treat - CTRL, C1_ntreat - CTRL, C1_treat - C1_ntreat, and the same for the C2 group.

The samples come from 3 hospitals, and there appears to be a gender effect (Xist, amongst others, is in the top DEGs), so I want to control for these factors as well. Several samples do not have a paired sample, so I cannot model Patient as a fixed effect because the design matrix is not of full rank. I am currently modelling Gender and Hospital as fixed effects and Patient as a random effect using this model:

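library(limma)

# Fixed effects: f, Gender and Hospital; Patient enters as a random effect below
design <- model.matrix(~0 + f + Gender + Hospital)

# Estimate the consensus intra-patient correlation (same expression object as in lmFit)
corfit_main <- duplicateCorrelation(affyDat, design, block = Patient)

fit <- lmFit(affyDat, design, block = Patient, correlation = corfit_main$consensus.correlation)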
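
For reference, the contrasts listed earlier could be set up roughly as follows. This is only a sketch: the `targets` table is copied from the example rows in this post, the `sub()` call that strips the `f` prefix from the column names and the contrast names are illustrative choices, and `fit` in the final comment is assumed to be the object returned by `lmFit` above.

```r
library(limma)

# Sample table copied from the example rows in the question
targets <- data.frame(
  Group    = c("C2_treat", "C2_ntreat", "C2_treat", "C2_ntreat",
               "CTRL", "CTRL", "CTRL", "CTRL",
               "C1_treat", "C1_ntreat", "C1_treat", "C1_ntreat"),
  Gender   = c("F", "F", "M", "M", "M", "M", "F", "F", "M", "M", "M", "M"),
  Hospital = factor(c(1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2)),
  Patient  = factor(c(5, 5, 6, 6, 1, 1, 2, 2, 3, 3, 4, 4))
)

# Assumption: f is the Group factor
f <- factor(targets$Group)
design <- model.matrix(~0 + f + Gender + Hospital, data = targets)

# Drop the leading "f" so the group columns are named C1_treat, CTRL, ...
colnames(design) <- sub("^f", "", colnames(design))

# One column per comparison of interest
contrast_matrix <- makeContrasts(
  C1_treat_vs_CTRL   = C1_treat - CTRL,
  C1_ntreat_vs_CTRL  = C1_ntreat - CTRL,
  C1_treat_vs_ntreat = C1_treat - C1_ntreat,
  C2_treat_vs_CTRL   = C2_treat - CTRL,
  C2_ntreat_vs_CTRL  = C2_ntreat - CTRL,
  C2_treat_vs_ntreat = C2_treat - C2_ntreat,
  levels = design
)

# With the fitted model from lmFit (not run here, it needs the expression data):
# fit2 <- eBayes(contrasts.fit(fit, contrast_matrix))
# topTable(fit2, coef = "C1_treat_vs_ntreat")
```

Because Gender and Hospital are in the design, each of these comparisons is adjusted for both factors.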

The results are as expected; I am just unsure what duplicateCorrelation is doing here. Is the assumption that the intra-patient correlation is the same for all pairs? Is this assumption valid given that some pairs consist of two control samples, while the other pairs consist of a treated and an untreated sample from the same patient?

Thanks.

updated 7.3 years ago by Gordon Smyth • written 7.4 years ago by jbatsx

score 2 · Answer 1 · 2016-12-19

It depends on how the paired control samples were collected from patients 1 and 2. If, for example, each pair of control samples was just a pair of technical replicates of the same sample run on different arrays, then they would be a lot more correlated than the untreated/treated pairs (which, presumably, were collected separately over different days, given typical timespans of treatments for humans). In such cases, it would probably not be appropriate to model them as separate samples in duplicateCorrelation. Instead, I would just average them and treat the averaged values as one sample. However, if the collection protocol for the paired controls is the same as that for the paired treated/untreated samples, then what you're currently doing is probably okay.

I don't think you have to worry about the fact that some pairs consist of treated/untreated samples. Any treatment effect should have been accounted for by the treatment term (presumably f) in your design matrix. This means that it shouldn't contribute to the correlations between paired samples. Thus, the true correlations should be comparable between all patients (subject to the point above). Of course, it's possible that the additive design is wrong, which will manifest as increased variability and reduced correlations for treated/untreated pairs, but that's something you'll just have to accept.

score 1 · Answer 2 · 2016-12-19

The reason that samples from the same patient are correlated is that each patient has their own baseline expression values. This is true whether they get the treatment or not.

You are perhaps misunderstanding what the intra-patient correlation is. It is the correlation that remains after all the treatment effects have been removed. The fact that some of the samples are treated and some are not does not bias the correlation calculation, because the correlation is corrected for any treatment effects. For example, the correlation will not be reduced if one sample is treated while the other from the same patient is not; the underlying baseline correlation will remain unaffected.

So, yes, the assumption is that the intra-patient correlation is the same for each pair, and this assumption is usually reasonable. A similar assumption is made by pretty much any random effects analysis you will read in the statistics literature.
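
The point about the correlation being corrected for treatment effects can be illustrated with a small base R simulation. This is only a sketch under simplifying assumptions: one gene, a random patient baseline plus independent noise, and made-up values for the standard deviations and the treatment effect. It is not what duplicateCorrelation does internally. Under this model the intra-patient correlation is sigma_b^2 / (sigma_b^2 + sigma_e^2), where sigma_b is the between-patient SD and sigma_e is the within-patient SD.

```r
# Hypothetical helper: simulate treated/untreated pairs for one gene and
# return the correlation between paired residuals
pair_correlation <- function(effect, n_patients = 2000,
                             sd_patient = 1, sd_error = 0.5) {
  baseline  <- rnorm(n_patients, sd = sd_patient)        # patient-specific baseline
  untreated <- baseline + rnorm(n_patients, sd = sd_error)
  treated   <- baseline + effect + rnorm(n_patients, sd = sd_error)
  # Remove the fitted group means (the treatment effect), then
  # correlate what is left within each patient
  cor(untreated - mean(untreated), treated - mean(treated))
}

set.seed(1)
cor_no_effect    <- pair_correlation(effect = 0)
cor_large_effect <- pair_correlation(effect = 3)

# Value implied by the model: sigma_b^2 / (sigma_b^2 + sigma_e^2) = 0.8
expected <- 1^2 / (1^2 + 0.5^2)

print(c(no_effect = cor_no_effect,
        large_effect = cor_large_effect,
        expected = expected))
```

Both simulated correlations should land close to the model value whether or not a treatment effect is present, because only the shared patient baseline contributes to the residual correlation.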