Question

Problem with including batch effects in linear model in LIMMA- samples are removed

0

Entering edit mode

sandra.murphy ▴ 20

@sandramurphy-22841

Last seen 5.2 years ago

Hello all,

I have a question regarding batch effect correction with LIMMA. I have some proteomics data, and when I do my PCA plot I see that samples cluster by patient rather than my treatment (I have three samples per patient, each sample with a different treatment). For this reason I would like to include patient as a batch effect in my linear model (rather than use removebatcheffects because I then want to perform differential analysis with LIMMA).

My plan was to use the following:

design = model.matrix(~0+Patient+Treatment, data=metadata)
fit1 = lmFit(log2all.df.filter,design = design) 
cont <- makeContrasts(TreatmentCombo-TreatmentControl, TreatmentCombo-TreatmentLipid, TreatmentLipid-TreatmentControl, levels = design)
fit2 = contrasts.fit(fit1,contrasts = cont)
fit3 <- eBayes(fit2)

This does work, however when I view my design instead of my five patients, there are only four listed in the design table.

Similarly if I change the order of the design to:

design = model.matrix(~0+Treatment+Patient, data=metadata)

and view the design, there are only two treatments included in the design instead of the three treatments that I have (so I can't proceed with my DE analysis).

Is anyone aware of a solution to this? Any help is much appreciated!

Many thanks, Sandra

limma batch effects linear model • 1.3k views

ADD COMMENT • link updated 5.7 years ago by James W. MacDonald 68k • written 5.7 years ago by sandra.murphy ▴ 20

0

Entering edit mode

For the benefit of future readers, let me note that your description of the two design matrices is actually reversed. The first design matrix will produce linear model coefficients for all the Patients while the second will produce coefficients for all the Treatments. When you use ~0+ to remove the intercept, the intercept can only be removed once, and this is always applied to the first factor in the model (Patient in the first case and Treatment in the second).

A linear model only has one intercept to remove, no matter how many factors there are in the model, so it is only possible to expand out one of the factors when the intercept is removed.

ADD REPLY • link 5.7 years ago Gordon Smyth 53k

score 0 · Answer 1 · 2020-02-05

In R, the default contrasts method is to use a 'treatment contrasts', which means that one of your subjects is considered the 'baseline' subject, and all the other patient coefficients estimate the difference between a given patient and the baseline patient. In your case this doesn't really matter, as the patient coefficients are just nuisance variables (variables that control for something that you know affects the outcome, but isn't of interest otherwise). As an example of what I mean, here is a simple model matrix:

> stuff <- gl(3,2)
> stuff
[1] 1 1 2 2 3 3
Levels: 1 2 3
> model.matrix(~stuff)
  (Intercept) stuff2 stuff3
1           1      0      0
2           1      0      0
3           1      1      0
4           1      1      0
5           1      0      1
6           1      0      1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$stuff
[1] "contr.treatment"

Where the intercept estimates the mean of the first group, and stuff2 and stuff3 are the difference between the intercept and the two other groups. You can use other contrasts, but the treatment contrast is pretty easy to interpret, and is the default, so that's what (I would imagine) most people do.