Question

limma - contrast matrix hypothesis testing

jarod ▴ 30

@abf04839

United States

Hi All, I am working with limma and dream from the variancePartition package, and have a question on complex contrast matrix designs.

I have a group of healthy controls and several disease groups. I want to compare each disease group to the healthy controls, and I also want to combine disease groups in various ways and compare those combinations to the healthy controls.

An example:

Let's say I have groups A, B, and C and want to compare

A vs B+C

Then I can do A - (B+C)/2

But let's say B and C are two disease groups with very different sample sizes. In my case, one such pairing is group B with 150 samples and group C with only 30 samples.

If I do A - (B+C)/2

am I giving extra weight to the group C expression? I read this contrast as comparing the mean of A with the average of B and C, which seems fine. However, the contrast matrix itself shows 1*A - 0.5*B - 0.5*C, which suggests to me that it weights the group means of B and C equally, even though group C has far fewer samples. Wouldn't each sample in C then carry 150/30 = 5 times the weight of a sample in B?

Or is the contrast done by combining all samples from the two groups B and C, and then computing the average? In which case it should be ok? Thanks for any help understanding this!
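The two readings described above can be checked numerically. The following is a minimal base R sketch for a single gene; the expression values are made up for illustration, and only the group sizes (150 and 30) come from the question.

```r
# Made-up expression values for one gene; only the group sizes
# (150 and 30) are taken from the question.
b   <- rep(c(9, 11), times = 75)    # group B: 150 samples, mean 10
c_g <- rep(c(11, 13), times = 15)   # group C: 30 samples, mean 12

# Reading 1: average of the two group means,
# which is what the coefficients -0.5*B - 0.5*C use
avg_of_means <- (mean(b) + mean(c_g)) / 2

# Reading 2: mean of all 180 samples combined
pooled_mean <- mean(c(b, c_g))

# The pooled mean equals the group means weighted by sample size
weighted_mean <- (150 * mean(b) + 30 * mean(c_g)) / 180

print(c(avg_of_means = avg_of_means,
        pooled_mean = pooled_mean,
        weighted_mean = weighted_mean))
```

With these values the average of group means is 11, while the pooled mean is about 10.33 and coincides with the sample-size-weighted mean, so the two readings are different quantities whenever the group sizes and group means both differ.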

limma variancePartition • 2.0k views

ADD COMMENT • link updated 4.6 years ago by Gordon Smyth 53k • written 4.6 years ago by jarod ▴ 30

score 0 · Answer 1 · 2021-06-25

Gordon Smyth 53k

@gordon-smyth

WEHI, Melbourne, Australia

The contrast does what it appears to do. It weights the groups as specified, so that the contrast has the same meaning regardless of sample numbers in each group.
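As a rough illustration of this point, the sketch below builds the contrast with limma's `makeContrasts`. The group labels follow the question; the size of group A (50) and the contrast name `AvsBC` are assumptions made only for the example.

```r
library(limma)

# Hypothetical sample layout: B and C sizes are from the question,
# the size of A (50) is an assumption for illustration.
group <- factor(rep(c("A", "B", "C"), times = c(50, 150, 30)))

# Group-means parameterisation: one coefficient per group
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# A versus the unweighted average of the B and C group means
contr <- makeContrasts(AvsBC = A - (B + C) / 2, levels = design)
print(contr)
```

The contrast coefficients are 1, -0.5 and -0.5 whatever values are passed to `times`, which is the sense in which the contrast keeps the same meaning regardless of sample numbers.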

If you simply want to treat B & C as one group, then you don't need a contrast. In that case you would recode the disease group factor and reform the design matrix to define two groups instead of three.
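A minimal sketch of such a recoding in base R, assuming a factor called `group` with levels A, B and C; the variable names and the size of group A are hypothetical.

```r
# Hypothetical three-level factor; B and C sizes follow the question,
# the size of A (50) is an assumption.
group <- factor(rep(c("A", "B", "C"), times = c(50, 150, 30)))

# Recode B and C into a single level "BC"
pooled <- factor(ifelse(group == "A", "A", "BC"), levels = c("A", "BC"))
print(table(pooled))

# Two-group design: the second coefficient is the BC minus A difference
design2 <- model.matrix(~ pooled)
print(colnames(design2))
```

The resulting design matrix could then be passed to the usual fitting function in place of the three-group design, with the second coefficient giving the pooled comparison directly.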

ADD COMMENT • link 4.6 years ago Gordon Smyth 53k

Gordon, thanks for the reply!

Just to clarify, you are saying that the contrast matrix does give equal weight to the means of both groups B and C regardless of sample number in the groups?

If this is the case, could I manually weight the contrasts to achieve the outcome I want? i.e. A - ((4/5)*B + (1/5)*C)?

I like the contrast matrix formulation because it is a bit more flexible for my particular purposes. I have many dozens of groups to compare in different combinations, and it is a bit cumbersome to build and evaluate new group codings for each combination separately.

Thanks!

ADD REPLY • link 4.6 years ago jarod ▴ 30

> you are saying that the contrast matrix does give equal weight to the means of both groups B and C regardless of sample number in the groups?

Yes, naturally. Our purpose is to test questions of science without introducing bias due to artifacts of sample numbers.

> could I manually weight the contrasts to achieve the outcome I want? i.e. A - ((4/5)*B + (1/5)*C)?

Yes, you can test any contrast that makes scientific sense to you. The contrast you mention is the closest to simply pooling the B and C groups, but it is not exactly the same because the variance calculation is different. Roughly speaking, the contrast will find genes that respond in either B or C or both while pooling will find genes that respond to about the same degree in both groups.
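The difference between a sample-size-weighted contrast and pooling can be sketched with ordinary least squares on one simulated gene. This uses plain `lm` rather than limma, the data and the size of group A are assumptions, and the weights are 150/180 = 5/6 and 30/180 = 1/6 (the exact sample-size proportions for groups of 150 and 30) rather than the 4/5 and 1/5 quoted above.

```r
set.seed(42)

# Simulated data for one gene; group A size and all means are assumptions
group <- factor(rep(c("A", "B", "C"), times = c(50, 150, 30)))
mu <- c(A = 10, B = 11, C = 13)
y <- rnorm(length(group), mean = mu[as.character(group)], sd = 1)

# Three-group fit with a sample-size-weighted contrast
fit3 <- lm(y ~ 0 + group)
w <- c(1, -150 / 180, -30 / 180)
est3 <- sum(w * coef(fit3))

# Two-group fit after pooling B and C
pooled <- factor(ifelse(group == "A", "A", "BC"))
fit2 <- lm(y ~ 0 + pooled)
est2 <- sum(c(1, -1) * coef(fit2))

# Residual standard deviations of the two fits
s3 <- summary(fit3)$sigma
s2 <- summary(fit2)$sigma

print(c(est3 = est3, est2 = est2, sigma3 = s3, sigma2 = s2))
```

The two point estimates agree, but the residual standard deviation does not: in the pooled fit, any difference between B and C is absorbed into the residual variance, so the test statistics differ even though the estimated effect is the same.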

I do not know your experiment, so I can't say whether weighting groups by the number of samples is sensible, but such weighting would only rarely be the right thing to do. Perhaps it would make sense if you want to predict results in a population and your sample numbers are representative of the population numbers.

ADD REPLY • link 4.6 years ago Gordon Smyth 53k