Question

What's the relationship between "design" and "contrasts"?

kjo ▴ 70

The explanations I've heard of "design" and "contrasts" make them sound like "overlapping"/non-independent concepts. I.e. it would be possible to give values to these two parameters that are inconsistent with each other.

Is that the case? Or can the two parameters be specified completely independently from each other?

deseq2 • 1.1k views

updated 7.3 years ago by Ryan C. Thompson ★ 7.9k • written 7.3 years ago by kjo ▴ 70

score 4 · Accepted Answer · 2017-01-12

A design matrix is simply the matrix representation of the system of linear equations that you wish to solve using regression, as described here: https://en.wikipedia.org/wiki/Design_matrix. Each column represents a coefficient, and each row represents a sample. It's possible to represent the same model using two different design matrices, just as it is possible to represent the same space using two different coordinate systems.
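
To make the "two coordinate systems" point concrete, here is a minimal sketch using ordinary least squares in NumPy. The data values and variable names are made up for illustration, and DESeq2 itself fits a negative binomial GLM rather than least squares, so this only shows the linear-algebra idea: two different design matrices for the same two-group experiment give different coefficients but identical fitted values.

```python
import numpy as np

# Hypothetical measurements: two control samples followed by two treated samples.
y = np.array([1.0, 3.0, 5.0, 9.0])
treated = np.array([0.0, 0.0, 1.0, 1.0])

# Design 1: an intercept column plus a treatment indicator.
# Coefficients mean: [control mean, treated minus control].
X_effect = np.column_stack([np.ones(4), treated])

# Design 2: one indicator column per group, no intercept.
# Coefficients mean: [control mean, treated mean].
X_means = np.column_stack([1.0 - treated, treated])

beta_effect, *_ = np.linalg.lstsq(X_effect, y, rcond=None)
beta_means, *_ = np.linalg.lstsq(X_means, y, rcond=None)

# The coefficients differ between the two designs...
print("effect design coefficients:", beta_effect)
print("means design coefficients: ", beta_means)

# ...but both designs describe the same model, so the fitted values agree.
fitted_effect = X_effect @ beta_effect
fitted_means = X_means @ beta_means
print("same fitted values:", np.allclose(fitted_effect, fitted_means))
```

In the first design the treatment effect is a coefficient in its own right, while in the second it only exists as a difference between two coefficients.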

Once you fit a model using a given design matrix, you generally want to conduct statistical tests based on the model. If one of the coefficients represents the quantity you wish to test, then you can simply test that coefficient directly, and you have no need for contrasts. However, if you want to test for differences between two coefficients or some other more complicated relationship between coefficients, then you require a contrast. A contrast is nothing more than a simple arithmetic expression involving only the coefficients and constants. For example, if you have coefficients named A, B, and C, then "(A + B)/2 - C" is an example of a contrast. (Depending on what A, B, and C are, this contrast may or may not have a meaningful interpretation.) So it should be clear that any contrast is only defined in reference to a specific design matrix. Using a contrast with a different design matrix than the one it was written for will either fail (if the dimensions/names don't match) or give a nonsensical result (if the dimensions and names happen to match by coincidence).
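
As a rough illustration of why a contrast belongs to one specific design, the sketch below writes "(A + B)/2 - C" as a vector of weights and applies it to two parameterizations of the same three-group model. The data, group labels, and variable names are hypothetical, and it uses plain least squares to get point estimates only (no standard errors or tests, and not DESeq2's actual fitting procedure).

```python
import numpy as np

# Hypothetical data: two samples in each of three groups A, B, and C.
y = np.array([1.0, 3.0, 4.0, 6.0, 10.0, 12.0])
groups = np.array(["A", "A", "B", "B", "C", "C"])

# Group-means design: one indicator column per group,
# so the coefficients are the means of A, B, and C.
X_means = np.column_stack([(groups == g).astype(float) for g in ["A", "B", "C"]])
beta_means, *_ = np.linalg.lstsq(X_means, y, rcond=None)

# The contrast "(A + B)/2 - C" written as weights on those coefficients.
contrast = np.array([0.5, 0.5, -1.0])
estimate = contrast @ beta_means

# Same model, different design: intercept (mean of A), then B - A, then C - A.
X_ref = np.column_stack([np.ones(6), groups == "B", groups == "C"]).astype(float)
beta_ref, *_ = np.linalg.lstsq(X_ref, y, rcond=None)

# Reusing the old weights runs without error (the dimensions happen to match)
# but answers a different, meaningless question.
mismatched = contrast @ beta_ref

# The same quantity must be re-derived for this design:
# (A + B)/2 - C = (b0 + (b0 + b1))/2 - (b0 + b2) = b1/2 - b2.
contrast_ref = np.array([0.0, 0.5, -1.0])
estimate_ref = contrast_ref @ beta_ref

print("means design estimate:     ", estimate)
print("reference design estimate: ", estimate_ref)
print("old weights on new design: ", mismatched)
```

The two correctly written contrasts agree, while the reused weight vector silently gives a different number, which is the "match by coincidence" case described above.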

For more information, I highly recommend you read a textbook on linear regression. ISLR is a good one: http://www-bcf.usc.edu/~gareth/ISL/