Hi,
I have an experiment that contains 3 treatments and a technical effect consisting of three batches.
treatment <- c("a","a","a","b","b","b","c","c","c") batch <-c("i","ii","iii","i","ii","iii","i","ii","iii")
I realise that looks like a paired sample design, it's just for simplicity.
design <- model.matrix(~0 + as.factor(treatment) + as.factor(batch))
When I run the above code, I get a design matrix with 5 column names, all three of my treatments, then only two of my batches. I'm not sure why this is happening, I assumed that because I've added zero at the beginning of my model design, that stops terms from being absorbed into the intercept, can anyone shed any light on why I'm only seeing 5 terms?
I realise I could add in the batch as an "extra" variable using:
design <- model.matrix(~0 + as.factor(treatment), as.factor(batch))
but ultimately, I'd like to include multiple variables that could account for technical variation.
Cheers
Indeed, if your design were to have all three batch terms and all three treatments, it would not be of full rank:
There are linear dependencies between your columns in this design matrix. More specifically, the row sums of the first 3 columns in
bad.design
gives the same vector (all-ones) as the row sums of the last 3 columns. This means that you'd have infinite solutions for the fitted value vector; you could just increase the estimates of the first 3 coefficients by any value, so long as you decrease the estimates of the last three by the same value. Obviously, this is not ideal, as we want one solution.James, Aaron, thanks for your explanations. That helps clear things up. So if I were to make contrasts and do an lm fit, as per below, that should look at 'a' against 'b', but control for the batch effect?
That is correct.
The gist of your code is correct. However, you should set
levels=design
in the call tomakeContrasts
, and rename the first three columns ofdesign
toc('a','b','c')
.Already got that in there, but thanks for the clarification :)