I have a question about accounting for technical replication in my model. I would greatly appreciate any help!
About my data set:
These are proteomics data from a TMT-labeling experiment, so each row corresponds to a particular protein and the columns contain log-transformed protein intensities for each replicate.
There are 8 biological replicates in total: 4 treated and 4 control.
Four samples (2 treated and 2 control) were processed and measured in one experiment and the other four samples in a second experiment, which creates a batch effect that has to be taken into account.
Furthermore, each sample was measured twice, on two different mass spectrometers using two different acquisition methods. The first method (MS2) is more sensitive and quantifies a greater number of proteins; the second method (MS3) quantifies fewer proteins but with higher accuracy. As a result, in each biological replicate more than 70% of the proteins have intensities from both MS2 and MS3, and fewer than 30% are quantified by only one of the two methods.
I would like to incorporate the technical replication into my model while also accounting for the batch effect, but I am not sure how to set up the design matrix properly.
Say I have the following levels in the data:
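Acquisition | MS2 | MS2 | MS2 | MS2 | MS2 | MS2 | MS2 | MS2 | MS3 | MS3 | MS3 | MS3 | MS3 | MS3 | MS3 | MS3 |
Experiment  | 1   | 1   | 1   | 1   | 2   | 2   | 2   | 2   | 1   | 1   | 1   | 1   | 2   | 2   | 2   | 2   |
Treatment   | Tr  | Tr  | Con | Con | Tr  | Tr  | Con | Con | Tr  | Tr  | Con | Con | Tr  | Tr  | Con | Con |
Replicate   | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   |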
tr <- as.factor(rep(c(2, 2, 1, 1), 4))                           # treatment, 2 = Treated, 1 = Control
ex <- as.factor(c(rep(1, 4), rep(2, 4), rep(1, 4), rep(2, 4)))   # experiment
ms <- as.factor(c(rep(1, 8), rep(2, 8)))                         # acquisition method, 1 = MS2, 2 = MS3
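To double-check that these factors match the layout I described, here is a minimal base-R sketch that cross-tabulates them. The names `sample_layout` and `counts` are only illustrative and are not part of my actual analysis; if the coding is right, every treatment x experiment x acquisition cell should contain the same number of columns.

```r
# Factor definitions repeated so the snippet runs on its own
tr <- as.factor(rep(c(2, 2, 1, 1), 4))                          # 2 = Treated, 1 = Control
ex <- as.factor(c(rep(1, 4), rep(2, 4), rep(1, 4), rep(2, 4)))  # experiment (batch)
ms <- as.factor(c(rep(1, 8), rep(2, 8)))                        # 1 = MS2, 2 = MS3

# One row per data column (16 in total); `sample_layout` is an illustrative name
sample_layout <- data.frame(replicate = rep(1:8, 2), ms, ex, tr)

# Count how many columns fall into each treatment x experiment x acquisition cell
counts <- table(tr = sample_layout$tr, ex = sample_layout$ex, ms = sample_layout$ms)
print(ftable(counts))
```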
Is it correct to define block variable and design matrix as follows?
library(limma)
block <- c(1:8, 1:8)                      # biological replicate each column belongs to
design <- model.matrix(~ ex + ms + tr)
dupcor <- duplicateCorrelation(dat, design = design, block = block)
fit <- lmFit(dat, design, block = block, correlation = dupcor$consensus)
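For reference, this is what the design and blocking above encode, written as a small base-R sketch that needs neither limma nor my data. The helper names `design_rank`, `ms_per_block` and `tr_per_block` are made up for illustration. It only inspects the setup; it does not tell me whether this is the right model, which is what I am asking.

```r
# Factor definitions repeated so the snippet runs on its own
tr <- as.factor(rep(c(2, 2, 1, 1), 4))                          # 2 = Treated, 1 = Control
ex <- as.factor(c(rep(1, 4), rep(2, 4), rep(1, 4), rep(2, 4)))  # experiment (batch)
ms <- as.factor(c(rep(1, 8), rep(2, 8)))                        # 1 = MS2, 2 = MS3
block <- c(1:8, 1:8)                                            # biological replicate

# Additive model: intercept plus one coefficient each for batch, acquisition, treatment
design <- model.matrix(~ ex + ms + tr)
print(colnames(design))

# Rank of the design matrix; equal to ncol(design) means all coefficients are estimable
design_rank <- qr(design)$rank

# Within each block, count distinct acquisition methods and distinct treatments
ms_per_block <- tapply(as.character(ms), block, function(x) length(unique(x)))
tr_per_block <- tapply(as.character(tr), block, function(x) length(unique(x)))
print(rbind(ms_per_block, tr_per_block))
```

With this coding, each block pairs the MS2 and MS3 measurements of one biological replicate, so treatment and experiment are constant within a block and only the acquisition method varies.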
I am not sure whether I have to put both ex and ms into the model, or whether I have to pass the design matrix to the duplicateCorrelation function. Could you please correct me if I am wrong?
Thank you very much for your help!