I'm very new to edgeR, so this may be a silly question... I have a CRISPR screen with the following design:
+ conditioned media | - conditioned media | |
+ drug | d.cm | d.ncm |
- drug | nd.cm | nd.ncm |
There are three biological replicates of each of the four conditions. Two of those replicates were sequenced at a different time than the other replicate, and there is a significant batch effect from the different sequencing runs.
I'm mainly interested in how the conditioned media affects which guides are enriched/depleted after treatment with drug. I've thought of two ways to do this analysis:
1) Split my data into two sets, one for +cm and one for -cm. So my samples would be:
cm$samples$group [1] Conditioned.Media Conditioned.Media Conditioned.Media [4] Drug.10nM.Conditioned.Media Drug.10nM.Conditioned.Media Drug.10nM.Conditioned.Media ncm$samples$group [1] Untreated Untreated Untreated Drug.10nM Drug.10nM Drug.10nM
Then my design matrices are as follows (setting seqDate as intercept to account for batch effect):
desCM <- model.matrix(~seqDate + group, data=cm) desNCM <- model.matrix(~seqDate + group, data=ncm)
desCM (Intercept) seqDatet2 groupDrug.10nM.Conditioned.Media 7 1 1 0 1 1 1 0 11 1 0 0 9 1 1 1 3 1 1 1 13 1 0 1
desNCM (Intercept) seqDatet2 groupDrug.10nM 6 1 1 0 17 1 0 0 10 1 0 0 8 1 1 1 2 1 1 1 12 1 0 1
and I make the following contrasts (which, if I understand correctly, should compare d.cm - nd.cm for the first one and d.ncm - nd.ncm for the second one):
lrtCM = glmLRT(fitCM, coef=3) lrtNCM = glmLRT(fitNCM, coef=3)
2) Keep everything together and only make the +/- drug contrasts in the context of + or - cm. Here my samples are:
y$samples$group [1] Untreated Untreated Untreated [4] Conditioned.Media Conditioned.Media Conditioned.Media [7] Drug.10nM.Conditioned.Media Drug.10nM.Conditioned.Media Drug.10nM.Conditioned.Media [10] Drug.10nM Drug.10nM Drug.10nM
my design matrix is:
design <- model.matrix(~seqDate + group, data = y$samples) (Intercept) seqDatet2 groupDrug.10nM groupDrug.10nM.Conditioned.Media groupConditioned.Media 6 1 1 0 0 0 17 1 0 0 0 0 10 1 0 0 0 0 7 1 1 0 0 1 1 1 1 0 0 1 11 1 0 0 0 1 9 1 1 0 1 0 3 1 1 0 1 0 13 1 0 0 1 0 8 1 1 1 0 0 2 1 1 1 0 0 12 1 0 1 0 0
and I make the following contrasts (the first should compare d.ncm - nd.ncm, and the second should compare d.cm - nd.cm):
lrtNCM <- glmLRT(fit, coef=3) lrtCM <- glmLRT(fit, contrast=c(0,0,0,1,-1))
I feel like these should be equivalent, but based on the topTags, they are not and I don't understand why.
Why are they different? Which is more correct to understand how the conditioned media changes the response to the drug?