Hello,
I would like to perform a paired analysis in DESeq2.
I have read the vignette and previous questions here, but I'm still not sure if my matrix is correct.
SampleID Factor Condition Replicate
sample1 eprint siNeg 1
sample2 eprint siNeg 2
sample3 eprint siFUS 1
sample4 eprint siFUS 2
sample5 input siNeg 1
sample6 input siNeg 2
sample7 input siFUS 1
sample8 input siFUS 2
Given the table above, I would like to have the following pairs:
`sample1` and `sample5`
`sample2` and `sample6`
`sample3` and `sample7`
`sample4` and `sample8`
I'm not sure if my `Replicate` column correctly accounts for the paired samples when coded this way. Ultimately, my final model should be the following:
`~Replicate+Factor+Condition`
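One way to check whether a column identifies the intended pairs is to tabulate it. The sketch below rebuilds the sample table from the post in base R and adds a hypothetical `Pair` column (not in the original table) for the pairs listed above. It only inspects the coding and does not run DESeq2.

```r
# Sample table from the post
coldata <- data.frame(
  SampleID  = paste0("sample", 1:8),
  Factor    = factor(rep(c("eprint", "input"), each = 4)),
  Condition = factor(rep(c("siNeg", "siFUS", "siNeg", "siFUS"), each = 2)),
  Replicate = factor(rep(1:2, times = 4))
)

# Each Replicate level covers four samples, so it does not single out a pair
samples_per_replicate <- table(coldata$Replicate)

# Hypothetical Pair column (an assumption): sample1+sample5, sample2+sample6, ...
coldata$Pair <- factor(rep(1:4, times = 2))
samples_per_pair <- table(coldata$Pair)

print(samples_per_replicate)
print(samples_per_pair)
```

A column that identifies pairs should have exactly two samples per level, which holds for `Pair` but not for `Replicate` as coded in the table.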
Right, if you want to compute four p-values for four different effects, this is not possible (with any linear model) because the model is saturated.
If you assume that the effect is the same across two or more samples, you can fit the model with a design like `~pair + condition`.
Hi Michael, thanks for your answer. Is my model correct if I only define `~Replicate+Condition`? I'm not sure if I coded the `Replicate` column correctly.
I would like to design an experiment similar to this question: DESeq2 Paired samples Before and after treatment
I don't know what you are doing, so I can't answer your question. You haven't specified how many effects you want to estimate here, or what you are trying to do.
Is it one condition effect across all samples, and you want to control for differences due to replicate or factor?
I would like to control for differences in factor and condition while at the same time coding the replicate column, which in this case could also be called a pair column, so that the linear model knows that I have pairs of samples. The `SampleID` pairs are not the same individual unit being measured over time, so I don't know if it makes sense to code it this way.
This has been asked before on the support site, but those questions are probably difficult to find.
You can control for replicate, but not also for factor, because replicate is nested within factor. Controlling for replicate will control for factor. This happens also when users want to control for donor _and_ sex. As donors are nested within sex, controlling for donor is sufficient, and somehow trying to additionally control for sex doesn't make sense.
You can estimate a single condition effect from all eight samples (if you don't want to do this, I still haven't gathered from your comments how many effects you expect to estimate here).
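The nesting argument can be checked numerically. In this sketch (base R only, no DESeq2) the pair ID `c(1, 2, 1, 2, 3, 4, 3, 4)` is an assumption, chosen so that each pair stays within one level of `Factor`. Adding `Factor` to a design that already contains `pair` then adds a column but no rank.

```r
# Illustrative coding (assumption): every pair lies within a single Factor level
Factor <- factor(rep(c("eprint", "input"), each = 4))
pair   <- factor(c(1, 2, 1, 2, 3, 4, 3, 4))

X_pair        <- model.matrix(~ pair)
X_pair_factor <- model.matrix(~ pair + Factor)

rank_pair        <- qr(X_pair)$rank
rank_pair_factor <- qr(X_pair_factor)$rank

# Factor adds a column but no new information once pair is in the design
cat("columns:", ncol(X_pair), "vs", ncol(X_pair_factor), "\n")
cat("rank:   ", rank_pair, "vs", rank_pair_factor, "\n")
```

The column count grows from 4 to 5 while the rank stays at 4, which is the situation a "not full rank" check would flag.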
You should code replicate as `c(1:4,1:4)` and then use `~replicate + condition`.
Hi Michael, thanks for your answer. My goal is to estimate the difference of differences `(sifus - sineg) - (eprint - input)`; perhaps I should just model it as `~Factor*Condition`. I don't quite follow why `replicate` is nested within `factor`. I thought they were crossed, since each level of one factor occurs in combination with each level of the other factor. Thanks.
I don't follow what you're trying to do here. I'd recommend working with a statistician to figure out the design and interpretation of coefficients.
Hi Michael,
sorry, one last question. I've tried the model you recommended, `~replicate + condition`, after coding the replicate as `c(1:4,1:4)`, and I got an error saying the model matrix is not full rank.
Oh, if you are interested in comparing FUS vs NEG in the above sample order, you should define `replicate` as `factor(c(1,2,1,2,3,4,3,4))`, assuming that these numbers indicate the sample IDs across the two conditions FUS and NEG.
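The rank error and the fix can be reproduced without running DESeq2 by checking the design matrix in base R. This is a sketch: `is_full_rank` is a hypothetical helper, not a DESeq2 function, and the sample order is assumed to be the one in the table at the top of the thread.

```r
condition <- factor(rep(c("siNeg", "siFUS", "siNeg", "siFUS"), each = 2),
                    levels = c("siNeg", "siFUS"))

# Hypothetical helper: TRUE when the design matrix has full column rank
is_full_rank <- function(design, data) {
  X <- model.matrix(design, data)
  qr(X)$rank == ncol(X)
}

# First coding: pairs sample1+sample5, ...; condition is constant within each pair
first  <- data.frame(condition, replicate = factor(c(1:4, 1:4)))
# Later coding: pairs sample1+sample3, ...; each pair has one siNeg and one siFUS
second <- data.frame(condition, replicate = factor(c(1, 2, 1, 2, 3, 4, 3, 4)))

print(is_full_rank(~ replicate + condition, first))   # FALSE
print(is_full_rank(~ replicate + condition, second))  # TRUE
```

In the first coding, the condition column is a sum of replicate columns, so no condition effect can be estimated. In the second, condition varies within every pair and the design has full rank.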