Question

DESeq2 paired samples

Alexandre ▴ 10

@095e334e

Last seen 2.5 years ago

Hong Kong

Hello,

I would like to perform a paired analysis in DESeq2. I have read the vignette and previous questions here, but I'm still not sure whether my sample table is correct.

SampleID  Factor  Condition  Replicate
sample1   eprint  siNeg      1
sample2   eprint  siNeg      2
sample3   eprint  siFUS      1
sample4   eprint  siFUS      2
sample5   input   siNeg      1
sample6   input   siNeg      2
sample7   input   siFUS      1
sample8   input   siFUS      2

Given the table above, I would like to have the following pairs:

sample1 and sample5

sample2 and sample6

sample3 and sample7

sample4 and sample8

I'm not sure whether my Replicate column, coded this way, correctly accounts for the paired samples. Ultimately, my final model should be the following:

~Replicate+Factor+Condition
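For illustration only, here is a small base R sketch of how the Replicate column above groups the samples, compared with a pairing column built from the pairs listed. The `coldata` data frame and the `Pair` column are hypothetical names introduced for this sketch; they are not part of the original post.

```r
# Rebuild the sample table from the post (base R only, no DESeq2 needed).
coldata <- data.frame(
  SampleID  = paste0("sample", 1:8),
  Factor    = rep(c("eprint", "input"), each = 4),
  Condition = rep(rep(c("siNeg", "siFUS"), each = 2), times = 2),
  Replicate = factor(rep(1:2, times = 4))
)

# Replicate as coded has only two levels, so each level groups four samples.
replicate_sizes <- table(coldata$Replicate)
print(replicate_sizes)

# Hypothetical column encoding the intended pairs:
# sample1/sample5, sample2/sample6, sample3/sample7, sample4/sample8.
coldata$Pair <- factor(c(1:4, 1:4))
pair_sizes <- table(coldata$Pair)
print(pair_sizes)

# List the members of each intended pair.
print(split(coldata$SampleID, coldata$Pair))
```

With the original coding, a level of Replicate does not identify a single pair, which may be why the column feels ambiguous.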

DESeq2 • 1.5k views

ADD COMMENT • link updated 3.4 years ago by Michael Love 43k • written 3.5 years ago by Alexandre ▴ 10

score 0 · Answer 1 · 2021-07-16

swbarnes2 ★ 1.4k

@swbarnes2-14086

Last seen 1 day ago

San Diego

You can't compare one sample to one sample in DESeq2. The experimental design here is ...not great.

ADD COMMENT • link 3.5 years ago swbarnes2 ★ 1.4k

Right, if you want to compute four p-values for four different effects, this is not possible (with any linear model) because the model is saturated.

If you assume that the effect is the same across two or more samples, you can fit the model with a design like ~pair + condition.

ADD REPLY • link 3.5 years ago Michael Love 43k

Hi Michael, thanks for your answer. Is my model correct if I only define ~Replicate+Condition? I'm not sure if I coded the Replicate column correctly.

ADD REPLY • link 3.5 years ago Alexandre ▴ 10

I would like to design an experiment similar to the one in this question: DESEq2 Paired samples Before and after treatment

ADD REPLY • link 3.5 years ago Alexandre ▴ 10

I don't know what you are doing, so I can't answer your question. You haven't specified how many effects you want to estimate here, or what you are trying to do.

Is it one condition effect across all samples, and you want to control for differences due to replicate or factor?

ADD REPLY • link 3.5 years ago Michael Love 43k

I would like to control for differences in factor and condition while also using the replicate column (which in this case could be called a pair column) to tell the linear model which samples are paired. The paired SampleIDs are not the same individual unit measured over time, so I don't know whether it makes sense to code them this way.

ADD REPLY • link 3.5 years ago Alexandre ▴ 10

This has been asked before on the support site, but those questions are probably difficult to find.

You can control for replicate, but not also for factor, because replicate is nested within factor. Controlling for replicate will control for factor. This happens also when users want to control for donor _and_ sex. As donors are nested within sex, controlling for donor is sufficient, and somehow trying to additionally control for sex doesn't make sense.

You can estimate a single condition effect from all eight samples (if you don't want to do this, I still haven't gathered from your comments how many effects you expect to estimate here).

You should code replicate as c(1:4,1:4) and then use ~replicate + condition.

ADD REPLY • link 3.5 years ago Michael Love 43k

Hi Michael, thanks for your answer. My goal is to estimate the difference of differences (siFUS - siNeg) - (eprint - input); perhaps I should just model it as ~Factor*Condition. I don't quite follow why replicate is nested within factor. I thought they were crossed, since each level of one factor occurs in combination with each level of the other factor. Thanks.
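As a rough sketch of what a ~Factor*Condition design would contain for the eight samples above, the base R snippet below builds the model matrix. The reference levels (input and siNeg) and the object names `coldata` and `mm_int` are assumptions made for this illustration. Under that assumption, the last column corresponds to the interaction, i.e. the eprint vs input difference in siFUS minus the eprint vs input difference in siNeg.

```r
# Sample table from the question, with explicit (assumed) reference levels.
coldata <- data.frame(
  SampleID  = paste0("sample", 1:8),
  Factor    = factor(rep(c("eprint", "input"), each = 4),
                     levels = c("input", "eprint")),
  Condition = factor(rep(rep(c("siNeg", "siFUS"), each = 2), times = 2),
                     levels = c("siNeg", "siFUS"))
)

# Main effects plus interaction: four coefficients for eight samples.
mm_int <- model.matrix(~ Factor * Condition, data = coldata)
print(colnames(mm_int))

# The matrix is full rank when the rank equals the number of columns.
print(qr(mm_int)$rank == ncol(mm_int))
```

Note that this design has no pairing term, so it treats the two samples in each Factor/Condition group as plain replicates.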

ADD REPLY • link 3.5 years ago Alexandre ▴ 10

I don't follow what you're trying to do here. I'd recommend working with a statistician to figure out the design and the interpretation of the coefficients.

ADD REPLY • link 3.5 years ago Michael Love 43k

Hi Michael,

Sorry, one last question. I tried the model you recommended, ~replicate + condition, after coding replicate as c(1:4,1:4), and I got a "not full rank" model matrix error:

Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank
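A minimal base R sketch of why this coding can trigger the error, assuming the sample order from the table in the question (the names `coldata` and `mm` are hypothetical). With replicate coded as c(1:4,1:4), both samples of each replicate level share the same condition, so the condition column is a linear combination of the replicate columns.

```r
# Sample table in the order given in the question.
coldata <- data.frame(
  SampleID  = paste0("sample", 1:8),
  Condition = factor(rep(rep(c("siNeg", "siFUS"), each = 2), times = 2),
                     levels = c("siNeg", "siFUS"))
)
coldata$replicate <- factor(c(1:4, 1:4))

# Each replicate level appears in only one condition (note the zero cells).
crosstab <- table(coldata$replicate, coldata$Condition)
print(crosstab)

# Model matrix for ~ replicate + Condition.
mm <- model.matrix(~ replicate + Condition, data = coldata)
n_cols <- ncol(mm)
mm_rank <- qr(mm)$rank

# A rank lower than the number of columns means the matrix is not full rank.
print(c(columns = n_cols, rank = mm_rank))
```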

ADD REPLY • link 3.4 years ago Alexandre ▴ 10

Oh, if you are interested in comparing FUS vs NEG in the above sample order, you should define replicate as factor(c(1,2,1,2,3,4,3,4)), assuming that these numbers indicate the sample IDs across the two conditions FUS and NEG.
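A quick base R check of this coding, as a sketch under the assumption that the samples are in the order shown in the question (`coldata` and `mm` are hypothetical names). Each replicate level now contains one siNeg and one siFUS sample, so the condition effect can be estimated within replicate. The commented lines at the end outline how the design might be passed to DESeq2, where `counts` stands for a count matrix that is not shown in this thread.

```r
# Sample table in the order given in the question.
coldata <- data.frame(
  SampleID  = paste0("sample", 1:8),
  Condition = factor(rep(rep(c("siNeg", "siFUS"), each = 2), times = 2),
                     levels = c("siNeg", "siFUS"))
)
coldata$replicate <- factor(c(1, 2, 1, 2, 3, 4, 3, 4))

# Every replicate level has exactly one sample per condition.
crosstab <- table(coldata$replicate, coldata$Condition)
print(crosstab)

# The model matrix for ~ replicate + Condition is full rank
# when its rank equals its number of columns.
mm <- model.matrix(~ replicate + Condition, data = coldata)
is_full_rank <- qr(mm)$rank == ncol(mm)
print(is_full_rank)

# Outline only (not run here); `counts` is a hypothetical count matrix:
# library(DESeq2)
# dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
#                               design = ~ replicate + Condition)
# dds <- DESeq(dds)
# res <- results(dds, contrast = c("Condition", "siFUS", "siNeg"))
```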

ADD REPLY • link 3.4 years ago Michael Love 43k