DESeq2 analysis with a continuous covariate
choiahy • 0
@choiahy-12144

Hi, I'm having some trouble figuring out how to work with a continuous covariate.

The manual states that continuous covariates should be treated in the same manner as factor covariates, but R does not seem to like my code.

I have 3 patients with 2 samples each, for a total of 6 samples.

This is what my sampleTable looks like:

sample       fileName     patient  condition  nested  treatment_time
A_untreated  A_untreated  A        untreated  1       30
A_treated    A_treated    A        treated    1       30
B_untreated  B_untreated  B        untreated  2       45
B_treated    B_treated    B        treated    2       45
C_untreated  C_untreated  C        untreated  3       41
D_treated    D_treated    C        treated    3       41


My goal is to perform a paired analysis, comparing the two conditions within each patient, with treatment time included as a continuous covariate.

Below is my code:

sampleTable <- data.frame(sample=samples$Sample, fileName=samples$Sample, patient = samples$Patient, condition=samples$Condition, nested=samples$Nested,treatment_time=samples$treatment_Time)
sampleTable$patient <- relevel(sampleTable$patient, ref="A")
sampleTable$condition <- relevel(sampleTable$condition, ref="untreated")

directory = "PATH"

dds <- DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design= ~1)

model = model.matrix(~nested:condition + ischemic_time:condition + condition, colData(dds))


I get the following error:

Error in checkForExperimentalReplicates(object, modelMatrix) :
  same number of samples and coefficients to fit with supplied model matrix


I have a couple of questions:

1) Why does my model produce this error?

2) For the DE results, which resultsNames should I compare to accomplish this goal?

Thank you very much in advance. 

Gavin Kelly ▴ 680
@gavin-kelly-6944
United Kingdom / London / Francis Crick…

It looks like 'nested' and 'time' (assuming treatment_time and ischemic_time are the same thing) are 100% correlated, so putting both in the model will cause problems in estimation.

A correct model to test for, e.g., a difference between treated and untreated would be ~patient + condition, testing the condition coefficient in that model. A correct model for, e.g., looking at different linear expression time-profiles between treated and untreated would be ~condition*time. Neither model, however, can distinguish an effect due to systematic patient differences from an effect due to time-dependent changes in gene expression. For that, the experiment should have been designed either with patients coming back at multiple time-points, or with all patients sharing a common time-point (or ideally both).
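
To make the point about estimation concrete, here is a rough base-R sketch that rebuilds the sample table from the question and inspects the design matrices directly. It assumes that ischemic_time in the posted formula is the treatment_time column, and that nested is read in as a plain number as it appears in the table; the helper resid_df is made up for this illustration and is not part of DESeq2.

```r
# Sample table as posted (nested assumed numeric, ischemic_time assumed
# to be the treatment_time column).
sampleTable <- data.frame(
  patient        = factor(c("A", "A", "B", "B", "C", "C")),
  condition      = factor(rep(c("untreated", "treated"), 3),
                          levels = c("untreated", "treated")),
  nested         = c(1, 1, 2, 2, 3, 3),
  treatment_time = c(30, 30, 45, 45, 41, 41)
)

# Hypothetical helper: residual degrees of freedom left after the fit.
resid_df <- function(mm) nrow(mm) - qr(mm)$rank

# Design from the question: as many coefficients as samples,
# so nothing is left over to estimate dispersion.
mm_orig <- model.matrix(~ nested:condition + treatment_time:condition + condition,
                        sampleTable)
c(samples = nrow(mm_orig), coefficients = ncol(mm_orig), resid_df = resid_df(mm_orig))

# Paired design: 4 coefficients for 6 samples.
mm_paired <- model.matrix(~ patient + condition, sampleTable)
c(samples = nrow(mm_paired), coefficients = ncol(mm_paired), resid_df = resid_df(mm_paired))

# Adding treatment_time on top of patient adds a column but no information,
# because treatment_time is constant within each patient.
mm_both <- model.matrix(~ patient + condition + treatment_time, sampleTable)
c(coefficients = ncol(mm_both), rank = qr(mm_both)$rank)
```

Under these assumptions the posted design uses up all six samples on six coefficients, which is what the error message is complaining about, while ~patient + condition leaves two residual degrees of freedom. The last matrix has more columns than its rank, which is the confounding between patient and time showing up numerically.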

Yes, and to follow up, you should just use a design of ~patient + condition if you are only interested in testing the effect of condition.

As a 'nuisance parameter', patient here controls for both the patient effect and the difference due to treatment time, because the two are confounded.
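
For the question about which resultsNames entry to use, a minimal sketch of the ~patient + condition analysis might look like the following. It is only an illustration: the counts are simulated with makeExampleDESeqDataSet as a stand-in for the real HTSeq counts, and the patient and condition columns are filled in by hand to mirror the sample table in the question.

```r
library(DESeq2)

set.seed(1)
# Simulated counts standing in for the real data (assumption for this sketch).
dds <- makeExampleDESeqDataSet(n = 500, m = 6)

# Mirror the layout in the question: 3 patients, untreated/treated each.
dds$patient   <- factor(rep(c("A", "B", "C"), each = 2))
dds$condition <- factor(rep(c("untreated", "treated"), 3),
                        levels = c("untreated", "treated"))

# Paired design: patient absorbs patient-level differences, including
# treatment time, which is constant within each patient.
design(dds) <- ~ patient + condition
dds <- DESeq(dds)

# List the fitted coefficients, then pull out the condition effect.
resultsNames(dds)
res <- results(dds, name = "condition_treated_vs_untreated")
head(res)
```

With untreated as the reference level, the coefficient of interest is the one named condition_treated_vs_untreated; the patient_* coefficients are the nuisance terms and are not tested here.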
