Model matrix not full rank for condition-time experiment
sally_b86 • 0

Dear Michael, 

With this data, I want to study the effect of sham over time (8 replicates at each time point). Unfortunately, it gives the error "Model matrix not full rank". Any solutions?

Sample time condition
Ctl.1 t0 control
Ctl.2 t0 control
Ctl.3 t0 control
Ctl.4 t0 control
Ctl.5 t0 control
Ctl.6 t0 control
Ctl.7 t0 control
Ctl.8 t0 control
Sham45.1 t1 sham
Sham45.2 t1 sham
Sham45.3 t1 sham
Sham45.4 t1 sham
Sham45.5 t1 sham
Sham45.6 t1 sham
Sham45.7 t1 sham
Sham45.8 t1 sham
Sham24.1 t2 sham
Sham24.2 t2 sham
Sham24.3 t2 sham
Sham24.4 t2 sham
Sham24.5 t2 sham
Sham24.6 t2 sham
Sham24.7 t2 sham
Sham24.8 t2 sham
H45min.1 t1 non ischemic
H45min.2 t1 non ischemic
H45min.3 t1 non ischemic
H45min.4 t1 non ischemic
H45min.5 t1 non ischemic
H45min.6 t1 non ischemic
H45min.7 t1 non ischemic
H45min.8 t1 non ischemic
HR24hr.1 t2 non ischemic
HR24hr.2 t2 non ischemic
HR24hr.3 t2 non ischemic
HR24hr.4 t2 non ischemic
HR24hr.5 t2 non ischemic
HR24hr.6 t2 non ischemic
HR24hr.7 t2 non ischemic
HR24hr.8 t2 non ischemic

The vignette discusses the source of the problem in a dedicated section (the error message should have pointed you to this section).
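To see where the error comes from with this sample table, here is a minimal base R sketch. The design used in the original post is not stated, so `~ condition + time` is an assumption made purely for illustration; `coldata` and `mm` are hypothetical names.

```r
# Rebuild the sample table from the question (8 replicates per group).
coldata <- data.frame(
  time = factor(rep(c("t0", "t1", "t2", "t1", "t2"), each = 8)),
  condition = factor(rep(c("control", "sham", "sham",
                           "non ischemic", "non ischemic"), each = 8))
)

# Assumed design for illustration only; the post does not say which was used.
mm <- model.matrix(~ condition + time, coldata)

n_coef <- ncol(mm)      # number of coefficients the design asks for
mm_rank <- qr(mm)$rank  # number of coefficients that can be estimated

# "control" occurs only at t0, and t0 contains only "control",
# so the two variables are confounded for that group.
print(table(coldata$condition, coldata$time))
print(c(columns = n_coef, rank = mm_rank))
```

When the rank is lower than the number of columns, at least one column is a linear combination of the others, which is the situation the error message reports.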

Let's start from the beginning: what genes are you interested in finding here? What is the role of the non ischemic samples in your test of interest?

Okay, in this case I have 3 conditions: control (no surgery), sham (surgery, at 2 times: 45 min and 24 hrs), and non ischemic samples (mice with surgery and ischemia reperfusion). I want to find out the genes varying by condition over time.

I have already run an LRT on each condition alone across the 2 time points, and I checked for genes co-expressed between the sham and non ischemic samples. But I thought it would be easier to achieve this by directly checking for genes varying in each condition over time, using the interaction of condition and time.

Here I have just two conditions, but later I will also have ischemic samples, so when I ran the LRT on each condition alone, it was complicated to separate the genes that are co-expressed.

"I want to find out the genes varying by condition over time."

Do you want to find genes that vary by condition over time, meaning, comparing the non ischemic to the sham samples? So genes where the profile over time differs?

Is there a relationship of the 1-8 samples at the two time points? Is sample 1 = sample 1 at time point 1 and time point 2?

Yes, exactly, this is what I want to compare, and therefore the control is useless here; I used it in the LRT of all conditions to have the same starting point. And this is the problem: if I remove the control, I won't be taking into consideration the changes from 0 to 45 min.

Sample 1 in every condition was collected in the first week, sample 2 in the second week, and so on.

I'm not following exactly the point about samples 1-8. Does Sham45.1 have a special relationship to Sham24.1, etc.?

There is no direct relation between the samples; what I'm saying is that these mice were sacrificed in the same week. I have 8 series of mice, each consisting of certain time points sacrificed in the same week. So there is no relation between the samples; I'm just explaining the naming 1, 2...

Starting a new comment thread for visual simplicity...

You can use a design ~time + condition:time, but you need to recode "control" as "sham". There is then an issue with supplying this design as a formula, because the model matrix it generates contains a column of all 0's (so when you create the dataset, just specify ~time or any other placeholder design). Instead, build the model matrix yourself and remove that column, like so:

full <- model.matrix(~time + condition:time, colData(dds))

You will then need to remove the column with all 0's.

full <- full[ , -x]

...where 'x' is the number of the column with all 0's.
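The recoding and column removal can be sketched in base R as follows. This is an illustration rather than a definitive recipe: it rebuilds the sample table as a plain data frame (`coldata`, a hypothetical stand-in for `colData(dds)`), and it assumes "sham" is set as the reference level of `condition`, so that the interaction columns refer to the non ischemic samples. The all-zero column is located programmatically instead of by hand.

```r
# Stand-in for colData(dds), rebuilt from the table in the question.
coldata <- data.frame(
  time = factor(rep(c("t0", "t1", "t2", "t1", "t2"), each = 8)),
  condition = rep(c("control", "sham", "sham",
                    "non ischemic", "non ischemic"), each = 8),
  stringsAsFactors = FALSE
)

# Recode "control" as "sham"; make "sham" the reference level (assumption).
coldata$condition[coldata$condition == "control"] <- "sham"
coldata$condition <- factor(coldata$condition,
                            levels = c("sham", "non ischemic"))

full <- model.matrix(~ time + condition:time, coldata)

# Columns that are 0 for every sample carry no information.
is_zero <- colSums(full != 0) == 0
zero_name <- colnames(full)[is_zero]
print(zero_name)

# Keep only the informative columns, then confirm the matrix is full rank.
full <- full[, !is_zero, drop = FALSE]
print(c(columns = ncol(full), rank = qr(full)$rank))
```

The dropped column corresponds to non ischemic samples at t0, a combination that does not exist in this experiment.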

For testing whether there is a difference between sham and non ischemic, you can make a matrix 'reduced' where you also remove from 'full' the last two columns that are associated with time 1 and 2 and with non ischemic samples.
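A sketch of building 'reduced' under the same assumptions as before ("control" recoded as "sham", "sham" as the reference level, and a hypothetical `coldata` data frame standing in for `colData(dds)`). The non ischemic columns are selected by name here, which in this layout is equivalent to dropping the last two columns.

```r
# Stand-in for colData(dds), with "control" already recoded as "sham".
coldata <- data.frame(
  time = factor(rep(c("t0", "t1", "t2", "t1", "t2"), each = 8)),
  condition = factor(rep(c("sham", "sham", "sham",
                           "non ischemic", "non ischemic"), each = 8),
                     levels = c("sham", "non ischemic"))
)

# Full matrix with the all-zero column removed.
full <- model.matrix(~ time + condition:time, coldata)
full <- full[, colSums(full != 0) > 0, drop = FALSE]

# Reduced matrix: drop the columns tied to the non ischemic samples.
ni_cols <- grepl("non ischemic", colnames(full), fixed = TRUE)
reduced <- full[, !ni_cols, drop = FALSE]

print(colnames(full))
print(colnames(reduced))

# The LRT compares the two fits; the difference in column count is the
# number of coefficients tested jointly.
n_tested <- ncol(full) - ncol(reduced)
print(n_tested)
```

With these two matrices, the likelihood ratio test asks whether the non ischemic samples differ from sham at t1 or at t2.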

Then use:

dds <- DESeq(dds, test="LRT", full=full, reduced=reduced)


