Hi,
I am trying to build a model for DESeq2 analyses. In past I have done simpler version of analyses where the model was not so complicated. I have few questions and I will sincerely appreciate if someone from his/her experience can answer them (ofcourse the answers from Micheal Love are very Welcome).
So I have a metadata that has Gender (M/F), BMI(H/OB), AGE(Y/O), DiseaseScore(I/S) STATUS(DR/DNR/PR/PNR)
When I keep things simpler in my metadata like the above where all my categorical variables have only two binary outcomes and build a model like below
Here I am trying to control age, gender, bmi etc effect on my data analyses and at the same time get the interaction between STATUS (DR) and BMI,
dds <- DESeqDataSetFromMatrix(countData=cts,colData=meta,design=~ Batch + AGE + Gender + BMI + DiseaseScore + STATUS +
BMI:STATUS
The above code works fine. However, when I introduce three outcomes for BMI like (H/OW/OB) I get an error
converting counts to integer mode
Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
One or more variables or interaction terms in the design formula are linear
combinations of the others and must be removed.
Please read the vignette section 'Model matrix not full rank':
vignette('DESeq2')
In addition: Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
some variables in design formula are characters, converting to factors
I looked into the manual and also some of the related questions but I am unable to get my head around:-
Q1. why it is not working with three or more outcomes for a metadata column and what can I do to fix the code?
Once the model runs with the above code where I do not make things complicated and keep the metadata columns with only 2 outcomes with following commands.
dds=DESeq(dds,test="LRT", reduced=~Gender+Batch+BMI+DiseaseScore)
resultsNames(dds)
[1] "Intercept" "Batch_II_vs_I" "Batch_III_vs_I" "Batch_IV_vs_I"
[5] "Batch_V_vs_I" "Batch_VI_vs_I" "Batch_VII_vs_I" "AGE.2_Y_vs_O"
[9] "Gender_M_vs_F" "BMI_H_vs_F" "DiseaseScore_G_vs_B" "STATUS_DR_vs_DNR"
[13] "STATUS_PNR_vs_DNR" "STATUS_PR_vs_DNR" "BMIH.STATUSDR" "BMIH.STATUSPNR"
[17] "BMIH.STATUSPR"
Now to get the DEGs between DR vs DNR I use following line of code. The aim is to get difference between DR vsDNR for BMI's that are in H group.
res.h=results(dds, contrast=list(c("STATUS_DR_vs_DNR","BMIH.STATUSDR"))
Q2 Is this the right approach?
Q3. How can I get the DEGS between DR vs DNR BMI's that are in F group (I do not see BMIF....... in resultNames(dds))?
Thank you in advance for your time .
As asked, here is a snapshot of metadata
STATUS Batch Gender AGE BMI DiseaseScore
PNR III M Y H B
PNR III M Y H G
PNR III M Y H B
DNR IV M Y H B
PR IV M O F G
PR I F Y H G
DNR V F O F B
DR II F Y H G
PNR V M Y H B
DNR V M Y H G
DR I M Y H B
DR I M Y H B
DR I M Y F G
PNR VI F Y H B
DNR I M Y H G
PR III M O H G
DNR I M Y H G
As a note when I incorporate DiseaseScore (which has binary outcome), I get an error as stated above.
The question has been updated, I do not understand how it is nested? and if it is then how can I fix it?