Hi,
I am quite new to RNA analysis and am currently trying to analyze my RNA-seq data (19 samples) with the DESeq2 package. My coldata has three batch columns (Sequencer, Batch1, Batch2) alongside my phenotype. I would like to include all four columns in the design formula.
dds <- DESeqDataSetFromMatrix(countData = X[,2:20],
colData = as.data.frame(coldat),
design = ~ Phenotype + Sequencer + Batch1 + Batch2)
Sample Phenotype Sequencer Batch1 Batch2
S1 S 1 in AS
S2 C 1 in AM
S3 C 1 in AM
S4 C 1 in AM
S5 C 1 in AM
S6 C 1 in AM
S7 C 1 in AM
S8 C 1 in AM
S9 S 1 in MP
S10 C 1 in MA
S11 C 1 in MA
S12 C 1 in MA
S13 C 1 in MA
S14 C 1 in MA
S15 S 2 out RS
S16 S 2 out RH
S17 S 2 out AS
S18 S 2 out RH
S19 S 2 out RH
However, as expected, this produces a model matrix in which some levels are "without samples". I followed the “Model matrix not full rank” section of the DESeq2 vignette, built a model matrix, and removed the all-zero columns.
cd <- model.matrix(~ Phenotype*Sequencer*Batch1*Batch2, coldat)
all.zero <- apply(cd, 2, function(x) all(x==0))
idx <- which(all.zero)
cd <- cd[, -idx]
I tried to pass the resulting matrix to the "full" argument of DESeq like this:
dds <- DESeqDataSetFromMatrix(countData = X[,2:20],
colData = as.data.frame(coldat),
design = ~ Phenotype + Sequencer)
dds <- DESeq(dds, test = "Wald", betaPrior = FALSE, full =design(cd), reduced = ~ Phenotype+Batch1+Batch2)
However, I receive this error:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘design’ for signature ‘"matrix"’
I am not sure how to proceed, so I would appreciate it if someone could help me with this.
Hi Michael,
Thank you for the comment. However, my question is: if I change it to full=design(dds), how will cd be integrated into DESeq? I still get an error when I try to add three variables (Batch1 can be ignored), and if I use design(dds), only Phenotype and Sequencer will be counted. In that case, Batch1 and Batch2 are not included in the design. I have also tried the following:
which gives me this error:
I really appreciate your feedback. Thank you.
I’d recommend working with a statistician or someone familiar with linear models. You can provide either full-rank matrices or design formulas to these two arguments. It’s best to discuss how to form these with someone who has experience with linear models. Don’t rely on the software not giving an error as evidence that the inference is appropriate for your hypotheses.
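As a rough illustration of what "full rank" means for this sample table, here is a minimal base-R sketch. It rebuilds the coldata from the table in the question (the object and variable names below are mine, chosen for illustration), forms the additive design with all four columns, and compares the matrix rank with its number of columns. It does not call DESeq2 and is not a recommendation for any particular design.

```r
# Sample table copied from the question (19 samples).
coldat <- data.frame(
  Phenotype = factor(c("S", rep("C", 7), "S", rep("C", 5), rep("S", 5))),
  Sequencer = factor(c(rep(1, 14), rep(2, 5))),
  Batch1    = factor(c(rep("in", 14), rep("out", 5))),
  Batch2    = factor(c("AS", rep("AM", 7), "MP", rep("MA", 5),
                       "RS", "RH", "AS", "RH", "RH")),
  row.names = paste0("S", 1:19)
)

# Additive design with all four variables, as in the original formula.
mm <- model.matrix(~ Phenotype + Sequencer + Batch1 + Batch2, coldat)

# A design is full rank when its rank equals its number of columns.
design_rank  <- qr(mm)$rank
is_full_rank <- design_rank == ncol(mm)

# Cross-tabulations show which variables cannot be separated.
seq_vs_batch1   <- table(coldat$Sequencer, coldat$Batch1)
pheno_vs_batch2 <- table(coldat$Phenotype, coldat$Batch2)

print(c(columns = ncol(mm), rank = design_rank))
print(is_full_rank)
print(seq_vs_batch1)
print(pheno_vs_batch2)
```

For this table, the cross-tabulations show that Sequencer and Batch1 split the samples identically, and that every Batch2 level contains samples of only one phenotype. Removing all-zero columns does not repair either problem, because the offending columns are not zero; they are linear combinations of other columns. In particular, with Batch2 in the model the Phenotype effect cannot be separated from Batch2, which is a question about the experimental layout rather than about DESeq2 syntax.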
Thank you very much for the explanation and your time.