DESeq2: how to integrate a manually edited matrix into full argument of DESeq
1
1
Entering edit mode
khaveh ▴ 10
@090c19a5
Last seen 3 days ago
Berlin, Germany

Hi,

I am quite new to RNA analysis and currently, trying to analyze my RNA-seq data with 19 samples using DESeq2 package. My coldata has three batches alongside my phenotype. I would like to assign all four columns into the design function. dds <- DESeqDataSetFromMatrix(countData = X[,2:20], colData = as.data.frame(coldat), design = ~ Phenotype + Sequencer+ Bacth1+ Batch2)

Sample  Phenotype   Sequencer   Batch1  Batch2
S1      S           1           in      AS
S2      C           1           in      AM
S3      C           1           in      AM
S4      C           1           in      AM
S5      C           1           in      AM
S6      C           1           in      AM
S7      C           1           in      AM
S8      C           1           in      AM
S9      S           1           in      MP
S10     C           1           in      MA
S11     C           1           in      MA
S12     C           1           in      MA
S13     C           1           in      MA
S14     C           1           in      MA
S15     S           2           out     RS
S16     S           2           out     RH
S17     S           2           out     AS
S18     S           2           out     RH
S19     S           2           out     RH

However, as expected, it produces a matrix where some levels are "without samples". I have followed the instruction of “Model matrix not full rank” in the vignette(DESeq2) and produced a matrix and removed the all zero columns.

cd <- model.matrix(~ Phenotype*Sequencer*Batch1*Batch2, coldat)

all.zero <- apply(cd, 2, function(x) all(x==0))
idx <- which(all.zero)
cd <- cd[, -idx]

I try to integrate the produced matrix into "full" argument of DESeq like this:

dds <- DESeqDataSetFromMatrix(countData = X[,2:20],
                          colData = as.data.frame(coldat),
                          design = ~ Phenotype + Sequencer)
dds <- DESeq(dds, test = "Wald", betaPrior = FALSE, full =design(cd), reduced = ~ Phenotype+Batch1+Batch2)

However, I am receiving this error

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘design’ for signature ‘"matrix"’

I am not sure how I have to proceed, so appreciate it if someone can help me in this regard.

RNASeqData DESeq2 • 158 views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 2 days ago
United States

I think you meant to write design(dds).

ADD COMMENT
0
Entering edit mode

Hi Michael,

Thank you for the comment. However, my question is, if I change the full=design(dds) how will cd be integrated into the DESeq. I still get an error when I try to add three variables (Batch1 can be ignored) and if I put the design(dds) then only Phenotype and Sequencer will be counted. In this case, Batch1 and Batch2 is not included in the design.

dds <- DESeqDataSetFromMatrix(countData = X[,2:20],
                              colData = as.data.frame(coldat),
                              design = ~ Phenotype + Sequencer+Batch2)
converting counts to integer mode
Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

  Please read the vignette section 'Model matrix not full rank':

  vignette('DESeq2')

I have also tried the following:

dds <- DESeqDataSetFromMatrix(countData = X[,2:20],
                          colData = as.data.frame(coldat),
                          design = ~ Phenotype)

cd <- model.matrix(~ Phenotype + Sequencer + Batch2, colData(dds))

dds <- DESeq(dds, full=cd, betaPrior = FALSE)

which gives me this error:

using supplied model matrix
estimating size factors
estimating dispersions
gene-wise dispersion estimates
Error in solve.default(qr.R(qrx)) : 'a' (1 x 0) must be square

I really appreciate your feedback. Thank you

ADD REPLY
0
Entering edit mode

I’d recommend working with a statistician or someone familiar with linear models. You can either provide full rank matrices or design formula to these two arguments. It’s best to just discuss how to form these with someone who has experience with linear models. Don’t just rely on the software not giving an error to presume that the inference is appropriate for your hypotheses.

ADD REPLY
0
Entering edit mode

Thank you very much for the explanation and your time.

ADD REPLY

Login before adding your answer.

Traffic: 203 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6