Question: Problem making DESeq dataset; Error all variables in design formula must be columns in colData
wannes.nauwynck wrote, 2.1 years ago:

Hi,

I just started a differential expression analysis using DESeq2 with a count dataset. There are two cell lines in which I want to detect differential expression (DE): an irradiated one and a control one. I have used DESeq2 and DESeqDataSetFromMatrix before and they have always worked fine for me, but now, for some reason, DESeqDataSetFromMatrix throws an error when I run it. These are my inputs:

head(counts)

         X01 X02 X03 X04
A1BG     241  48 225 129
A1BG-AS1  46  14  34  45
A1CF      28   5  18  28
A2M        2   0   1   0
A2M-AS1    0   0   0   0
A2ML1     11   1   1   4

head(mycols)    #coldata

condition = as.factor(c(rep("Ctr",2),rep("Irr",2)))
mycols = data.frame(row.names = c("X01","X02","X03","X04"),condition)
mycols
    condition
X01       Ctr
X02       Ctr
X03       Irr
X04       Irr

>dsd = DESeqDataSetFromMatrix(countData = counts,colData = mycols, design ~ condition) 

Error in DESeqDataSet(se, design = design, ignoreRank) :
  all variables in design formula must be columns in colData

I don't understand this error at all: as far as I know, all variables in the design formula (condition here) ARE included as columns in the colData!

I've always done it this way and it's the first time I received this error so I don't know what to do here.

Any help would be greatly appreciated, thanks!!


Hello!

Is it possible that you forgot the "=" after "design"? The call should read:

DESeqDataSetFromMatrix(countData = counts, colData = mycols, design = ~ condition)

Reply written 2.1 years ago by Radek

haha, yes that did the trick, God I'm dumb! Thanks so much for your comment!

Reply written 2.1 years ago by wannes.nauwynck
Michael Love (United States) wrote, 2.1 years ago:

Already answered by Radek in comment.

The error message is not the most helpful one here, but the reason it was thrown is that, without the "=", your call is read as design = design ~ condition. "design" then appears as a variable in the formula, and it is obviously not a column of colData.

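
For anyone landing here with the same error, a minimal base-R sketch (no DESeq2 needed) of what most likely happens. Without the "=", the formula is passed as the design and becomes two-sided, so "design" itself is picked up as a variable. The comparison against the colData column names below is an assumption about the kind of check DESeq2 performs, not a copy of its source code.

```r
# Base R only: list the variable names each formula contains.
wrong <- design ~ condition   # what the original call passed (missing "=")
right <- ~ condition          # what was intended

# colData as defined in the question
condition <- as.factor(c(rep("Ctr", 2), rep("Irr", 2)))
mycols <- data.frame(row.names = c("X01", "X02", "X03", "X04"), condition)

# Variables in the formula that are NOT columns of colData
missing_wrong <- setdiff(all.vars(wrong), colnames(mycols))
missing_right <- setdiff(all.vars(right), colnames(mycols))

print(all.vars(wrong))  # "design" "condition"
print(missing_wrong)    # "design"
print(missing_right)    # character(0)
```

Running a check like this before building the dataset shows exactly which name the error is complaining about.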

Hello, 

I got the same error "Error in DESeqDataSet(se, design = design, ignoreRank) : 
  all variables in design formula must be columns in colData"

my code is:

fulldds <- DESeqDataSetFromMatrix(countData = cts, colData = mat2, design = ~ cnd)

I am trying to estimate the group effect using a nested model, following the tutorial at https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis.

Could you give me some advice?

cheers, 

Luca

Reply written 4 months ago by luca.rugiu

Let's parse the error message. Your design formula is ~cnd, and it has one variable, "cnd". The error says "all variables in design formula must be columns in colData", i.e. the one variable in the design formula, "cnd", needs to be a column in colData, which is mat2 here. It has to be the exact name: R and DESeq2 can't guess which column you are referring to as "cnd" unless the name is exactly the same. Is "cnd" a column in colData (mat2)?

Reply written 4 months ago by Michael Love

Thanks a lot for the quick reply.

colnames(mat2)

"(Intercept)" "grpSeili" "grpRauma:ind.nb" "grpSeili:ind.nb" "grpRauma:ind.nc" "grpSeili:ind.nc" "grpRauma:ind.nd" "grpRauma:cndpresent" "grpSeili:cndpresent"

 

When I try to replace cnd with one of the colnames of mat2, I get the same error:

fulldds <- DESeqDataSetFromMatrix(countData = cts, colData = mat2, design = ~ grpSeili:cndpresent)

Reply written 4 months ago by luca.rugiu

There's an issue here. It looks like mat2 here is the output of model.matrix. It's best if you put the original variables into colData, that is "grp" and "ind". You'll have lots of problems unless colData contains the actual variables.

Reply written 4 months ago by Michael Love
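
A small base-R sketch of why the columns of a model.matrix output cannot stand in for the original variables. The toy coldata here is hypothetical: the group names are taken from the column names posted above, while the individual labels and the "absent" level of cnd are assumptions made for illustration only.

```r
# Hypothetical sample table holding the ORIGINAL variables
coldata <- data.frame(
  grp   = factor(rep(c("Rauma", "Seili"), each = 6)),
  ind.n = factor(rep(rep(c("a", "b", "c"), each = 2), 2)),  # individual within group
  cnd   = factor(rep(c("absent", "present"), 6))
)

# model.matrix() expands the factors into coefficient columns
mat2 <- model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)
print(colnames(mat2))

# A formula written with a coefficient name is split at ":" into two
# variable names, and "cndpresent" is a column of neither table.
vars_bad <- all.vars(~ grpSeili:cndpresent)
print(vars_bad)                                    # "grpSeili" "cndpresent"
ok_bad <- all(vars_bad %in% colnames(mat2))        # FALSE

# The original variables are all found in coldata.
vars_good <- all.vars(~ grp + grp:ind.n + grp:cnd)
ok_good <- all(vars_good %in% colnames(coldata))   # TRUE
```

So the names in the design formula should match the factor columns (grp, ind.n, cnd), not the expanded coefficient names.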

I see. 

I found from the link I previously provided that nested designs need some extra coding. 

In that guide, this is suggested:

model.matrix(~ grp + grp:ind.n + grp:cnd, coldata), so that nested effects among groups could be taken into account. From this point, how do I get to DE analysis if I can't use the model.matrix just created?

 

Reply written 4 months ago by luca.rugiu

You supply the model matrix directly to the "full" argument:

dds <- DESeq(dds, full=full)

Reply written 4 months ago by Michael Love
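
A hedged sketch of how the pieces might fit together. The sample table is hypothetical (same assumptions as the nested design discussed above), the counts are simulated with makeExampleDESeqDataSet, and the placeholder design ~ 1 is an assumption. The DESeq2 part only runs if the package is installed and has not been checked against any particular DESeq2 version.

```r
# Hypothetical sample table with the original variables
coldata <- data.frame(
  grp   = factor(rep(c("Rauma", "Seili"), each = 6)),
  ind.n = factor(rep(rep(c("a", "b", "c"), each = 2), 2)),
  cnd   = factor(rep(c("absent", "present"), 6))
)

# Model matrix for the nested design; this is what goes to the "full" argument
full <- model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)

# A user-supplied model matrix must be full rank
stopifnot(qr(full)$rank == ncol(full))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))
  # Simulated counts for 12 samples (stand-in for real data)
  dds <- makeExampleDESeqDataSet(n = 500, m = 12)
  # Keep the original variables in colData, as advised above
  dds$grp   <- coldata$grp
  dds$ind.n <- coldata$ind.n
  dds$cnd   <- coldata$cnd
  design(dds) <- ~ 1                 # placeholder design (assumption)
  dds <- DESeq(dds, full = full)     # model matrix supplied directly
  print(resultsNames(dds))
}
```

With a user-supplied matrix, results are then requested by coefficient name or with a list or numeric contrast.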

ok, it works!

thanks a lot for your help and patience!

have a good day

-Luca

Reply written 3 months ago by luca.rugiu

Also, I had to type "colnames(cts) <- NULL" to avoid an error, as I found in another post, so the colnames of my countData are now NULL. Not sure whether this is relevant...

        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
TRINITY_DN293388_c0_g1    0    7    0    1    0    0    0    3    0     0     2     0     0     1
TRINITY_DN206683_c1_g2    0    0  102    0    0    0    0    0    0   442     0     0     0     0
TRINITY_DN217091_c0_g3  123  617  161  123  106  209  102  564  131   287   483   218   473   620
TRINITY_DN216710_c4_g1  106  141   92  120  156  122  211  135  116   127   131    50   166   119
TRINITY_DN269500_c0_g1    0    3    0    0    0    0    0    0    0     0     0     0     0     6
TRINITY_DN219001_c0_g2   82  416  235  102   87   77   90  414  120   276   459    79   369   328

 

 

Reply written 4 months ago by luca.rugiu