Question

Error message in DESeq2 (DESeqDataSetFromMatrix)

huck.thornton ▴ 10

@huckthornton-23681

Hi all,

I am trying to run DESeq2 on a merged counts file generated by HTSeq.

I am receiving the following error message:

Error in validObject(.Object) : 
  invalid class SummarizedExperiment object: 'colData' nrow differs from 'assays' ncol
In addition: Warning message:
In sort(rownames(colData)) == sort(colnames(countData)) :
  longer object length is not a multiple of shorter object length

My code is:

setwd("xxx")

countData <- read.table("name_output_expression_matrix_full.out.csv")
head(countData)

colData=read.csv("samples_test.csv")
head(colData)

dds <- DESeqDataSetFromMatrix(countData=countData, 
                              colData=colData, 
                              design=~condition)

deseq2 • 2.5k views

ADD COMMENT • link updated 4.4 years ago by Michael Love 43k • written 4.4 years ago by huck.thornton ▴ 10

Kevin Blighe · Answer 1 · 2020-08-08

Michael Love 43k

@mikelove

Check your input. The counts and information about counts (colData) need to line up or else you will get meaningless results.
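
A minimal sketch of such a check, using invented toy data rather than the poster's files (the sample names, gene names, and the extra `gene_id` column are all assumptions made for illustration). It compares the number of count columns with the number of colData rows and lists any column that has no matching sample row:

```r
# Toy stand-ins for the real inputs (all values and names are invented).
countData <- data.frame(
  gene_id = c("geneA", "geneB", "geneC"),
  s1 = c(10L, 0L, 5L),
  s2 = c(12L, 1L, 7L),
  s3 = c(8L, 0L, 3L)
)
colData <- data.frame(
  condition = factor(c("ctrl", "ctrl", "treated")),
  row.names = c("s1", "s2", "s3")
)

# The two dimensions that must agree: columns of counts, rows of colData.
dims_match <- ncol(countData) == nrow(colData)

# Columns in the counts table that have no corresponding row in colData.
unmatched <- setdiff(colnames(countData), rownames(colData))

# Once the non-sample column is set aside, names should match in order too.
counts_only <- countData[, !(colnames(countData) %in% unmatched), drop = FALSE]
names_match <- identical(colnames(counts_only), rownames(colData))
```

Here `dims_match` is `FALSE` and `unmatched` names the offending column, which is usually quicker to act on than the `validObject` error itself.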

ADD COMMENT • link 4.4 years ago Michael Love 43k

Right now nrow(colData) is 6 and ncol(countData) is 7. Is this where the issue is?

ADD REPLY • link 4.4 years ago huck.thornton ▴ 10

You have 7 samples of counts (columns of the matrix) and sample information about 6 samples. What would you expect a software package to do at this point?

ADD REPLY • link 4.4 years ago Michael Love 43k

It's actually one column of gene IDs (e.g. "ENSG00000090661.11") followed by six columns, one per sample.

Is there a way to make it work?

ADD REPLY • link 4.4 years ago huck.thornton ▴ 10

Of course it can work. You need to make the gene names row names, not a regular column of data.

ADD REPLY • link 4.4 years ago swbarnes2 ★ 1.4k

How does one proceed to alter the code to achieve this? Apologies. I am a novice with this.

ADD REPLY • link 4.4 years ago huck.thornton ▴ 10

It should be a two-step operation: 1) assign the gene names as the row names of the data frame, then 2) remove the gene-name column from the data frame. It is a good habit to look up how to do this.
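
A minimal sketch of those two steps in base R, on an invented toy table (the IDs, the counts, and the default `V1`, `V2`, ... column names are assumptions; the real file may differ):

```r
# Toy data frame shaped like the table described above:
# gene IDs in the first column, counts in the remaining columns.
countData <- data.frame(
  V1 = c("ENSG_A.1", "ENSG_B.2", "ENSG_C.3"),  # invented IDs
  V2 = c(10L, 0L, 5L),
  V3 = c(12L, 1L, 7L)
)

# Step 1: use the gene IDs as row names (they must be unique).
rownames(countData) <- countData$V1

# Step 2: drop the ID column; drop = FALSE keeps a data frame
# even if only one count column were left.
countData <- countData[, -1, drop = FALSE]

# A plain matrix of counts is what DESeqDataSetFromMatrix is named for.
countMatrix <- as.matrix(countData)
```

Both steps can also be folded into the import by passing `row.names = 1` to `read.table()` or `read.csv()`, provided the first column really holds the gene IDs.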

ADD REPLY • link 4.4 years ago Kevin Blighe ★ 4.0k

I entered the following code:

library( "DESeq2" )
library(ggplot2)
library(dplyr)

setwd("xxx")

countData <- read.table('name_output_expression_matrix_full.out')
countData <- select(countData, -1)
head(countData)

colData=read.csv("samples_test.csv", row.names=1)
head(colData)

dds <- DESeqDataSetFromMatrix(countData=countData, 
                              colData=colData, 
                              design=~condition)

And received the following error message: "Error in round(assay(se)) : non-numeric argument to mathematical function".

Thoughts?

ADD REPLY • link updated 4.4 years ago by Kevin Blighe ★ 4.0k • written 4.4 years ago by huck.thornton ▴ 10

What is the output of str(countData)?

ADD REPLY • link 4.4 years ago Kevin Blighe ★ 4.0k

The result of str(countData) is:

'data.frame':   58782 obs. of  6 variables:
 $ V2: Factor w/ 3809 levels "0","1","10","100",..: 3809 3538 1 2942 2 1 1 1553 1228 1 ...
 $ V3: Factor w/ 2861 levels "0","1","10","100",..: 2861 2083 1 1927 2 1 1 311 2267 1 ...
 $ V4: Factor w/ 2266 levels "0","1","10","100",..: 2266 1190 1 664 1 1 1 482 1209 1 ...
 $ V5: Factor w/ 2884 levels "0","1","10","100",..: 2884 2849 1 2157 1 1 1 637 2 1 ...
 $ V6: Factor w/ 3779 levels "0","1","10","100",..: 3779 3625 1 200 2439 1 1 3538 3460 1 ...
 $ V7: Factor w/ 3338 levels "0","1","10","100",..: 3338 194 1 2948 2441 1 1 113 3 1 ...

ADD REPLY • link updated 4.4 years ago by Kevin Blighe ★ 4.0k • written 4.4 years ago by huck.thornton ▴ 10

So, there's the problem. Somewhere in your workflow, your data, which should be numeric, was converted to factors. You'll have to go back and retrace your steps.
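
One possibility that would fit the `str()` output, offered as an assumption rather than a diagnosis: the file has a header line of sample names that `read.table()` read as data, which forces every column to text. A minimal sketch on invented file contents:

```r
# Toy file contents with a header line (names and counts are invented).
txt <- "gene_id sampleA sampleB
geneA 10 12
geneB 0 1
geneC 5 7"

# Header line read as data: every column now contains text.
# stringsAsFactors = TRUE reproduces the factor columns seen with R < 4.0.
wrong <- read.table(text = txt, stringsAsFactors = TRUE)

# Declaring the header and the ID column lets the counts parse as integers.
right <- read.table(text = txt, header = TRUE, row.names = 1)

# Pitfall when repairing a factor afterwards: as.numeric() on a factor
# returns the internal level codes, not the printed values.
f <- factor(c("10", "0", "5"))
codes  <- as.numeric(f)                # level codes
values <- as.numeric(as.character(f))  # the numbers as written
```

Running `str(right)` on the real data after a change like this should show `int` columns; if it still shows factors or characters, some other non-numeric entry in the file is the cause.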

ADD REPLY • link 4.4 years ago Kevin Blighe ★ 4.0k
