Error message in DESeq2 (DESeqDataSetFromMatrix)
@huckthornton-23681

Hi all,

I am trying to run DESeq2 on a merged counts file generated by HTSeq.

I am receiving the following error message:

Error in validObject(.Object) : 
  invalid class SummarizedExperiment object: 'colData' nrow differs from 'assays' ncol
In addition: Warning message:
In sort(rownames(colData)) == sort(colnames(countData)) :
  longer object length is not a multiple of shorter object length

My code is:

setwd("xxx")

countData <- read.table("name_output_expression_matrix_full.out.csv")
head(countData)

colData=read.csv("samples_test.csv")
head(colData)

dds <- DESeqDataSetFromMatrix(countData=countData, 
                              colData=colData, 
                              design=~condition)
deseq2

@mikelove

Check your input. The columns of the count matrix and the rows of the sample information (colData) need to line up, or else you will get meaningless results.
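A quick way to check that the two inputs line up is sketched below in base R. The toy matrix, sample names and condition labels are made up for illustration; only the comparison of `colnames(countData)` against `rownames(colData)` reflects the advice above.

```r
# Toy count matrix: genes in rows, samples in columns (values are made up).
countData <- matrix(1:12, nrow = 2,
                    dimnames = list(c("geneA", "geneB"),
                                    paste0("sample", 1:6)))

# Toy sample table: one row per sample, row names are the sample names.
# The rows are deliberately in reverse order to show a mismatch.
colData <- data.frame(condition = factor(rep(c("ctrl", "trt"), each = 3)),
                      row.names = paste0("sample", 6:1))

# Same number of samples on both sides?
same_n <- ncol(countData) == nrow(colData)

# Same sample names, in the same order?
same_order_before <- all(rownames(colData) == colnames(countData))

# Reorder the sample table so its rows follow the count matrix columns.
colData <- colData[colnames(countData), , drop = FALSE]
same_order_after <- all(rownames(colData) == colnames(countData))

print(c(same_n = same_n,
        before = same_order_before,
        after  = same_order_after))
```

If the number of samples differs, as in the error in the question, the fix has to happen in the input files or in how they are read, not by reordering.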


Right now nrow(colData) is 6 and ncol(countData) is 7. Is this where the issue is?


You have 7 samples of counts (columns of the matrix) and sample information about 6 samples. What would you expect a software package to do at this point?


It's actually one column of gene IDs (e.g. "ENSG00000090661.11") followed by six columns, one for each of the six samples.

Is there a way to make it work?


Of course it can work. You need to make the gene names row names, not a regular column of data.


How should I alter the code to achieve this? Apologies, I am a novice with this.


It should be a two-step operation: 1) assign the gene names as the row names of the data frame, and 2) remove that column (the gene names) from the data frame. It would be a good habit to look up how to do it.
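The two steps could look like the sketch below in base R. The small data frame is a made-up stand-in for the merged HTSeq output; the column name `gene_id`, the sample names and all counts are assumptions for illustration.

```r
# Hypothetical table standing in for the merged HTSeq output:
# the first column holds gene IDs, the rest are per-sample counts.
raw <- data.frame(gene_id = c("ENSG00000090661.11", "geneB", "geneC"),
                  sample1 = c(10L, 0L, 5L),
                  sample2 = c(12L, 1L, 7L))

# Step 1: use the gene ID column as the row names.
rownames(raw) <- raw$gene_id

# Step 2: drop that column so only the count columns remain.
countData <- raw[, -1, drop = FALSE]

str(countData)
```

When reading from a file, passing `row.names = 1` to `read.table()` or `read.csv()` does both steps at read time.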


I entered the following code:

library( "DESeq2" )
library(ggplot2)
library(dplyr)

setwd("xxx")

countData <- read.table('name_output_expression_matrix_full.out')
countData <- select(countData, -1)
head(countData)

colData=read.csv("samples_test.csv", row.names=1)
head(colData)

dds <- DESeqDataSetFromMatrix(countData=countData, 
                              colData=colData, 
                              design=~condition)

And received the following error message: "Error in round(assay(se)) : non-numeric argument to mathematical function".

Thoughts?


What is the output of str(countData)?


The result of str(countData) is:

'data.frame':   58782 obs. of  6 variables:
 $ V2: Factor w/ 3809 levels "0","1","10","100",..: 3809 3538 1 2942 2 1 1 1553 1228 1 ...
 $ V3: Factor w/ 2861 levels "0","1","10","100",..: 2861 2083 1 1927 2 1 1 311 2267 1 ...
 $ V4: Factor w/ 2266 levels "0","1","10","100",..: 2266 1190 1 664 1 1 1 482 1209 1 ...
 $ V5: Factor w/ 2884 levels "0","1","10","100",..: 2884 2849 1 2157 1 1 1 637 2 1 ...
 $ V6: Factor w/ 3779 levels "0","1","10","100",..: 3779 3625 1 200 2439 1 1 3538 3460 1 ...
 $ V7: Factor w/ 3338 levels "0","1","10","100",..: 3338 194 1 2948 2441 1 1 113 3 1 ...

So, there's the problem. Somewhere in your workflow, your data, which should be numeric, was converted to factors. You'll have to go back to re-trace your steps.
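One possible cause, and this is only a guess from the `str()` output (default column names `V2` to `V7`, and the first value in each column mapping to the last factor level), is that the file has a header line that `read.table()` read as a data row. The sketch below reproduces that situation with a made-up two-sample file passed inline via `text =`; the file contents and names are assumptions, not the poster's data.

```r
# Hypothetical tab-separated file with a header line, inlined as a string.
txt <- "gene_id\tsample1\tsample2
geneA\t10\t12
geneB\t0\t1"

# Without header = TRUE the header line is treated as a data row,
# so every column contains text and cannot be read as numeric.
bad <- read.table(text = txt, sep = "\t")
bad_numeric <- sapply(bad, is.numeric)

# With header = TRUE and row.names = 1 the counts are read as integers
# and the gene IDs become the row names.
good <- read.table(text = txt, sep = "\t", header = TRUE, row.names = 1)
good_numeric <- sapply(good, is.numeric)

print(bad_numeric)
print(good_numeric)
```

Depending on the R version, the text columns in `bad` show up as factors (the default before R 4.0) or as character vectors; either way `DESeqDataSetFromMatrix()` would reject them as non-numeric.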
