huck.thornton
Hi all,
I am trying to run DESeq2 on a merged counts file generated by HTSeq.
I am receiving the following error message:
Error in validObject(.Object) :
invalid class SummarizedExperiment object: 'colData' nrow differs from 'assays' ncol
In addition: Warning message:
In sort(rownames(colData)) == sort(colnames(countData)) :
longer object length is not a multiple of shorter object length
My code is:
setwd("xxx")
countData <- read.table("name_output_expression_matrix_full.out.csv")
head(countData)
colData=read.csv("samples_test.csv")
head(colData)
dds <- DESeqDataSetFromMatrix(countData=countData,
colData=colData,
design=~condition)
Right now nrow(colData) is 6 and ncol(countData) is 7. Is this where the issue is?
You have 7 samples of counts (columns of the matrix) and sample information about 6 samples. What would you expect a software package to do at this point?
It's actually one column of gene id (e.g. "ENSG00000090661.11") with six additional columns of six samples.
Is there a way to make it work?
Of course it can work. You need to make the gene names row names, not a regular column of data.
How does one proceed to alter the code to achieve this? Apologies. I am a novice with this.
It should be a two-step operation: 1) assign the gene names as the row names of the data frame, and 2) remove that column (gene names) from the data frame. It would be a good habit to look up how to do it.
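For anyone landing here later, the two steps might look roughly like the sketch below. It uses a small made-up table instead of the real file, and the column name gene_id, the sample names s1 to s6, the extra gene labels, and all counts are assumptions for illustration only; your own column names will differ.

```r
# Toy stand-in for the merged HTSeq table: one gene-ID column plus six
# sample columns. Column names, the GENE_B/GENE_C labels, and the counts
# are invented for this example.
countData <- data.frame(
  gene_id = c("ENSG00000090661.11", "GENE_B", "GENE_C"),
  s1 = c(10L, 0L, 5L),
  s2 = c(12L, 1L, 7L),
  s3 = c(9L, 0L, 4L),
  s4 = c(30L, 2L, 6L),
  s5 = c(28L, 0L, 8L),
  s6 = c(31L, 1L, 5L),
  stringsAsFactors = FALSE
)

# Step 1: use the gene IDs as row names (they must be unique).
rownames(countData) <- countData$gene_id

# Step 2: drop the gene-ID column so only count columns remain.
countData$gene_id <- NULL

# The number of columns should now equal the number of samples in colData.
ncol(countData)
```

After this, ncol(countData) should equal nrow(colData), which is the condition the original error message complains about.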
I entered the following code:
And received the following error message: "Error in round(assay(se)) : non-numeric argument to mathematical function".
Thoughts?
What is the output of str(countData)?
The result of str(countData) is:
So, there's the problem. Somewhere in your workflow, your data, which should be numeric, was converted to factors. You'll have to go back to re-trace your steps.
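One common way this happens (a guess, since the str() output is not shown in the thread) is reading a comma-separated file with a header line without telling read.table about the header or the separator, so the header text ends up in the data and every column stops being numeric. A minimal sketch with made-up inline data, where the column names, the GENE_B label, and the counts are all assumptions:

```r
# Made-up CSV content standing in for the real counts file.
csv_text <- "gene_id,s1,s2\nENSG00000090661.11,10,3\nGENE_B,0,7"

# Header not declared: the header line is read as a data row, so the
# count columns contain text and become factors here.
bad <- read.table(text = csv_text, sep = ",", stringsAsFactors = TRUE)
str(bad)

# Header declared and first column used as row names: counts stay numeric.
good <- read.table(text = csv_text, sep = ",", header = TRUE, row.names = 1)
str(good)
```

Using row.names = 1 at read time also takes care of moving the gene IDs out of the data columns in the same step.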