My input comes from the HISAT2/StringTie protocol; the gene count matrix was extracted with the provided Python script.
My code in R is:

countData <- as.matrix(read.csv("genecountmatrix.csv"), row.names = "gene_id")

# make the colData
colData <- read.csv("pheno_data.csv", sep=",", row.names=1)

# check if the colnames are included in the rownames
all(rownames(colData) %in% colnames(countData))
countData <- countData[, rownames(colData)]
all(rownames(colData) == colnames(countData))
mode(countData) = "integer"

# create a DDS object
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ condition)
dds
The result I got was:

class: DESeqDataSet
dim: 46078 6
metadata(1): version
assays(1): counts
rownames: NULL
rowData names(0):
colnames(6): X204L005rep1001 X204L005rep2001 ... emptyL004rep2001 emptyL004rep3001
colData names(1): condition
Everything is as expected except for the rownames, which are NULL. My .csv file contains gene names, and the example output in the DESeq2 manual also shows rownames. Do the rownames matter for downstream analysis? If I want to do gene expression analysis, I think I need those gene identifiers.