I am new to DESeq2, but I did this analysis together with a colleague who has done some transcriptomics analyses before. Neither of us can figure out why the output of results() is reported as "DataFrame with 14235 rows and 6 columns", while the .csv file we wrote from it (opened in Excel) shows 31415 rows and 6+1 columns (the extra column is just the gene names, which are now a column of their own).
Can anyone tell us why we have so many more rows suddenly? The code we used is below.
#read in counts table with gene names as rownames
read.table("mt_mapped_paired_readcounts.tsv.txt", sep= '\t', header = FALSE, row.names = 1) -> counts

#filter out rows with only zeros
counts.nozero <- counts[rowSums(counts) != 0,]
dim(counts.nozero)

#removing the last row, which contains NA
counts.nozero.nona <- counts.nozero[1:14235,]

#file to explain which column is which
read.table("columndata", sep= ',', header = TRUE) -> columndata

library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = counts.nozero.nona,
                              colData = columndata,
                              design = ~ cables)
dds <- DESeq(dds)
res <- results(dds, name="cables_yes_cables_vs_no_cables")
res

#res output
#log2 fold change (MLE): cables yes cables vs no cables
#Wald test p-value: cables yes cables vs no cables
#DataFrame with 14235 rows and 6 columns

write.csv(as.data.frame(res), file="deseq2results.csv")
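
One check that might help narrow this down is to read the written file back into R and compare its row count with the object that was exported, instead of relying on what Excel displays. The sketch below is only an illustration: res_df is a small made-up data frame standing in for as.data.frame(res), and csv_path is a temporary file rather than the real deseq2results.csv. Both are assumptions, not part of the analysis above.

```r
# Made-up stand-in for as.data.frame(res); the real object comes from results(dds).
res_df <- data.frame(
  baseMean       = c(10.5, 200.1, 3.2),
  log2FoldChange = c(0.5, -1.2, NA),
  row.names      = c("geneA", "geneB", "geneC")
)

# Hypothetical path for illustration; in practice this would be "deseq2results.csv".
csv_path <- tempfile(fileext = ".csv")
write.csv(res_df, file = csv_path)

# Read the file back, using the first column (gene names) as row names.
check <- read.csv(csv_path, row.names = 1)

# TRUE if the file holds the same number of data rows as the exported object.
rows_match <- nrow(check) == nrow(res_df)

# Number of physical lines in the file, including the header line.
n_lines <- length(readLines(csv_path))

# Number of repeated gene names, in case duplicates play a role.
n_dup <- sum(duplicated(rownames(res_df)))

print(c(rows_in_r = nrow(res_df), rows_in_csv = nrow(check), lines_in_file = n_lines))
```

If the counts agree when run on the real file, the extra rows are more likely coming from how the file is opened (or from opening an older file with the same name) than from DESeq2 itself. If they disagree, comparing n_lines with nrow(check) would show whether some fields contain embedded line breaks.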