I'm very new to using R, and I'm trying to use it to perform an analysis of RNA-seq data, to look at differential expression. I've done a few online tutorials to try and build up some background, but I'm hitting a roadblock.
Basically, when I run my script, I get the following error:
Error in DESeqDataSetFromMatrix(countData = countData, colData = colData, : ncol(countData) == nrow(colData) is not TRUE
I realise that this error means that my colData and countData don't quite match up. I checked the dimensions of both objects, which showed that I have 7 columns in my countData and 6 rows in my colData.
However, I cannot see how to alter the colData file to make it match up, or how to get my countData to skip the first column containing the gene IDs, so that it only counts 6 columns.
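The check can be sketched roughly as below. This is only an illustration: the two small tables are toy stand-ins built in memory from the file excerpts in this post, not my real files, and the names n_count_cols and n_col_rows are made up for the example.

```r
# Toy stand-ins for the two tables (hypothetical values, not the real files).
# GeneID is kept as an ordinary column here, which is what my script ends up doing.
countData <- data.frame(
  GeneID   = "ENSG01254216542",
  Control1 = 1.1, Control2 = 1.3, Control3 = 1.14,
  LPO1     = 7.0, LPO2     = 7.5, LPO3     = 7.2
)
colData <- data.frame(
  condition = rep(c("untreated", "treated"), each = 3),
  type      = "paired-read",
  row.names = c("Control1", "Control2", "Control3", "LPO1", "LPO2", "LPO3")
)

n_count_cols <- ncol(countData)  # columns in the count table, GeneID included
n_col_rows   <- nrow(colData)    # one row per sample

# DESeqDataSetFromMatrix requires these two numbers to be equal
print(c(countData_columns = n_count_cols, colData_rows = n_col_rows))
```

With the toy tables the two numbers differ by one, and the extra column is GeneID.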
My countData.csv file looks like this:
GeneID           Control1  Control2  Control3  LPO1  LPO2  LPO3
ENSG01254216542  1.1       1.3       1.14      7.0   7.5   7.2
The colData.csv file looks like this:
          Condition  type
Control1  untreated  paired-read
Control2  untreated  paired-read
Control3  untreated  paired-read
LPO1      treated    paired-read
LPO2      treated    paired-read
LPO3      treated    paired-read
The script that I'm using is:
library(DESeq2)
data <- read.csv("//csce.datastore.ed.ac.uk//Control1.output2.csv")
se <- data
countData <- as.matrix(se,row.names="Geneid", header = TRUE, sep = '\t', row.names = 1)
colData <- read.csv("//csce.datastore.ed.ac.uk//csce//biology//users//s0348375//Win7//Desktop//colData2.csv", row.names=1)
colData <- colData[,c("condition","type")]
colnames(countData) <- NULL
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ condition)
dds <- dds[ rowSums(counts(dds)) > 1, ]
dds <- DESeq(dds)
I was wondering if anyone might be able to help with a solution? I'm sure that it's something very obvious, but any help would be very much appreciated, as I don't have access to bioinformatics support.