Dear all, I ran a time-series experiment: I exposed my plants to light and harvested at 6 time points (0, 0.5, 2, 4, 8, 10 h). I therefore have the effect of the condition (light), the time, and their interaction (light:time). I quantified with Salmon and then used tximport. My question is: what is the correct command to import my txi object as a DESeq2 dataset for a time-series experiment? Is the following one correct or not?
I need to import my txi object into DESeq2 because I want to do quality assessment and, later, differential gene expression analysis. I use the following command:
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~Condition)
and it works well.
But because I did a time-series experiment,
I tried the following two command lines:
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~Condition + Time)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~Condition + Time + Condition:Time)
but they did not work. So my question is: should I proceed with the first command, or what is the right way?
Here are the command lines that work (but with only one factor, the condition):
# list the Salmon quantification files (.sf)
>files <- list.files(pattern = ".sf")
>files
#read the tx2gene.csv files
>tx2gene <- read.csv("Sb_tx2gene.csv",head=TRUE)
>head(tx2gene)
>txi <- tximport(files, type = "salmon", tx2gene = tx2gene)
>samplename <- c("0.0h_1", "0.0h_2", "0.0h_3", "0.5h_1", "0.5h_2", "0.5h_3", "12h_1", "12h_2",
"12h_3", "2.0h_1", "2.0h_2", "2.0h_3", "4.0h_1", "4.0h_2", "4.0h_3", "6.0h_1",
"6.0h_2", "6.0h_3")
>counts<- txi$counts
>colnames(counts)<- samplename
>head(counts)
>sampleTable <- data.frame(Condition = c(rep("0.0h",3), rep("0.5h", 3), rep("2.0h", 3), rep("4.0h", 3),
rep("6.0h", 3), rep("12h", 3)))
# create a column called Time to use later, together with Condition, for the time-series statistics
>sampleTable$Time <- rep(c("0", "0.5", "2", "4", "6", "12"), each = 3)
# it is a character vector; convert it to a factor
>sampleTable$Time <- as.factor(sampleTable$Time)
# import txi as a deseq2 dataset
>dds <- DESeqDataSetFromTximport(txi, sampleTable, ~ Condition)
thanks
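One thing worth checking in the code above: `colnames(counts) <- samplename` renames a copy, so `txi` itself, which is what `DESeqDataSetFromTximport()` reads, is unchanged. Also, `samplename` follows alphabetical file order (12h before 2.0h) while `sampleTable` is written in time order. The following is only a minimal sketch, assuming the files really are listed in the alphabetical order that `samplename` implies; it derives the time point from each sample name so that the rows cannot drift out of step with the columns, and it sets the factor levels in chronological rather than alphabetical order.

```r
# Sample names in the order the thread assigns them to the files
# (assumption: this matches the order returned by list.files()).
samplename <- c("0.0h_1", "0.0h_2", "0.0h_3", "0.5h_1", "0.5h_2", "0.5h_3",
                "12h_1", "12h_2", "12h_3", "2.0h_1", "2.0h_2", "2.0h_3",
                "4.0h_1", "4.0h_2", "4.0h_3", "6.0h_1", "6.0h_2", "6.0h_3")

# Naming the file vector before calling tximport() makes the txi columns
# carry these names (pattern from the tximport vignette):
# names(files) <- samplename

# Derive the time point from each name: strip the "h_<replicate>" suffix.
hours <- as.numeric(sub("h_[0-9]+$", "", samplename))

# One row per sample, in the same order as the columns;
# levels sorted numerically so that 12 comes last, not third.
sampleTable <- data.frame(
  row.names = samplename,
  Time = factor(hours, levels = sort(unique(hours)))
)

levels(sampleTable$Time)
```

With the table built this way, the row for `12h_1` is labelled 12 h regardless of where it falls in the file listing.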
Sorry Michael, I get this message when I run the following code:
> dds <- DESeqDataSetFromTximport(txi, sampleTable,formula(~Condition+Time+Condition:Time))
Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
Levels or combinations of levels without any samples have resulted in
column(s) of zeros in the model matrix.
Please read the vignette section 'Model matrix not full rank':
vignette('DESeq2')
Here is all the code I ran in R:
> files <- list.files(pattern = ".sf")
> tx2gene <- read.csv("Sb_tx2gene.csv",head=TRUE)
> txi <- tximport(files, type = "salmon", tx2gene = tx2gene)
> names(txi)
[1] "abundance" "counts" "length" "countsFromAbundance"
> samplename <- c("0.0h_1", "0.0h_2", "0.0h_3", "0.5h_1", "0.5h_2", "0.5h_3", "12h_1", "12h_2",
+ "12h_3", "2.0h_1", "2.0h_2", "2.0h_3", "4.0h_1", "4.0h_2", "4.0h_3", "6.0h_1",
+ "6.0h_2", "6.0h_3")
> counts<- txi$counts
> colnames(counts)<- samplename
> counts <- round(counts)
> # Data exploration using DESeq2 pipeline ##
> sampleTable <- data.frame(Condition = c(rep("0.0h",3), rep("0.5h", 3), rep("2.0h", 3), rep("4.0h", 3),
+ rep("6.0h", 3), rep("12h", 3)))
> sampleTable$Time <- rep(c("0", "0.5", "2", "4", "6", "12"), each = 3)
> sampleTable$Time <- as.factor(sampleTable$Time)
> levels(sampleTable$Condition)
[1] "0.0h" "0.5h" "12h" "2.0h" "4.0h" "6.0h"
> sampleTable$sample <- colnames(counts)
> rownames(sampleTable) <- colnames(counts)
> sampleTable
       Condition Time sample
0.0h_1      0.0h    0 0.0h_1
0.0h_2      0.0h    0 0.0h_2
0.0h_3      0.0h    0 0.0h_3
0.5h_1      0.5h  0.5 0.5h_1
0.5h_2      0.5h  0.5 0.5h_2
0.5h_3      0.5h  0.5 0.5h_3
2.0h_1      2.0h    2 2.0h_1
2.0h_2      2.0h    2 2.0h_2
2.0h_3      2.0h    2 2.0h_3
4.0h_1      4.0h    4 4.0h_1
4.0h_2      4.0h    4 4.0h_2
4.0h_3      4.0h    4 4.0h_3
6.0h_1      6.0h    6 6.0h_1
6.0h_2      6.0h    6 6.0h_2
6.0h_3      6.0h    6 6.0h_3
12h_1        12h   12  12h_1
12h_2        12h   12  12h_2
12h_3        12h   12  12h_3
> str(sampleTable)
'data.frame': 18 obs. of 3 variables:
$ Condition: Factor w/ 6 levels "0.0h","0.5h",..: 1 1 1 2 2 2 4 4 4 5 ...
$ Time : Factor w/ 6 levels "0","0.5","12",..: 1 1 1 2 2 2 4 4 4 5 ...
$ sample : Factor w/ 18 levels "0.0h_1","0.0h_2",..: 1 2 3 4 5 6 10 11 12 13 ...
> dds <- DESeqDataSetFromTximport(txi, sampleTable, ~ Condition)
using counts and average transcript lengths from tximport
> dds <- DESeqDataSetFromTximport(txi, sampleTable,formula(~Condition+Time+Condition:Time))
Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
Levels or combinations of levels without any samples have resulted in
column(s) of zeros in the model matrix.
Please read the vignette section 'Model matrix not full rank':
vignette('DESeq2')
thanks in advance
That vignette section talks about how variables in the design cannot contain redundant information.
Why is your condition in hours, and not "light"?
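To make the redundancy concrete, here is a minimal base-R sketch (no DESeq2 needed) that rebuilds the sample table from the thread and checks the rank of the model matrix. The `Light` factor with a dark control in the second half is purely hypothetical, since the thread does not say whether such samples exist, and the condition labels are simplified to "0h", "0.5h", and so on.

```r
time_levels <- c("0", "0.5", "2", "4", "6", "12")

# Design from the thread: Condition is just Time with an "h" appended,
# so the two columns describe the same six groups.
sampleTable <- data.frame(
  Time = factor(rep(time_levels, each = 3), levels = time_levels)
)
sampleTable$Condition <- factor(paste0(sampleTable$Time, "h"),
                                levels = paste0(time_levels, "h"))

mm_redundant <- model.matrix(~ Condition + Time, data = sampleTable)
ncol(mm_redundant)      # more columns than there are distinct groups
qr(mm_redundant)$rank   # the rank stays at the number of groups

# A single time variable is enough to describe these samples.
mm_single <- model.matrix(~ Time, data = sampleTable)
qr(mm_single)$rank == ncol(mm_single)

# Hypothetical crossed design: every time point sampled in both
# "dark" and "light", three replicates each (assumption, not in the thread).
crossed <- expand.grid(
  replicate = 1:3,
  Time = factor(time_levels, levels = time_levels),
  Light = factor(c("dark", "light"))
)
mm_crossed <- model.matrix(~ Light + Time + Light:Time, data = crossed)
qr(mm_crossed)$rank == ncol(mm_crossed)
```

With only one variable per sample, `~ Time` (equivalently `~ Condition`) is the design that can be fit; a likelihood ratio test such as `DESeq(dds, test = "LRT", reduced = ~ 1)` would then ask whether a gene changes at any time point. An interaction term only becomes estimable when a second factor, such as the hypothetical `Light` above, varies independently of time.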