Question: DESeq2 replicates management
14 months ago, anaQ wrote:

I'm looking for differential gene expression in a somatic embryogenesis process. I need to compare 5 conditions (stages) against "control", each condition with two replicates. My question is: can I compare them all in one run, or do I need to compare them in pairs? And how do I handle the replicates?

I started by creating a count matrix (with data from htseq-count) comparing one condition against control (with the 2 replicates of each), and used these commands:

colData = data.frame(condition = factor(c( "control", "control", "9days", "9days")))
dds <- DESeqDataSetFromMatrix(countData=countsTable, colData = colData, design=~condition)

I get the results in 4 columns (because of the replicates), but I don't know if this is OK. I was expecting to get only two: control and 9days.

Also, I can't assign the first column, which contains the gene names.

I hope someone can help me.

14 months ago, Michael Love (United States) wrote:

You would use a design, ~condition, where that variable is a factor with the five stages and control as levels. Control should be the reference level (see vignette). Then you only need to run DESeq() once and you can compare levels with the results() function and the contrast argument. The replicates are handled automatically. The gene names will propagate as row names if you put them as row names on the counts.
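As a minimal base-R sketch of the setup described above: the toy count matrix, the gene names (`geneA` etc.) and the sample names are hypothetical placeholders, not the poster's data. It shows gene names stored as row names, one `condition` factor covering all six groups, and `Control` set as the reference level.

```r
# Hypothetical toy counts: 3 genes x 12 samples (6 conditions x 2 replicates).
stages <- c("Control", "9d", "0d", "1d", "2d", "21d")
set.seed(1)
countsTable <- matrix(rpois(3 * 12, lambda = 50), nrow = 3)

# Gene names go on the rows, sample names on the columns.
rownames(countsTable) <- c("geneA", "geneB", "geneC")
colnames(countsTable) <- paste0(rep(stages, each = 2), "_rep", 1:2)

# One row of colData per sample (replicates share a condition label).
colData <- data.frame(
  condition = factor(rep(stages, each = 2), levels = stages),
  row.names = colnames(countsTable)
)

# Make Control the reference level so coefficients are "stage vs Control".
colData$condition <- relevel(colData$condition, ref = "Control")

levels(colData$condition)
# Sample order in colData should match the column order of the counts.
stopifnot(identical(rownames(colData), colnames(countsTable)))
```

With a real file, reading it with `row.names` pointing at the gene column (as in `read.csv(..., row.names = "Genes")`) should give the same row-name layout.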

Hi, I've tried several approaches, but I'm not sure my commands are right. Could you take a look? Thanks in advance.

I'd like to compare each stage (9d, 0d, 1d, 2d, 21d) against "Control", separately.

#Load matrix
> countsTable <- read.csv("htseqcount.csv", header=T, sep=",", row.names="Genes")
#Set column and row names

> colData <- data.frame(condition=factor(c(rep("Control",2), rep("9d",2), rep("0d",2), rep("1d",2), rep("2d",2), rep("21d",2))))
#Specify control

> colData$condition <- relevel(colData$condition, ref="Control")
#Run DESeq

> dds <- DESeqDataSetFromMatrix(countData=countsTable, colData = colData, design=~condition)
> dds <- DESeq(dds)
> res = results(dds)
#Contrast results

> control_vs_9d <- results(dds, contrast=c("condition", "Control","9d"))
> control_vs_0d <- results(dds, contrast=c("condition", "Control","0d"))
> control_vs_1d <- results(dds, contrast=c("condition", "Control","1d"))
> control_vs_2d <- results(dds, contrast=c("condition", "Control","2d"))
> control_vs_21d <- results(dds, contrast=c("condition", "Control","21d"))
#Obtain only differentially expressed genes

> DEG_control_vs_9d <- subset(control_vs_9d, padj < 0.05)
> DEG_control_vs_0d <- subset(control_vs_0d, padj < 0.05)
> DEG_control_vs_1d <- subset(control_vs_1d, padj < 0.05)
> DEG_control_vs_2d <- subset(control_vs_2d, padj < 0.05)
> DEG_control_vs_21d <- subset(control_vs_21d, padj < 0.05)

That's correct. If you print

> control_vs_9d

It will also tell you the contrast you performed at the top of the table.
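One detail worth checking in that printed header is the direction of the comparison. In `results()`, a character contrast is `c(factor, numerator, denominator)`, so `c("condition", "Control", "9d")` reports log2(Control / 9d). The sketch below is only an illustration on simulated counts from `makeExampleDESeqDataSet()` (a stand-in for the real table, so the numbers mean nothing biologically); it builds one table per stage with the stage as numerator and checks that swapping the order only flips the sign.

```r
library(DESeq2)

# Simulated stand-in for the real htseq-count table (an assumption).
set.seed(42)
stages <- c("Control", "9d", "0d", "1d", "2d", "21d")
cts <- counts(makeExampleDESeqDataSet(n = 500, m = 12))
colData <- data.frame(
  condition = factor(rep(stages, each = 2), levels = stages),
  row.names = colnames(cts)
)

dds <- DESeqDataSetFromMatrix(countData = cts, colData = colData,
                              design = ~ condition)
dds <- DESeq(dds, quiet = TRUE)

# One results table per stage, each expressed as stage over Control.
res_list <- lapply(setNames(stages[-1], stages[-1]), function(s) {
  results(dds, contrast = c("condition", s, "Control"))
})

# Swapping numerator and denominator only changes the sign of the fold change.
flipped <- results(dds, contrast = c("condition", "Control", "9d"))
stopifnot(isTRUE(all.equal(res_list[["9d"]]$log2FoldChange,
                           -flipped$log2FoldChange)))
```

Either order is valid; the choice only decides whether a positive log2 fold change means "higher in the stage" or "higher in Control".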

thank you so much!

Dear Michael, I have another question.

In the results table, does the "baseMean" column refer to the normalized data obtained from the input matrix? If not, how can I get the normalized values on their own?

You can call mcols() on many objects in DESeq2 to find more information.

mcols(mcols(dds))

mcols(res)

We also describe the columns here:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#access-to-all-calculated-values
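For the normalized values themselves, a short sketch, again on simulated counts from `makeExampleDESeqDataSet()` rather than the poster's data: `counts(dds, normalized = TRUE)` returns the counts divided by the size factors, and `baseMean` is the mean of those normalized counts across all samples.

```r
library(DESeq2)

# Simulated example data (an assumption, used only for illustration).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds <- DESeq(dds, quiet = TRUE)
res <- results(dds)

# Per-sample normalized counts (raw counts divided by size factors).
norm_counts <- counts(dds, normalized = TRUE)

# baseMean is the row mean of the normalized counts over all samples.
stopifnot(isTRUE(all.equal(unname(rowMeans(norm_counts)), res$baseMean)))

# Column descriptions for the results table.
mcols(res)$description
```

Note that `baseMean` is averaged over every sample in the dataset, not only the two groups named in a contrast, so it is the same in each per-stage results table.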