Question

Deseq2 Multivariate Time Series: DEGs for each time point comparison?

0

Entering edit mode

jamesfischer047 • 0

@jamesfischer047-20746

Last seen 5.0 years ago

Hi all,

I have a set of RNA-seq data. It's a time series with 4 stages of leaf development, 2 treatments, and 3 bio-reps per stage x treat combo. I've already found DEGs for the treatments, and DEGs for the stage:treatment interaction, but what I would like to do is find DEGs at each stage comparison (e.g. genes that are specifically differentially expressed at stage 2, etc.). I've tried doing clustering with maSigPro, but it did not really provide the resolution that I need.

Any ideas on this? My code is below.

Thanks in advance!

  reads <- as.matrix(read.csv("genereads_ONLY3.txt", sep = '\t', row.names = 1, header = TRUE)) 
  meta <- read.table("metatest2.txt", header = TRUE)




  mm <- model.matrix(~ stage + treat + stage:treat, data = meta)
    mm2 <- model.matrix(~ stage + treat, data = meta)
      mm3 <- model.matrix(~ stage, data = meta)

    ddsd <- DESeqDataSetFromMatrix(countData = reads,
                                  colData = meta,
                                  design = mm)

    dds <- ddsd[ rowSums(counts(ddsd)) > 10, ]

  #running DESeq

   dds.interaction <- DESeq(dds, test = "LRT", full = mm, reduced = mm2) 
                #Retrieves genes that are DEGs due to interaction of stage and treatment
    dds.interact.results <- results(dds.interaction)
      sum(dds.interact.results$padj < 0.05, na.rm = TRUE)
        sub.interact <- subset(dds.interact.results, padj<0.05)

           dds.interact.DEGs.genes <-  row.names(sub.interact) 
              dds.interact.DEGs.genes.df <-  as.data.frame(dds.interact.DEGs.genes) 
                colnames(dds.interact.DEGs.genes.df) <- "GENES"

    dds.treatonly <-  DESeq(dds, test = "LRT", full = mm, reduced = mm3)
       dds.treat.DEGS <- results(dds.treatonly)
           sum(dds.treat.DEGS$padj < 0.05, na.rm = TRUE)
              sub.treat <- subset(dds.treat.DEGS, padj<0.05)

          dds.treat.DEGs.genes <-  row.names(sub.treat) 
            dds.treat.DEGs.genes.df <- as.data.frame(dds.treat.DEGs.genes)
                colnames(dds.treat.DEGs.genes.df) <- "GENES"

           DIFFERENCE <- setdiff(dds.treat.DEGs.genes.df, dds.interact.DEGs.genes.df)

deseq2 RNA-seq • 985 views

ADD COMMENT • link updated 5.0 years ago by Michael Love 41k • written 5.0 years ago by jamesfischer047 • 0

score 1 · Answer 1 · 2019-05-09

1

Entering edit mode

Michael Love 41k

@mikelove

Last seen 10 hours ago

United States

Have you seen the time series example in the workflow? You can pull out time/stage specific differences usually using results() with name or contrast=list(..., ...).

ADD COMMENT • link 5.0 years ago Michael Love 41k

0

Entering edit mode

Thank you for the answer! Whenever I use resultsNames(), I get the same thing for my different tests (the different tests have different designs, as can be seen above).

I get:

[1] "Intercept"       "stage"           "treatX800"       "stage.treatX800"

In which "X800" represents my treatment.

I know I've seen other outcomes of "results()" online that seem much lengthier. How would I go about designing my 'contrast = " with that output to get a comparison of a single stage between my control and treat?

Here is my full model matrix in case that helps:

  (Intercept) stage treatX800 stage:treatX800
1            1     1         0               0
2            1     1         0               0
3            1     1         0               0
4            1     2         0               0
5            1     2         0               0
6            1     2         0               0
7            1     3         0               0
8            1     3         0               0
9            1     3         0               0
10           1     4         0               0
11           1     4         0               0
12           1     4         0               0
13           1     1         1               1
14           1     1         1               1
15           1     1         1               1
16           1     2         1               2
17           1     2         1               2
18           1     2         1               2
19           1     3         1               3
20           1     3         1               3
21           1     3         1               3
22           1     4         1               4
23           1     4         1               4
24           1     4         1               4

ADD REPLY • link 5.0 years ago jamesfischer047 • 0

1

Entering edit mode

When you run DESeqDataSetFrom...() it should be printing a message to the console that you have encoded the variables as numeric, and that you may have intended them to be factors instead. You should heed that message.

(Edit: the message comes upon dataset construction, not DESeq())

ADD REPLY • link 5.0 years ago Michael Love 41k

0

Entering edit mode

Thank you! For some reason, that message never popped up. But that was in fact the issue!

ADD REPLY • link 5.0 years ago jamesfischer047 • 0

0

Entering edit mode

Here is what happens when you create a dds with an integer-valued numeric in the design:

> DESeqDataSetFromMatrix(counts(dds), colData(dds), ~condition)
  the design formula contains a numeric variable with integer values,
  specifying a model with increasing fold change for higher values.
  did you mean for this to be a factor? if so, first convert
  this variable to a factor using the factor() function

What version of DESeq2 are you using?

ADD REPLY • link 5.0 years ago Michael Love 41k