Hello there.
I would like to identify the cell line-independent differentially expressed genes in a time series experiment, i.e. those genes that respond similarly to the stimulus regardless of cell line background. This is the opposite scenario from the one described in the DESeq2 documentation for the fission dataset and I am unsure how to proceed. Please help!
I have three cell lines (X9342, X10742, X10743), a five point time course (0h, 4h, 8h, 16h, 24h), and three biological replicates (T1, T2, T4) for each (total 45 samples). The code is as follows:
#Display coldata
> coldata
cell_line time_point
X9342_0h_T1 X9342 t0
X10743_0h_T2 X10743 t0
X10743_0h_T4 X10743 t0
...
X10742_24h_T1 X10742 t24
X10742_24h_T2 X10742 t24
X10742_24h_T4 X10742 t24
#Run DESeq with the interaction term
> dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~cell_line + time_point + cell_line:time_point)
dds <- DESeq(dds)
#Define a contrast and write an output
> res <- results(dds, contrast=c("time_point", "t0", "t24"))
> res <- res[order(res$padj),]
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=T)), by="row.names", sort=F)
write.table(resdata, file="filename.txt", sep="\t", quote=F, row.names=F)
By inspection of the DEG output, this seems a bit off as some of the genes given low p-adj with large LFC don't show similar behavior when I look at the listed values in the accompanying matrix.
I suspect I have to do something similar as the fission example just with different terms. Please advise.
#fission example from DESeq2 workflow
ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + minute)
resTC <- results(ddsTC)
resTC$symbol <- mcols(ddsTC)$symbol
head(resTC[order(resTC$padj),], 4)
#sessionInfo just in case
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.6
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.14.1 SummarizedExperiment_1.4.0 Biobase_2.34.0 GenomicRanges_1.26.4
[5] GenomeInfoDb_1.10.3 IRanges_2.8.2 S4Vectors_0.12.2 BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] genefilter_1.56.0 locfit_1.5-9.1 splines_3.3.3 lattice_0.20-38 colorspace_1.4-1
[6] htmltools_0.3.6 base64enc_0.1-3 survival_2.44-1.1 XML_3.98-1.20 foreign_0.8-71
[11] DBI_1.0.0 BiocParallel_1.8.1 RColorBrewer_1.1-2 plyr_1.8.4 stringr_1.4.0
[16] zlibbioc_1.20.0 munsell_0.5.0 gtable_0.3.0 htmlwidgets_1.3 memoise_1.1.0
[21] latticeExtra_0.6-28 knitr_1.23 geneplotter_1.52.0 AnnotationDbi_1.36.2 htmlTable_1.13.1
[26] Rcpp_1.0.2 acepack_1.4.1 xtable_1.8-4 scales_1.0.0 backports_1.1.4
[31] checkmate_1.9.4 Hmisc_4.1-1 annotate_1.52.1 XVector_0.14.1 gridExtra_2.3
[36] ggplot2_3.2.0 digest_0.6.20 stringi_1.4.3 grid_3.3.3 tools_3.3.3
[41] bitops_1.0-6 magrittr_1.5 lazyeval_0.2.2 RCurl_1.95-4.12 tibble_2.1.3
[46] RSQLite_2.1.2 Formula_1.2-3 cluster_2.0.6 Matrix_1.2-12 data.table_1.12.2
[51] assertthat_0.2.1 rpart_4.1-15 nnet_7.3-12
This appears to have worked! Thanks so much. Code follows (please double check) in case anyone else has this issue.
That’s fine, except the first DESeq() is not necessary, should be removed.
Thanks again, Michael. Posting correction here for posterity: