Dear DESeq2 users,
I ask for your advice: I am interested in using DESeq2 to evaluate effects of knockdown on gene expression during the course of differentiation. Below I have outlined my current approach, which is inspired by http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments and previous question/answers on this forum.
I am still trying to grasp the concept of my analysis, and would very much like to hear your expert opinions regarding 1) whether my approach works as intended, 2) whether there is another and more appropriate approach.
My experimental setup is as follows:
Cells are transfected with RNAi targeting my factor of interest (siGR) or a negative control (siLuc). 2 days post transfection (referred to as MSC time point), cells are induced to undergo either osteoblast(ob) or adipocyte(ad) differentiation. Gene expression is quantified by RNA-seq immediately prior to differentiation (MSC time point) and following 24 hours of differentiation (Day1). Experiment is performed as two biological replicates.
Coldata is as follows:
#Coldata: Replicate Type Cell Timepoint 1 a siLuc Ob Day1 2 b siLuc Ob Day1 3 a siGR Ob Day1 4 b siGR Ob Day1 5 a siLuc Msc Msc 6 b siLuc Msc Msc 7 a siGR Msc Msc 8 b siGR Msc Msc 9 a siLuc Ad Day1 10 b siLuc Ad Day1 11 a siGR Ad Day1 12 b siGR Ad Day1
Code describing my analysis:
#I Assemble the DESeq matrix, and define an interaction term in the design. ddsTC <- DESeqDataSetFromMatrix(countdata_kd, ColData, design=~Type+Timepoint+Type:Timepoint) #I subselect data, in this example looking only at the effect of knockdown on Ob differentiation. ddsTC <- ddsTC[ ,ddsTC$Cell == "Msc" | ddsTC$Cell == "Ob"] ddsTC$Cell <- droplevels(ddsTC$Cell) #I relevel, setting siLuc and MSC as baselevels ddsTC$Type <- relevel(ddsTC$Type, "siLuc") ddsTC$Timepoint <- relevel(ddsTC$Timepoint, "Msc") #I run the analysis ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ Type + Timepoint) res <- results(ddsTC, name="TypesiLuc.TimepointMsc", test="Wald")
By setting name=“TypesiLuc.TimepointMsc” when calling results, I ask “Are the effects of siGR vs. siLuc significantly different at day 1 compared to Msc?”. In other words, p values are reflecting effects of RNAi and not differentiation. Is this correctly understood?
In my results I see genes clearly affected by knocking my factor, yet the gene has a high padj value. An example is included below, in which siGR treatment represses a gene otherwise induced during differentiation. Why am i not catching this gene in my analysis?
Thank you in advance!
Martin
Sessioninfo
R version 3.2.3 (2015-12-10) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 locale: [1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252 LC_MONETARY=Danish_Denmark.1252 [4] LC_NUMERIC=C LC_TIME=Danish_Denmark.1252 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] ggplot2_2.0.0 DESeq2_1.10.1 RcppArmadillo_0.6.200.2.0 Rcpp_0.12.2 [5] SummarizedExperiment_1.0.1 Biobase_2.30.0 GenomicRanges_1.22.1 GenomeInfoDb_1.6.1 [9] IRanges_2.4.1 S4Vectors_0.8.2 BiocGenerics_0.16.1 loaded via a namespace (and not attached): [1] RColorBrewer_1.1-2 futile.logger_1.4.1 plyr_1.8.3 XVector_0.10.0 futile.options_1.0.0 [6] tools_3.2.3 zlibbioc_1.16.0 digest_0.6.9 rpart_4.1-10 RSQLite_1.0.0 [11] annotate_1.48.0 gtable_0.1.2 lattice_0.20-33 DBI_0.3.1 gridExtra_2.0.0 [16] genefilter_1.52.1 cluster_2.0.3 locfit_1.5-9.1 grid_3.2.3 nnet_7.3-11 [21] AnnotationDbi_1.32.3 XML_3.98-1.3 survival_2.38-3 BiocParallel_1.4.3 foreign_0.8-66 [26] latticeExtra_0.6-28 Formula_1.2-1 geneplotter_1.48.0 lambda.r_1.1.7 Hmisc_3.17-2 [31] scales_0.3.0 splines_3.2.3 xtable_1.8-2 colorspace_1.2-6 labeling_0.3 [36] acepack_1.3-3.3 munsell_0.4.3