Question: DESeq2 - factor base level changes the DE genes.
3.4 years ago by
Guest User • 12k
Guest User • 12k wrote:
Hello, I have an RNAseq experiment from two strains of mice, that have been exposed to two different environments (three biological replicates of each combination). So my experimental setup is > colData genetics environment black2 B6 black black3 B6 black black5 B6 black black1 B6 yellow black4 B6 yellow black6 B6 yellow agouti1 129 yellow agouti4 129 yellow agouti6 129 yellow agouti2 129 black agouti3 129 black agouti5 129 black I want to test which genes change their expression when the environment changes, controlling for the genetic background. I am using DESeq2 for this, with the formula ~ genetics + environment + genetics:environment. However, the choice of the base level for the factors in my experimental design changes the genes that are significant. If I leave the default reference levels (129 for genetics and black for env) I get more DE genes that if I set as reference B6 for genetics and yellow for environemnt. I thought the reference level is used to decide how to calculate the fold change and is necessary to correctly interpret the coefficients, but shouldn't the results be the same? Any thoughts on what am I missing? Thank you very much in advance. My code is below. ## 1 ## > levels(colData$genetics)  "129" "B6" > levels(colData$environment)  "black" "yellow" > dds <- DESeqDataSetFromMatrix(countData = data[,8:19], colData = colData, design = ~ genetics + environment + genetics:environment) > dds <- DESeq(dds) > resG <- results(dds, alpha=0.05, name="genetics_B6_vs_129") > resE <- results(dds, alpha=0.05, name="environment_yellow_vs_black") > resGE <- results(dds, alpha=0.05, name="geneticsB6.environmentyellow") > dim(subset(resG, resG$padj < 0.05))  3675 6 > dim(subset(resE, resE$padj < 0.05))  20 6 > dim(subset(resGE, resGE$padj < 0.05))  24 6 ## 2 ## > colData2 <- colData > colData2$genetics <- relevel(colData2$genetics, ref="B6") > colData2$environment <- relevel(colData2$environment, ref="yellow") > levels(colData2$genetics)  "B6" "129" > levels(colData2$environment)  "yellow" "black" > dds2 <- DESeqDataSetFromMatrix(countData = data[,8:19], colData = colData2, design = ~ genetics + environment + genetics:environment) > dds2 <- DESeq(dds2) > res2G <- results(dds2, alpha=0.05, name="genetics_129_vs_B6") > res2E <- results(dds2, alpha=0.05, name="environment_black_vs_yellow") > res2GE <- results(dds2, alpha=0.05, name="genetics129.environmentblack") > dim(subset(res2G, resG$padj < 0.05))  3127 6 > dim(subset(res2E, resE$padj < 0.05))  2 6 > dim(subset(res2GE, resGE$padj < 0.05))  0 6 ## > g1 <- subset(resG, resG$padj < 0.05) > g2 <- subset(res2G, res2G$padj < 0.05) > length(intersect(rownames(g1), rownames(g2)))  2640 -- output of sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale:  en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages:  parallel stats graphics grDevices utils datasets methods base other attached packages:  DESeq2_1.2.10 RcppArmadillo_0.4.320.0 Rcpp_0.11.2  GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7  BiocGenerics_0.8.0 loaded via a namespace (and not attached):  annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7  genefilter_1.44.0 grid_3.0.2 lattice_0.20-23 locfit_1.5-9.1  RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2  survival_2.37-7 tools_3.0.2 XML_3.95-0.2 xtable_1.7-3 -- Sent via the guest posting facility at bioconductor.org.
ADD COMMENT • link •