To the developers,
I've been working on differential expression analysis of drought-tolerance in rice. I have 2 genotypes (tolerant and susceptible) and 2 conditions (drought and well-watered) with 4 replications each, essentially a 2x2 factorial experiment with 4 replications.
One of my main objectives is to identify drought-responsive genes. I set-up the codes as follows:
colData <- data.frame(genotype=rep(c("IL","Swarna"),each=8), condition=rep(rep(c("Control","Drought"),each=4),times=2)) rownames(colData) <- colnames(tx.all$counts) dds <- DESeqDataSetFromTximport(tx.all, colData, formula(~genotype+condition+genotype:condition)) colData(dds)$condition<-relevel(colData(dds)$condition, ref = "Control") dds$group<-factor(paste0(dds$genotype, dds$condition)) design(dds) <- ~group dds<-DESeq(dds, betaPrior = TRUE, parallel = TRUE) resultsNames(dds) #[1] "Intercept" "groupILControl" "groupILDrought" "groupSwarnaControl" [5] "groupSwarnaDrought
Is adding the design term similar to adding an interaction term? We want to make use of the interaction term because we still don't know the mechanisms underlying drought in this experiment hence, we use several contrast arguments from resultsNames
We also want to perform ANOVA so we can identify genes to be used for WGCNA analysis. Does using the codes below is acceptable enough to address this?
dds.ANOVA<-DESeq(dds, test="LRT", reduced = ~1, parallel = TRUE) res.anova <- results(dds.ANOVA) sum(res.anova$padj < 0.05, na.rm = T )
Please advise.
Sincerely,
Asher
R version 3.3.3 (2017-03-06) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] hexbin_1.27.1 vsn_3.42.3 readr_1.1.1 [4] tximport_1.2.0 genefilter_1.56.0 pheatmap_1.0.8 [7] RColorBrewer_1.1-2 ggplot2_2.2.1 gplots_3.0.1 [10] DESeq2_1.14.1 SummarizedExperiment_1.4.0 Biobase_2.34.0 [13] GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 IRanges_2.8.2 [16] S4Vectors_0.12.2 BiocGenerics_0.20.0
Hi Michael,
Since we have two genotypes, one is very drought tolerant (IL) and the other is very drought susceptible (Swarna). The drought-tolerant genotype is a cross between an inbred line (tolerant) and the other genotype in the study (susceptible).
There's a major QTL that is introgressed into the tolerant genotype and this QTL makes the tolerant genotype obviously tolerant to drought.
Now we want to identify genes that are uniquely upregulated and downregulated in both genotypes under drought. Specifically, we want to identify unique genes in the QTL region of the very drought tolerant genotype that are upregulated and downregulated. We then want to use these genes for functional validation through CRISPR-Cpf1.
One of the major hypothesis that we want to test is that there are differentially expressed genes (DEGs) between the two genotypes in the QTL region under drought and we want to identify them.Specifically, we want to identify DEGs under drought in the tolerant genotype. Since we don't know the mechanisms of drought tolerance at the reproductive-stage, we set-up contrast to identify DEGs between several groups.
We also want to establish the difference between the tolerant and the susceptible genotype under drought on the whole genome level.That is, what are the DEGs especially under drought and the ontologies/pathways that are associated with this group of DEGs.
Sincerely,
Asher
It sounds like you could accomplish all of this with pairwise comparisons using results() and contrast with the group variable as you have it above. This will give you Wald test statistics and adjusted p values, which can be used by downstream software packages.
Thank you as always Michael.
I would also like to ask if the following codes will address testing for overall difference across the groups like oneway-ANOVA?
We want to select genes for WGCNA analysis and filtering genes by a differential expression is not recommended. It is recommended that probesets or genes may be filtered by mean expression or variance.
Sincerely,
Asher
That will give you something like an ANOVA. But it is also a test of differential expression.
Thanks, Michael.
So is there any other way to do ANOVA after transformation? Or the codes above would be the one that DESEq2 users can use?
Or is there any other way that DESEq2 can employ filtering by mean expression or variance after transformation?
Please advise.
Sincerely,
Asher
Sorry these questions don’t really make sense and I can’t answer them meaningfully.
Sorry for the confusion and unclear questions Michael.
Probably, what I'm trying to ask is would the above codes be sufficient enough to run a one-way ANOVA?