**0**wrote:

Dear all,

While trying to obtain specific LFCs, we noticed an inconsistency in the results.

We are performing a RNAseq analysis of an experiment with three fixed factors: treatment (control, CL), genotype (G1, G2, G3) and time (t0, t1, t2).

We are interested in specific comparisons, for instance comparing the treatment: control vs CL. Following the manual, there are two ways to do this.

The first method would be to perform an analysis where you contrast both treatments with the results function:

# First perform the analysis of the DESeqDataSet dds_withTHREEWAYint <- DESeqDataSetFromMatrix(countData = mydata_matrix, colData = colData, design = ~ treatment + genotype + time + treatment:genotype + treatment:time + genotype:time + treatment:genotype:time) dds_withTHREEWAYint_DESeq <- DESeq(dds_withTHREEWAYint) # treatment drought vs control results_for_contrast_CLvcontrol_BH <- results(dds_withTHREEWAYint_DESeq, contrast=c("treatment", "CL","control"), alpha=0.05, pAdjustMethod = "BH") # Head of results head(results_for_contrast_CLvcontrol_BH)

The last line generates the following output:

> head(results_for_contrast_CLvcontrol_BH) log2 fold change (MLE): treatment CL vs control Wald test p-value: treatment CL vs control DataFrame with 6 rows and 6 columns baseMean log2FoldChange lfcSE stat pvalue padj <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> AT1G01010 233.9382 -0.05907449 0.2442976 -0.2418136 0.80892457 0.90003409 AT1G01020 227.5727 -0.23369322 0.1027148 -2.2751670 0.02289591 0.08155662 AT1G01030 116.6140 0.68119565 0.2921068 2.3320089 0.01970022 0.07283691 AT1G01040 1480.6111 -0.14968164 0.1093810 -1.3684431 0.17117343 0.35035306 AT1G01046 0.0000 NA NA NA NA NA AT1G01050 1328.1905 0.10908425 0.1433199 0.7611243 0.44658283 0.64690174

A second method, but maybe less straight forward, would be to use a group design. Here, we only use treatment as group and then compare both treatment groups, which looks like this:

# First perform the analysis of the DESeqDataSet dds_withTHREEWAYint <- DESeqDataSetFromMatrix(countData = mydata_matrix, colData = colData, design = ~ treatment + genotype + time + treatment:genotype + treatment:time + genotype:time + treatment:genotype:time) dds_withTHREEWAYint_forgroupanalysis <- dds_withTHREEWAYint # Combine the factors of interest into a single factor with all combinations of the original factors. dds_withTHREEWAYint_forgroupanalysis$group <- factor(paste0(dds_withTHREEWAYint_forgroupanalysis$treatment)) # Change the design to include just this factor, e.g. ~ group. design(dds_withTHREEWAYint_forgroupanalysis) <- ~ group # With the factors and the new design in place, perform DESeq (now the new group factor and the group design will be used). dds_withTHREEWAYint_forgroupanalysis <- DESeq(dds_withTHREEWAYint_forgroupanalysis) # Using the resultsNames-function, one can print all the new combinations of the fixed factors (here treatment, genotype and time). # These names can be used to create contrasts of choice. resultsNames(dds_withTHREEWAYint_forgroupanalysis) # Extract results res_BH <- results(dds_withTHREEWAYint_forgroupanalysis, contrast=c("group", "CL", "control"), alpha=0.05, pAdjustMethod = "BH") # Head of the results: head(res_BH)

Which generates the following output:

> resultsNames(dds_withTHREEWAYint_forgroupanalysis) [1] "Intercept" "groupCL" "groupcontrol" > res_BH <- results(dds_withTHREEWAYint_forgroupanalysis, contrast=c("group", "CL", "control"), alpha=0.05, pAdjustMethod = "BH") > head(res_BH) log2 fold change (MAP): group CL vs control Wald test p-value: group CL vs control DataFrame with 6 rows and 6 columns baseMean log2FoldChange lfcSE stat pvalue padj <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> AT1G01010 233.9382 0.70431932 0.15242364 4.620801 3.822612e-06 1.174188e-05 AT1G01020 227.5727 -0.08691073 0.03719792 -2.336441 1.946829e-02 3.213914e-02 AT1G01030 116.6140 0.22132906 0.24070875 0.919489 3.578398e-01 4.368425e-01 AT1G01040 1480.6111 -0.34129933 0.06420846 -5.315488 1.063717e-07 4.053913e-07 AT1G01046 0.0000 NA NA NA NA NA AT1G01050 1328.1905 -0.28299095 0.12400197 -2.282149 2.248055e-02 3.666190e-02

What strikes us, is the difference in LFC. The LFCs of the group design were shrunk (indicated by MAP), where the LFCs of the other analysis where not shrunk (indicated by MLE).

Since a recent post by Michael Love (New function lfcShrink() in DESeq2) mentioned that the DESeq function no longer shrinks the LFCs, we didn't understand why MAP was still used.

We also noticed that the p-values were different. We think that this is because the design was and a different method was used. Is it correct to assume this? Based on your experience, which p-value would you advice to use?

Thank you for your advice and time,

Kind regards,

Jonas

Antwerp University