Hello,
I am performing DGE analysis using DESeq2. I have two groups to compare: CTRL and SA, and I have performed a group comparison using DESeq2 and there's no issue with that.
However, I have males and females in each group, and I'm curious to see if there's an interaction between Sex and Group. Below is a snippet of my code attempting to use an interaction term.
My first question is, is my code correct? I need to simultaneously use the sva package and I'm unsure as to how to combine the code for sva and the interaction (see precise question below).
My second question is:  Can I compare the number of DEGs obtained from a design formula that excludes Sex:Group (contrast = c("Group", "SA", "CTRL")) to the number of DEGs obtained from a design formula that includes the Sex:Group interaction (contrast=list( c("Group_SA_vs_CTRL", "SexFemale.GroupSA")))? Can I make inferences about the interaction between sex and group this way?
Side note: I have used some other methods to determine whether there is sexual divergence in gene expression, however, I would like to specifically see what using the interaction term in DESeq2 tells me.
Thanks!
#model for differential expression
DE_model_pre_sva <- formula( ~ pH + Age + PMI + Sex + Group + Sex:Group, data = metadata)
#make objects for DESeq2 analysis
DE_summarized_exp <- SummarizedExperiment::SummarizedExperiment(as.matrix(filtered_rawcounts), colData = metadata)
DE_dataset_pre_sva <- DESeq2::DESeqDataSet(DE_summarized_exp, design = DE_model_pre_sva)
#get median of ratios normalized counts
median_of_ratios_normalised_counts <- DESeq2::counts(DESeq2::estimateSizeFactors(DE_dataset_pre_sva), normalized = TRUE)
write.csv(median_of_ratios_normalised_counts, "median_of_ratios_normcounts.csv")
#obtain surrogate variables
sva_res <- sva::svaseq(median_of_ratios_normalised_counts, mod = model.matrix(~ pH + Age + PMI + Sex + Group + Sex:Group, data = metadata), mod0 = model.matrix(~ pH + Age + PMI, data = metadata))
Should I include Sex and Group in the mod0 while Sex:Group is included in mod? I'm thinking Sex and Group should be excluded in mod0, but I hesitate.
metadata$SV1 <- scale(sva_res$sv[ ,1])
metadata$SV2 <- scale(sva_res$sv[ ,2])
metadata$SV3 <- scale(sva_res$sv[ ,3])
metadata$SV4 <- scale(sva_res$sv[ ,4])
metadata$SV5 <- scale(sva_res$sv[ ,5])
metadata$SV6 <- scale(sva_res$sv[ ,6])
DE_model_with_svs <- formula( ~ SV1 + SV2 + SV3 + SV4 + SV5 + SV6 + pH + Age + PMI + Sex + Group + Sex:Group, data = metadata) # add interaction term for sex differences
DE_summarized_exp_with_svs <- SummarizedExperiment::SummarizedExperiment(as.matrix(filtered_rawcounts), colData = metadata)
DE_dataset_with_svs <- DESeq2::DESeqDataSet(DE_summarized_exp_with_svs, design = DE_model_with_svs)
#run DE analysis
DE_analysis <- DESeq2::DESeq(DE_dataset_with_svs)
resultsNames(DE_analysis) # check which comparisons are possible 
#### Assessing interaction
DE_dataset_with_svs$Group # check which group is the reference, first one is reference
DE_dataset_with_svs$Sex # check which sex is the reference, first one is reference
DE_dataset_with_svs$Sex = relevel(DE_dataset_with_svs$Sex, "Male") # change which sex is the reference
# if looking at males, make sure the reference is females, if looking at females, make sure the reference is males
DE_dataset_with_svs$Sex # check which sex is the reference (selecting males as reference)
DE_results <- DESeq2::results(DE_analysis, alpha = 0.05, independentFiltering = FALSE, contrast=list( c("Group_SA_vs_CTRL", "SexFemale.GroupSA")))
# Number of genes with padj < 0.05 for interaction
sum(DE_results$padj < 0.05, na.rm = TRUE)
