Hello,
I apologize that similar questions have been asked many times before, but I'm asking again. My experiment is a 3 x 2 design (three genotypes by two treatments) with a batch effect (the samples were not processed on the same day). I came across this post and am going to follow Dr. Smyth's suggestion (~ 0 + batch + group). I just want to know whether my approach to making the contrasts is correct.
My meta_data. I combined two factors (genotype and treat) into one factor, group.
          sample_name  batch  treat  genotype  group
sample1   sample1      B1     untrt  WT        WT_untrt
sample2   sample2      B1     trt    WT        WT_trt
sample3   sample3      B1     untrt  MR        MR_untrt
sample4   sample4      B1     trt    MR        MR_trt
sample5   sample5      B1     untrt  GD        GD_untrt
sample6   sample6      B1     trt    GD        GD_trt
sample7   sample7      B2     untrt  WT        WT_untrt
sample8   sample8      B2     trt    WT        WT_trt
sample9   sample9      B2     untrt  MR        MR_untrt
sample10  sample10     B2     trt    MR        MR_trt
sample11  sample11     B2     untrt  GD        GD_untrt
sample12  sample12     B2     trt    GD        GD_trt
My design matrix
         batchB1 batchB2 groupGD_untrt groupMR_trt groupMR_untrt groupWT_trt groupWT_untrt
sample1        1       0             0           0             0           0             1
sample2        1       0             0           0             0           1             0
sample3        1       0             0           0             1           0             0
sample4        1       0             0           1             0           0             0
sample5        1       0             1           0             0           0             0
sample6        1       0             0           0             0           0             0
sample7        0       1             0           0             0           0             1
sample8        0       1             0           0             0           1             0
sample9        0       1             0           0             1           0             0
sample10       0       1             0           1             0           0             0
sample11       0       1             1           0             0           0             0
sample12       0       1             0           0             0           0             0
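For reference, the design matrix above can be reproduced with base R alone. This is only a minimal sketch: the data frame name meta is a placeholder of mine, the values are copied from the sample table above, and it relies on R's default alphabetical ordering of factor levels. That ordering makes GD_trt the reference level of group, which is why there is no groupGD_trt column and why sample6 and sample12 are zero in every group column.

```r
# Rebuild the sample sheet (values copied from the table above).
# 'meta' is a placeholder name, not necessarily the object used in the real analysis.
meta <- data.frame(
  sample_name = paste0("sample", 1:12),
  batch       = factor(rep(c("B1", "B2"), each = 6)),
  treat       = rep(c("untrt", "trt"), times = 6),
  genotype    = rep(rep(c("WT", "MR", "GD"), each = 2), times = 2)
)
rownames(meta) <- meta$sample_name

# Combine genotype and treat into a single factor.
# Default levels are alphabetical, so GD_trt becomes the reference level.
meta$group <- factor(paste(meta$genotype, meta$treat, sep = "_"))

# No intercept: batch gets one column per level,
# group gets one column per non-reference level.
design <- model.matrix(~ 0 + batch + group, data = meta)

colnames(design)
# Sanity check that no column is redundant (full column rank).
qr(design)$rank == ncol(design)
```

Because every group coefficient is measured relative to the same reference (GD_trt) within a batch, a difference of two group columns should not depend on which level happens to be the reference.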
My contrasts
contrs <- makeContrasts(
  trt_vs_untrt_within_WT = groupWT_trt - groupWT_untrt,
  GD_vs_WT_within_untrt = groupGD_untrt - groupWT_untrt,
  levels = colnames(design)
)
res_trt_within_WT <- glmQLFTest(fit, contrast = contrs[, "trt_vs_untrt_within_WT"])
res_GD_vs_WT_within_untrt <- glmQLFTest(fit, contrast = contrs[, "GD_vs_WT_within_untrt"])
Q1. I'd like to know the treatment effect within WT while accounting for the batch effect. Is my res_trt_within_WT correct?
Q2. I'd also like to know the genotype effect (GD vs WT) within the untreated samples. Is my res_GD_vs_WT_within_untrt correct?
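To make my reasoning explicit, here is what I believe the two contrasts estimate, written out as a sketch. The symbols are my own notation and not taken from the edgeR documentation: \eta_{b,g} is the linear predictor (log expected expression) for a sample in batch b and group g, \beta_b is the batch coefficient, and \gamma_g is the group coefficient, with \gamma_{GD\_trt} = 0 because GD_trt is the reference level. This assumes the additive batch + group model holds, i.e. no batch-by-group interaction.

```latex
\[
\begin{aligned}
\eta_{b,g} &= \beta_b + \gamma_g, \qquad \gamma_{\mathrm{GD\_trt}} = 0 \\[4pt]
\eta_{b,\mathrm{WT\_trt}} - \eta_{b,\mathrm{WT\_untrt}}
  &= (\beta_b + \gamma_{\mathrm{WT\_trt}}) - (\beta_b + \gamma_{\mathrm{WT\_untrt}})
   = \gamma_{\mathrm{WT\_trt}} - \gamma_{\mathrm{WT\_untrt}} \\[4pt]
\eta_{b,\mathrm{GD\_untrt}} - \eta_{b,\mathrm{WT\_untrt}}
  &= (\beta_b + \gamma_{\mathrm{GD\_untrt}}) - (\beta_b + \gamma_{\mathrm{WT\_untrt}})
   = \gamma_{\mathrm{GD\_untrt}} - \gamma_{\mathrm{WT\_untrt}}
\end{aligned}
\]
```

If this is right, the batch term \beta_b cancels in both comparisons, so each contrast is a within-batch difference between two groups, and the choice of reference level does not affect it.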
I have gone through the edgeR user guide as well as A guide to creating design matrices for gene expression experiments, yet I am still asking these questions. I'm sorry...
Unsolicited info:
I ran a PCA and observed the batch effect: PC1 separated the batch 1 samples from the batch 2 samples and explained 70% of the variance.
Thank you. I think I found why the order matters.
Yes, that is a quote from A guide to creating design matrices for gene expression experiments.