Hey guys, I need some help regarding using edgeR in CRD w/ factorial arrangement experiments. The main problem here is to fit the group/design into the linear model.
To briefly present the trial design:
Main factors: 3 -> Bird type (CM or S), Age (01 or 14), vaccination status (NO or YES). That gave me a 2 x 2 x 2 factorial arrangement.
Objective: Compare the differences in gene expression in each organ (HG or TC) within each timepoint (24 or 48). For that, analyze main factors interactions or if they act together on each gene in each organ/timepoint.
edgeR approach used: Experiments with all combinations of multiple factors
The groups:
[1] CM.14.NO CM.14.NO CM.14.NO CM.14.YES CM.14.YES CM.14.YES CM.14.YES CM.01.NO CM.01.NO
[10] CM.01.NO CM.01.YES CM.01.YES CM.01.YES CM.01.YES S.14.NO S.14.NO S.14.NO S.14.NO
[19] S.14.YES S.14.YES S.14.YES S.14.YES S.01.NO S.01.NO S.01.NO S.01.YES S.01.YES
[28] S.01.YES S.01.YES S.01.YES
Levels: CM.01.NO CM.01.YES CM.14.NO CM.14.YES S.01.NO S.01.YES S.14.NO S.14.YES
The design matrix:
[1] "(Intercept)", "temp.samples$TYPES"
[3] "temp.samples$AGE14", "temp.samples$VACCYES"
[5] "temp.samples$TYPES_temp.samples$AGE14" "temp.samples$TYPES_temp.samples$VACCYES"
[7] "temp.samples$AGE14_temp.samples$VACCYES" "temp.samples$TYPES_temp.samples$AGE14_temp.samples$VACCYES"
Code:
group <- factor(paste(temp.samples$TYPE, temp.samples$AGE, temp.samples$VACC, sep = "." ))
design <- model.matrix(~ temp.samples$TYPE*temp.samples$AGE*temp.samples$VACC, data=temp.samples)
colnames(design) <- gsub(":", "_", colnames(design))
# Create DGEList and fit it in the model
temp.y <- DGEList(counts = temp.data, group = group)
temp.keep <- filterByExpr(temp.y) #filter out low expressed genes
temp.y <- temp.y[temp.keep, , keep.lib.sizes = FALSE]
temp.y <- calcNormFactors(temp.y) #normalize
temp.y <- estimateDisp(temp.y, design, robust = TRUE)
temp.y <- estimateGLMCommonDisp(temp.y, design = design)
temp.y <- estimateGLMTrendedDisp(temp.y, design = design)
temp.y <- estimateGLMTagwiseDisp(temp.y, design = design)
fit <- voomLmFit(temp.y, design, temp.samples)
colnames(fit)
# Get all levels of the factor
levels <- levels(temp.y$samples$group)
# Generate all combinations of levels
combinations <- combn(levels, 2)
# Create contrasts for all combinations
contrasts <- apply(combinations, 2, function(x) {
contrast <- paste(x[1], "-", x[2])
if (all(x %in% colnames(design))) {
# Create contrast matrix
contr.matrix <- makeContrasts(contrasts = contrast, levels = design)
} else {
message(paste("Skipping contrast:", contrast, "because not all levels exist in the design matrix."))
contr.matrix <- NULL
}
contr.matrix
})
#Apply F-Test
lrt <- glmLRT(fit, coef= 8)
> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=Portuguese_Brazil.utf8 LC_CTYPE=Portuguese_Brazil.utf8 LC_MONETARY=Portuguese_Brazil.utf8
[4] LC_NUMERIC=C LC_TIME=Portuguese_Brazil.utf8
time zone: America/Chicago
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] UpSetR_1.4.0 eulerr_7.0.0 ggplot2_3.4.4 dplyr_1.1.3 tibble_3.2.1 org.Gg.eg.db_3.18.0
[7] AnnotationDbi_1.64.0 IRanges_2.36.0 S4Vectors_0.40.0 Biobase_2.62.0 BiocGenerics_0.48.0 edgeR_4.0.0
[13] limma_3.58.0
loaded via a namespace (and not attached):
[1] utf8_1.2.4 generics_0.1.3 bitops_1.0-7 RSQLite_2.3.1 lattice_0.21-8
[6] magrittr_2.0.3 evaluate_0.23 grid_4.3.1 fastmap_1.1.1 blob_1.2.4
[11] plyr_1.8.9 GenomeInfoDb_1.38.5 DBI_1.2.1 gridExtra_2.3 httr_1.4.7
[16] fansi_1.0.5 scales_1.3.0 Biostrings_2.70.0 cli_3.6.1 rlang_1.1.1
[21] crayon_1.5.2 XVector_0.42.0 splines_4.3.1 munsell_0.5.0 bit64_4.0.5
[26] withr_3.0.0 cachem_1.0.8 tools_4.3.1 memoise_2.0.1 colorspace_2.1-0
[31] locfit_1.5-9.8 GenomeInfoDbData_1.2.11 vctrs_0.6.4 R6_2.5.1 png_0.1-8
[36] lifecycle_1.0.4 zlibbioc_1.48.0 KEGGREST_1.42.0 bit_4.0.5 pkgconfig_2.0.3
[41] gtable_0.3.4 pillar_1.9.0 glue_1.6.2 Rcpp_1.0.11 statmod_1.5.0
[46] xfun_0.41 tidyselect_1.2.0 highr_0.10 rstudioapi_0.15.0 knitr_1.45
[51] compiler_4.3.1 RCurl_1.98-1.12
The objective that you have stated
doesn't make sense because organ and timepoint are not factors in your experiment.
What is your actual objective?
Hey Gordon,
Organ and timepoint are not factors. The factors are the treatments (bird type, age, and vaccination status).
the objective is to test the interaction between the three main factors (bird type, age, and vaccination status) on each organ, per time point.