This question is based on hypothesis testing using fitted lines and Factors that are not of interest as described by Law et al 2021, page 21-29.
The questions are, (1) "What genes have a similar slope across all 3 populations in response to dose and not influenced by lane?" (2) "What genes have different slopes across all 3 populations in response to dose and not influenced by lane?"
> library(limma)
> class (Data$samples$Pop)
[1] "factor"
> class (Data$samples$lane)
[1] "factor"
> class (Data$samples$dose)
[1] "numeric"
> Design <- model.matrix(~ 0 + Pop + Pop:dose + lane , data = Data$samples)
> colnames (MHEE.design)
[1] "Pop1" "Pop2" "Pop3" "lane2" "lane3" "lane4" "lane5" "Pop1:dose" "Pop2:dose" "Pop3:dose"
> colnames(Design) <- gsub ("Pop1", "Pop1.Intercept", colnames(Design))
> colnames(Design) <- gsub ("Pop2", "Pop2.Intercept", colnames(Design))
> colnames(Design) <- gsub ("Pop3", "Pop3.Intercept", colnames(Design))
> colnames(Design) <- gsub ("Pop1:dose", "Pop1.Slope", colnames(Design))
> colnames(Design) <- gsub ("Pop2:dose", "Pop2.Slope", colnames(Design))
> colnames(Design) <- gsub ("Pop3:dose", "Pop3.Slope", colnames(Design))
> colnames(Design)
[1] "Pop1.Intercept" "Pop2.Intercept" "Pop3.Intercept" "lane2" "lane3" "lane4" "lane5" "Pop1.Slope" "Pop2.Slope" "Pop3.Slope"
> v <- voom(Data, Design)
> vfit <- lmFit(v, Design)
#"vfit <- contrasts.fit(vfit, contrasts=my.contrasts)" # if needed
> efit <- eBayes(vfit)
DE genes in columns "Pop1.Slope" "Pop2.Slope" "Pop3.Slope" will be only in reference to "lane 1" so the output will need to be evaluated with topTable() or contrasts before it can be interpreted. However, I haven not been able to find examples where 3 factors have been compared (not contrasted) in one limma voom program, especially in fitted line examples with factors that aren't of interest.
To answer question (1), what genes have the same slope across all 3 populations I have tried topTable(). However, these results clearly show us an average slope between the 3 factors. I do not believe that this answer the question.
topTable(efit, coef = c(8:10), number = Inf, adjust.method = "BH")
Another option that used in the limma handbook would be to look at the overlap in the ven diagram and look at DE genes for all 3 populations, however this approach does not block by all lanes and so I don't it can't be used before the lane effects have been accounted for.
vennDiagram(dt[,8:10], circle.col=c("red", "orange", "turquoise"))
de.common <- which(dt[,8]!=0 & dt[,9]!=0 & dt[,10]!=0)
Another approach that is commonly recommended is making contrasts. However contrasts look for differences between 2 factors. I could see how this could be used help answer question (2), genes where populations respond differently by slope, however the model doesn't answer the primary question (1).
And as a side question, if I wished to look for relationships that were logarithmic would this be the correct code for the model.matrix? It looks correct, but I still feel the need to ask.
> Design <- model.matrix(~ 0 + Pop + Pop:log(dose, 2) + lane , data = Data$samples)
I appreciate any help that can be offered.
> sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gplots_3.1.1 ggplot2_3.3.5 RColorBrewer_1.1-2 Glimma_2.2.0 pheatmap_1.0.12 edgeR_3.34.0 limma_3.48.3
loaded via a namespace (and not attached):
[1] MatrixGenerics_1.4.2 Biobase_2.52.0 httr_1.4.2 jsonlite_1.7.2 bit64_4.0.5
[6] splines_4.1.0 gtools_3.9.2 assertthat_0.2.1 BiocManager_1.30.16 stats4_4.1.0
[11] blob_1.2.2 GenomeInfoDbData_1.2.6 pillar_1.6.2 RSQLite_2.2.8 lattice_0.20-44
[16] glue_1.4.2 digest_0.6.27 GenomicRanges_1.44.0 XVector_0.32.0 colorspace_2.0-2
[21] htmltools_0.5.1.1 Matrix_1.3-3 DESeq2_1.32.0 XML_3.99-0.7 pkgconfig_2.0.3
[26] genefilter_1.74.0 zlibbioc_1.38.0 purrr_0.3.4 xtable_1.8-4 scales_1.1.1
[31] BiocParallel_1.26.2 tibble_3.1.3 annotate_1.70.0 KEGGREST_1.32.0 generics_0.1.0
[36] IRanges_2.26.0 ellipsis_0.3.2 withr_2.4.2 cachem_1.0.6 SummarizedExperiment_1.22.0
[41] BiocGenerics_0.38.0 survival_3.2-11 magrittr_2.0.1 crayon_1.4.1 memoise_2.0.0
[46] fansi_0.5.0 tools_4.1.0 lifecycle_1.0.0 matrixStats_0.60.1 S4Vectors_0.30.0
[51] munsell_0.5.0 locfit_1.5-9.4 DelayedArray_0.18.0 AnnotationDbi_1.54.1 Biostrings_2.60.2
[56] compiler_4.1.0 GenomeInfoDb_1.28.1 caTools_1.18.2 rlang_0.4.11 grid_4.1.0
[61] RCurl_1.98-1.4 rstudioapi_0.13 htmlwidgets_1.5.3 bitops_1.0-7 gtable_0.3.0
[66] DBI_1.1.1 R6_2.5.1 dplyr_1.0.7 fastmap_1.1.0 bit_4.0.4
[71] utf8_1.2.2 KernSmooth_2.23-20 parallel_4.1.0 Rcpp_1.0.7 vctrs_0.3.8
[76] geneplotter_1.70.0 png_0.1-7 tidyselect_1.1.1
Thank you for your help Gordon,
Yes, that is the paper. I have found it to be another great source of information.
The information you just posted certainly helps with the second part of the second question of the study; (2) what genes have different slopes between pops when considering lane effects.
If you are willing, we are still interested in help on how to setup the model for first question (1) what genes have the same slope when lane effects are considered.
If 2 models were used, one model to answer each question, then can the design below be used to answer the first question?
[1] "dose" "Pop1" "Pop2" "Pop3" "lane2" "lane3" "lane4" "lane5"
I'l interpret this as all genes that have a significant slope (not equal to 0) for all 3 populations, but it doesn't let me know if the slopes can be considered similar.
Thanks for the warning about gsub() code. I used gsub() in this example for clarity on interpreting the results. Does your warning mean that I am not naming the columns correctly or were you noting the typo in the example's output (which I just edited/removed).