Hi Everyone, I apologize if my question is very basic, I am kind of new to Linear models, though I have been trying to familiarize myself with it lately. I have the following setup : 4 treatments , 3 replicates each in 4 different time points. My aim is to get for each Treatment ( AK,A6,A1 ) comparisons between Timepoints ( 7 vs. 3, 14 vs. 3 and 21 vs. 3). But these comparisons should have a reference level of AN at Timepoint 3. Thus, I want for AK : 7 vs. 3, 14 vs. 3 and 21 vs. 3, and the same for A6 and A1, but all of these with AN 3 as a reference. Following this tutorial : https://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf Section 3.3.1 . I am showing R code for the design .
dd1=read.delim("designmatrix.txt",header=T)
dd1$TimePoint=relevel(as.factor(dd1$TimePoint),ref="3")
dd1$Treatment=relevel(as.factor(dd1$Treatment),ref="AN")
dd1$group=paste0(dd1$TimePoint,dd1$Treatment)
dd1
SampleID TimePoint Treatment Replicate group
1 AN0301 3 AN 1 3AN
2 AN0302 3 AN 2 3AN
3 AN0303 3 AN 3 3AN
4 AN0701 7 AN 1 7AN
5 AN0702 7 AN 2 7AN
6 AN0703 7 AN 3 7AN
7 AN1401 14 AN 1 14AN
8 AN1402 14 AN 2 14AN
9 AN1403 14 AN 3 14AN
10 AN2101 21 AN 1 21AN
11 AN2102 21 AN 2 21AN
12 AN2103 21 AN 3 21AN
13 AK0301 3 AK 1 3AK
14 AK0302 3 AK 2 3AK
15 AK0303 3 AK 3 3AK
16 AK0701 7 AK 1 7AK
17 AK0702 7 AK 2 7AK
18 AK0703 7 AK 3 7AK
19 AK1401 14 AK 1 14AK
20 AK1402 14 AK 2 14AK
21 AK1403 14 AK 3 14AK
22 AK2101 21 AK 1 21AK
23 AK2102 21 AK 2 21AK
24 AK2103 21 AK 3 21AK
25 A60301 3 A6 1 3A6
26 A60302 3 A6 2 3A6
27 A60303 3 A6 3 3A6
28 A60701 7 A6 1 7A6
29 A60702 7 A6 2 7A6
30 A60703 7 A6 3 7A6
31 A61401 14 A6 1 14A6
32 A61402 14 A6 2 14A6
33 A61403 14 A6 3 14A6
34 A62101 21 A6 1 21A6
35 A62102 21 A6 2 21A6
36 A62103 21 A6 3 21A6
37 A10301 3 A1 1 3A1
38 A10302 3 A1 2 3A1
39 A10303 3 A1 3 3A1
40 A10701 7 A1 1 7A1
41 A10702 7 A1 2 7A1
42 A10703 7 A1 3 7A1
43 A11401 14 A1 1 14A1
44 A11402 14 A1 2 14A1
45 A11403 14 A1 3 14A1
46 A12101 21 A1 1 21A1
47 A12102 21 A1 2 21A1
48 A12103 21 A1 3 21A1
mm1=model.matrix(~ 0+group,data=dd1)
colnames(mm1)
[1] "group14A1" "group14A6" "group14AK" "group14AN" "group21A1" "group21A6" "group21AK" "group21AN" "group3A1"
[10] "group3A6" "group3AK" "group3AN" "group7A1" "group7A6" "group7AK" "group7AN"
sessionInfo( )
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.2 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /nfs/software/as/el7.2/EasyBuild/CRG/software/OpenBLAS/0.3.9-GCC-9.3.0/lib/libopenblas_sandybridgep-r0.3.9.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] DESeq2_1.30.0 csaw_1.24.3
[3] SummarizedExperiment_1.20.0 Biobase_2.50.0
[5] MatrixGenerics_1.2.1 matrixStats_0.58.0
[7] GenomicRanges_1.42.0 GenomeInfoDb_1.26.2
[9] IRanges_2.24.1 S4Vectors_0.28.1
[11] BiocGenerics_0.36.0 edgeR_3.32.1
[13] limma_3.46.0
loaded via a namespace (and not attached):
[1] httr_1.4.1 bit64_0.9-7 splines_4.0.0
[4] assertthat_0.2.1 askpass_1.1 BiocFileCache_1.14.0
[7] blob_1.2.1 GenomeInfoDbData_1.2.4 Rsamtools_2.6.0
[10] progress_1.2.2 pillar_1.4.3 RSQLite_2.2.0
[13] lattice_0.20-41 glue_1.4.0 digest_0.6.25
[16] RColorBrewer_1.1-2 XVector_0.30.0 colorspace_1.4-1
[19] Matrix_1.2-18 XML_3.99-0.3 pkgconfig_2.0.3
[22] biomaRt_2.46.2 genefilter_1.72.1 zlibbioc_1.36.0
[25] purrr_0.3.4 xtable_1.8-4 scales_1.1.0
[28] BiocParallel_1.24.1 tibble_3.0.1 openssl_1.4.1
[31] annotate_1.68.0 ggplot2_3.3.0 ellipsis_0.3.0
[34] GenomicFeatures_1.42.1 survival_3.1-12 magrittr_1.5
[37] crayon_1.3.4 memoise_1.1.0 xml2_1.3.2
[40] tools_4.0.0 prettyunits_1.1.1 hms_0.5.3
[43] lifecycle_0.2.0 stringr_1.4.0 munsell_0.5.0
[46] locfit_1.5-9.4 DelayedArray_0.16.1 AnnotationDbi_1.52.0
[49] Biostrings_2.58.0 compiler_4.0.0 rlang_0.4.5
[52] grid_4.0.0 RCurl_1.98-1.2 rappdirs_0.3.1
[55] bitops_1.0-6 gtable_0.3.0 DBI_1.1.0
[58] curl_4.3 R6_2.4.1 GenomicAlignments_1.26.0
[61] dplyr_0.8.5 rtracklayer_1.50.0 bit_1.1-15.2
[64] stringi_1.4.6 Rcpp_1.0.4.6 vctrs_0.2.4
[67] geneplotter_1.68.0 dbplyr_1.4.3 tidyselect_1.0.0
Now my idea is to use contrasts like group7AK-group3AK, group14AK-group3AK and so on. My question is : Is this setup correct ? Would it actually show changes with respect to the reference level of group3AN ? Also since these are time points, is it ok to use TimePoint as a factor ?
Thank you very much, Debayan
Hi Gordon, Thank you very much for the clarification. If I created the model matrix with an intercept, i.e. mm1=model.matrix(~ group,data=dd1) , would my comparisons like group7AK-group3AK be with reference to the reference level of group ( lets say group3AN )? I mean the value of the intercept would be the mean of the reference level of the group ? Thanks again, Debayan
There is no such thing as making a comparison relative to a reference level. I think you might be misinterpreting something you've read. The comparison of 7AK vs 3AK is just that, and will be same regardless of whether there is an intercept in the model or not.
Hi Gordon, Thanks, I understand now the confusion I was having. Thanks again, Debayan