Dear Friends,
I needed to fit two separate models, one for each technical variable (Slide and Plate), because I got an error when I included both in the same model.
Example:
mod1 <- model.matrix(~ AP1 + Age + factor(Gender) + factor(smoker) + factor(Slide), data = targets)
mod0 <- model.matrix(~ Age + factor(Gender) + factor(smoker) + factor(Slide), data = targets)
n.sv <- num.sv(data, mod1, method = "be") #SVA = 17
mod2 <- model.matrix(~ AP1 + Age + factor(Gender) + factor(smoker) + factor(Plate) + Array, data = targets)
mod02 <- model.matrix(~ Age + factor(Gender) + factor(smoker) + factor(Plate) + Array, data = targets)
n.sv <- num.sv(data, mod2, method = "be") #SVA = 8
I then ran ComBat twice:
modcombat <- model.matrix(~1, data = targets)
adj1 <- ComBat(dat = data, batch = targets$Slide, mod = NULL)
adj2 <- ComBat(dat = adj1, batch = targets$Plate, mod = NULL)
I then re-estimated the number of surrogate variables on the adjusted data, with Slide and Plate removed from the model:
mod3 <- model.matrix(~ AP1 + Age + factor(Gender) + factor(smoker) + Array, data = targets)
mod03 <- model.matrix(~ Age + factor(Gender) + factor(smoker) + Array, data = targets)
n.sv <- num.sv(adj2, mod3, method = "be") #SVA = 57
I am concerned about the 57 surrogate variables estimated after running ComBat twice. Does anyone have ideas or suggestions for how I might resolve this? I should mention that my sample size is n = 160. I wonder whether the increase in the number of hidden variables comes from the fact that some slides have only one case, while others have three. One of my colleagues ran the same model with no problem; however, their sample size was n = 850.
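A minimal sketch of how the batch layout could be checked, in case it helps with the diagnosis. It uses a small made-up stand-in for `targets` (the Slide and Plate values below are hypothetical, not the real data) and base R only. It counts samples per slide, checks whether each slide sits on a single plate, and tests whether a design containing both factors is full rank. Whether a rank-deficient design is what caused the original error is an assumption, not something confirmed here.

```r
# Toy stand-in for `targets` (hypothetical values, not the real study data)
targets <- data.frame(
  Slide = c("S1", "S2", "S2", "S2", "S3", "S3", "S4", "S4"),
  Plate = c("P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2")
)

# Number of samples on each slide; a slide with a single sample
# provides no within-batch variance estimate
slide_sizes <- table(targets$Slide)
singleton_slides <- names(slide_sizes)[slide_sizes == 1]

# Does every slide appear on exactly one plate (Slide nested in Plate)?
plates_per_slide <- rowSums(table(targets$Slide, targets$Plate) > 0)
nested <- all(plates_per_slide == 1)

# With Slide nested in Plate, the Plate columns are linear combinations
# of the Slide columns, so the combined design is not full rank
design_both <- model.matrix(~ factor(Slide) + factor(Plate), data = targets)
full_rank <- qr(design_both)$rank == ncol(design_both)

print(slide_sizes)
print(singleton_slides)
print(nested)
print(full_rank)
```

On the real data, the same calls would be run on the actual `targets` data frame; `singleton_slides`, `nested`, and `full_rank` are illustrative names introduced for this sketch.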
Thank you for your help.
Jonelle Villar, RN
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sva_3.26.0 BiocParallel_1.12.0 genefilter_1.60.0 mgcv_1.8-23 nlme_3.1-131.1
[6] forcats_0.3.0 stringr_1.3.0 dplyr_0.7.4 purrr_0.2.4 readr_1.1.1
[11] tidyr_0.8.0 tibble_1.4.2 ggplot2_2.2.1 tidyverse_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.16 lubridate_1.7.2 lattice_0.20-35 digest_0.6.15 assertthat_0.2.0
[6] psych_1.8.3.3 R6_2.2.2 cellranger_1.1.0 plyr_1.8.4 stats4_3.4.4
[11] RSQLite_2.1.0 httr_1.3.1 pillar_1.2.1 rlang_0.2.0 lazyeval_0.2.1
[16] readxl_1.0.0 rstudioapi_0.7 annotate_1.56.2 blob_1.1.1 S4Vectors_0.16.0
[21] Matrix_1.2-13 splines_3.4.4 foreign_0.8-69 RCurl_1.95-4.10 bit_1.1-12
[26] munsell_0.4.3 broom_0.4.4 compiler_3.4.4 modelr_0.1.1 pkgconfig_2.0.1
[31] BiocGenerics_0.24.0 mnormt_1.5-5 matrixStats_0.53.1 IRanges_2.12.0 XML_3.98-1.10
[36] crayon_1.3.4 bitops_1.0-6 grid_3.4.4 jsonlite_1.5 xtable_1.8-2
[41] gtable_0.2.0 DBI_0.8 magrittr_1.5 scales_0.5.0 cli_1.0.0
[46] stringi_1.1.7 reshape2_1.4.3 bindrcpp_0.2.2 limma_3.34.9 xml2_1.2.0
[51] tools_3.4.4 bit64_0.9-7 Biobase_2.38.0 glue_1.2.0 hms_0.4.2
[56] survival_2.41-3 parallel_3.4.4 yaml_2.1.18 AnnotationDbi_1.40.0 colorspace_1.3-2
[61] rvest_0.3.2 memoise_1.1.0 bindr_0.1.1 haven_1.1.1