Hi,
I've got some blood 450K methylation data and I've used the estimateCellCounts function in Minfi to include as terms in my model for a differential methylation test. I just wanted to be sure that what I'm doing is correct - I want to get adjust beta values for visualisations.
Code:
foo <- estimateCellCounts(rgSet = raw.idat, meanPlot = T, returnAll = T, fixOutliers = T, mergeManifest = T) treatments <- unique(pData(lumi.norm.filtered)$SampleType) treatment_arrays <- as.factor(pData(lumi.norm.filtered)$SampleType) cell_counts <- as.data.frame(foo$counts) design <- model.matrix(~0 + treatment_arrays + cell_counts$CD8T + cell_counts$CD4T + cell_counts$NK + cell_counts$Bcell + cell_counts$Mono + cell_counts$Gran) colnames(design) <- c(treatments, colnames(cell_counts)) fitM <- lmFit(getM(lumi.norm.filtered), design) #Adjust M Values mAdj.in <- getM(lumi.norm.filtered) mAdj.fit <- fitM$coefficients[,-c(1,2)] mAdj <- as.matrix(mAdj.in) - mAdj.fit %*% t(design[,-c(1,2)]) #Use Lumi's M value to Beta Value conversion betaAdj <- m2beta(mAdj) #or #betaAdj <- ilogit2(mAdj)
Does that seem logical/ sane?
> sessionInfo() R version 3.2.1 (2015-06-18) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 14.04.2 LTS locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] grid stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] FlowSorted.Blood.450k_1.6.0 RPMM_1.20 cluster_2.0.3 [4] dynamicTreeCut_1.62 impute_1.42.0 IlluminaHumanMethylation450kanno.ilmn12.hg19_0.2.1 [7] IlluminaHumanMethylation450kmanifest_0.4.0 biomaRt_2.24.0 GenomicFeatures_1.20.1 [10] AnnotationDbi_1.30.1 biovizBase_1.16.0 ggbio_1.16.1 [13] cowplot_0.5.0 gridExtra_2.0.0 doParallel_1.0.8 [16] scales_0.2.5 reshape2_1.4.1 ggplot2_1.0.1 [19] limma_3.24.14 lumi_2.20.2 minfi_1.14.0 [22] bumphunter_1.8.0 locfit_1.5-9.1 iterators_1.0.7 [25] foreach_1.4.2 Biostrings_2.36.1 XVector_0.8.0 [28] GenomicRanges_1.20.5 GenomeInfoDb_1.4.1 IRanges_2.2.5 [31] S4Vectors_0.6.2 lattice_0.20-33 Biobase_2.28.0 [34] BiocGenerics_0.14.0 loaded via a namespace (and not attached): [1] nlme_3.1-121 bitops_1.0-6 matrixStats_0.14.2 RColorBrewer_1.1-2 tools_3.2.1 doRNG_1.6 [7] nor1mix_1.2-0 affyio_1.36.0 rpart_4.1-10 KernSmooth_2.23-15 Hmisc_3.16-0 DBI_0.3.1 [13] mgcv_1.8-7 colorspace_1.2-6 nnet_7.3-10 methylumi_2.14.0 GGally_0.5.0 base64_1.1 [19] preprocessCore_1.30.0 graph_1.46.0 pkgmaker_0.22 labeling_0.3 rtracklayer_1.28.6 genefilter_1.50.0 [25] quadprog_1.5-5 affy_1.46.1 RBGL_1.44.0 stringr_1.0.0 digest_0.6.8 Rsamtools_1.20.4 [31] foreign_0.8-65 illuminaio_0.10.0 siggenes_1.42.0 GEOquery_2.34.0 dichromat_2.0-0 BSgenome_1.36.2 [37] RSQLite_1.0.0 BiocInstaller_1.18.4 mclust_5.0.2 BiocParallel_1.2.9 acepack_1.3-3.3 VariantAnnotation_1.14.6 [43] RCurl_1.95-4.7 magrittr_1.5 Formula_1.2-1 futile.logger_1.4.1 Matrix_1.2-2 Rcpp_0.11.6 [49] munsell_0.4.2 proto_0.3-10 stringi_0.5-5 nleqslv_2.8 MASS_7.3-43 zlibbioc_1.14.0 [55] plyr_1.8.3 splines_3.2.1 multtest_2.24.0 annotate_1.46.1 beanplot_1.2 rngtools_1.2.4 [61] codetools_0.2-14 futile.options_1.0.0 XML_3.98-1.3 latticeExtra_0.6-26 lambda.r_1.1.7 gtable_0.1.2 [67] reshape_0.8.5 xtable_1.7-4 survival_2.38-3 OrganismDbi_1.10.0 GenomicAlignments_1.4.1 registry_0.3
Hi Kasper:
Just wanted to point out that you might want to update the
estimateCellCounts.Rd
file to provide the correct citation/reference info for the "Accounting for cellular heterogeneity is critical in epigenome-wide association studies" paper. Currently it says that the paper is "Under Review"