Question

Interpretation of voom limma's intercept coefficient

0

Entering edit mode

rubi ▴ 110

@rubi-6462

Last seen 6.5 years ago

Hi,

I have bacterial RNA-seq gene expression count data from from 4 different conditions (3 samples each), which differ from each other in the dosage of a certain gene which I manipulated.

Here's an example of the count data for three genes:

mat <- matrix(c(520,315,490,420,640,875,670,700,280,450,440,370,
5,6,1,3,10,6,10,5,2,7,4,1,
103230,112810,65160,96770,70830,108440,118970,98650,65500,117870,118630,78330),nrow=3)
colnames(mat) <- c("C_1","D_1","B_1","A_1","A_2","C_2","B_2","D_2","C_3","B_3","D_3","A_3")
rownames(mat) <- c("gene_1","gene_2","gene_3")

Here are the corresponding dosages (they are scaled such that the maximum is 1=100% and the minimum is 0=0%) of each sample:

dosage <- c(0.65,0,0.35,1,1,0.65,0.35,0,0.65,0.35,0,1)

And hence the design is:
design <- model.matrix(~dosage)

And then I fit voom:

dge <- DGEList(counts=mat)
dge.norm <- calcNormFactors(dge)
voom.norm <- voom(dge.norm,design,plot=TRUE)
voom.fit <- lmFit(voom.norm,design)
voom.fit <- eBayes(voom.fit)

My question is how to interpret the intercept of voom.fit$coefficients:

> voom.fit
An object of class "MArrayLM"
$coefficients
(Intercept) dosage
gene_1 18.53699 0.200114684
gene_2 18.45030 0.328356157
gene_3 18.13036 0.007383767

I take it that the dosage coefficient represent the log2 change in response (expression) to a unit of dosage, But the intercept values don't really seem like the mean log2 expression values.

Help would be appreciated,

rubi

Here's my sessionInfo:

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] data.table_1.9.6 piano_1.12.0 gridExtra_2.2.1 ggplot2_2.1.0 edgeR_3.14.0 limma_3.28.5

loaded via a namespace (and not attached):
[1] igraph_1.0.1 Rcpp_0.12.5 cluster_2.0.4 magrittr_1.5 BiocGenerics_0.18.0 munsell_0.4.3
[7] colorspace_1.2-6 plyr_1.8.4 caTools_1.17.1 tools_3.3.0 parallel_3.3.0 Biobase_2.32.0
[13] gtable_0.2.0 KernSmooth_2.23-15 marray_1.50.0 gtools_3.5.0 digest_0.6.9 sets_1.0-16
[19] bitops_1.0-6 slam_0.1-34 labeling_0.3 gdata_2.17.0 gplots_3.0.1 scales_0.4.0
[25] relations_0.6-6 chron_2.3-47

limma voom intercept coefficient • 1.3k views

ADD COMMENT • link updated 8.5 years ago by James W. MacDonald 67k • written 8.5 years ago by rubi ▴ 110

score 3 · Accepted Answer · 2016-07-25

Since you are fitting dosage as a continuous variable, the intercept is the 'usual' intercept from the formula for a line. It's the (expected, because there are errors) logCPM value for a particular gene when the dosage is zero.

> head(cpm(dge, log = T))
            C_1      D_1      B_1      A_1      A_2      C_2      B_2      D_2
gene_1 18.58216 17.72771 18.63134 18.44614 18.66854 17.26862 19.16603 19.15396
gene_2 17.85901 18.33538 18.69453 18.41372 18.93157 19.00557 18.16604 18.34661
gene_3 18.49643 18.78659 17.37262 18.16375 16.34664 18.26861 16.84413 16.34664
            C_3      B_3      D_3      A_3
gene_1 18.48584 18.41933 18.68075 18.51419
gene_2 18.61387 17.96913 18.41055 18.52347
gene_3 17.82204 18.58359 17.81973 17.92464