Search
Question: Interpretation of voom limma's intercept coefficient
0
gravatar for rubi
23 months ago by
rubi90
rubi90 wrote:

Hi,

I have bacterial RNA-seq gene expression count data from from 4 different conditions (3 samples each), which differ from each other in the dosage of a certain gene which I manipulated.

 

Here's an example of the count data for three genes:

mat <- matrix(c(520,315,490,420,640,875,670,700,280,450,440,370,
                5,6,1,3,10,6,10,5,2,7,4,1,
                103230,112810,65160,96770,70830,108440,118970,98650,65500,117870,118630,78330),nrow=3)
colnames(mat) <- c("C_1","D_1","B_1","A_1","A_2","C_2","B_2","D_2","C_3","B_3","D_3","A_3")
rownames(mat) <- c("gene_1","gene_2","gene_3")

 

Here are the corresponding dosages (they are scaled such that the maximum is 1=100% and the minimum is 0=0%) of each sample:

dosage <- c(0.65,0,0.35,1,1,0.65,0.35,0,0.65,0.35,0,1)

And hence the design is:
design <- model.matrix(~dosage)

And then I fit voom:

dge <- DGEList(counts=mat)
dge.norm <- calcNormFactors(dge)
voom.norm <- voom(dge.norm,design,plot=TRUE)
voom.fit <- lmFit(voom.norm,design)
voom.fit <- eBayes(voom.fit)

 

My question is how to interpret the intercept of voom.fit$coefficients:

> voom.fit
An object of class "MArrayLM"
$coefficients
       (Intercept)      dosage
gene_1    18.53699 0.200114684
gene_2    18.45030 0.328356157
gene_3    18.13036 0.007383767

I take it that the dosage coefficient represent the log2 change in response (expression) to a unit of dosage, But the intercept values don't really seem like the mean log2 expression values.

Help would be appreciated,

rubi

 

Here's my sessionInfo:

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.6 piano_1.12.0     gridExtra_2.2.1  ggplot2_2.1.0    edgeR_3.14.0     limma_3.28.5    

loaded via a namespace (and not attached):
 [1] igraph_1.0.1        Rcpp_0.12.5         cluster_2.0.4       magrittr_1.5        BiocGenerics_0.18.0 munsell_0.4.3      
 [7] colorspace_1.2-6    plyr_1.8.4          caTools_1.17.1      tools_3.3.0         parallel_3.3.0      Biobase_2.32.0     
[13] gtable_0.2.0        KernSmooth_2.23-15  marray_1.50.0       gtools_3.5.0        digest_0.6.9        sets_1.0-16        
[19] bitops_1.0-6        slam_0.1-34         labeling_0.3        gdata_2.17.0        gplots_3.0.1        scales_0.4.0       
[25] relations_0.6-6     chron_2.3-47       

 

 

 

 

 

 

ADD COMMENTlink modified 23 months ago by James W. MacDonald46k • written 23 months ago by rubi90
3
gravatar for James W. MacDonald
23 months ago by
United States
James W. MacDonald46k wrote:

Since you are fitting dosage as a continuous variable, the intercept is the 'usual' intercept from the formula for a line. It's the (expected, because there are errors) logCPM value for a particular gene when the dosage is zero.

> head(cpm(dge, log = T))
            C_1      D_1      B_1      A_1      A_2      C_2      B_2      D_2
gene_1 18.58216 17.72771 18.63134 18.44614 18.66854 17.26862 19.16603 19.15396
gene_2 17.85901 18.33538 18.69453 18.41372 18.93157 19.00557 18.16604 18.34661
gene_3 18.49643 18.78659 17.37262 18.16375 16.34664 18.26861 16.84413 16.34664
            C_3      B_3      D_3      A_3
gene_1 18.48584 18.41933 18.68075 18.51419
gene_2 18.61387 17.96913 18.41055 18.52347
gene_3 17.82204 18.58359 17.81973 17.92464
ADD COMMENTlink written 23 months ago by James W. MacDonald46k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 193 users visited in the last hour