Draw volcano plot by modifying contrast matrix (lumi package)
2
0
Entering edit mode
@marongiuluigi-7134
Last seen 9 months ago
European Union

Dear all,
I am new to the use of microarray analysis; I have been looking at the use of lumi for the analysis of BeadArray chips.
I have used the tutorial by Du, Feng, Kibbe and Lin (2010) (http://www.bioconductor.org/packages//2.7/bioc/vignettes/lumi/inst/doc/lumi.pdf) to learn the process. In that vignette, the authors use the packages lumi and limma to normalize the data and then fit the linear method (creating an object 'fit') that can identify the most expressed genes. The data come from the example.lumi dataset contained in the package lumiBarnes.
I then used the tutorial by Gillespie (http://bioinformatics.knowledgeblog.org/2011/06/21/volcano-plots-of-microarray-data/) to learn how to draw a volcano plot. It essentially plots the fold change against the -log of the p-value. When I applied this approach to the data of the lumi tutorial, however, I did not obtain the classical 'V' shaped profile but rather a '√' shape. This 'shape alteration' might be inherent to the data, but I reckon it codued be do to the fitting method applied. Both the fitting method and the plotting procedure are based on the topTable function and this is derived from the model.matrix function, thus if the analysis has produced the observed skewness in the data is the model.matrix that I need to modify. In the vignette by Gillespie, the contrast.matrix and contrasts.fit(fit, contrast.matrix) are used instead of model.matrix.
My questions are then: (a) how can I customize the contrast.matrix? (b) how can I draw a volcano plot of the data? (maybe I simply need to change the fitting method to obtain the expected 'V' shape)

Thank you,

The code follows.

###
# lumi analysis
library(lumi)
library(lumiBarnes)
library(limma)
data("example.lumi")
lumi_t <-lumiT(example.lumi)
lumi_n <- lumiN(lumi_t, method='rsn')
dataMatrix <- exprs(lumi_n)
probeList <- rownames(dataMatrix)
sampleType <- c('100:0', '95:5', '100:0', '95:5')
design <- model.matrix(~ factor(sampleType))
colnames(design) <- c('100:0', '95:5-100:0')
fit <- lmFit(dataMatrix, design)
fit <- eBayes(fit)
# print most expressed genes
# volcano plot
gene_list <- topTable(fit, coef='95:5-100:0', number=1000000, sort.by="logFC")
plot(gene_list$logFC, -log10(gene_list$P.Value))
volcanoplot design and contrast matrix • 1.0k views
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

The results will be identical if you fit the way you have, or if you fit a model that requires a contrasts matrix. Algebraically they are identical. But do note that you (usually) use model.matrix regardless.

design <- model.matrix(~ 0 + factor(sampleType))
contrast <- matrix(c(-1,1), ncol = 1)
fit <- lmFit(dataMatrix, design)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)
volcanoplot(fit2)

0
Entering edit mode
@gordon-smyth
Last seen 4 hours ago
WEHI, Melbourne, Australia

I think the asymmetric volcano plot is inherent to the data here, not a problem with the processing pipeline. Note that you could have made the volcano plot by

volcanoplot(fit, coef="95:5-100:0")

You might also like to look at:

plotMD(fit, coef="95:5-100:0")

which I think is a lot more informative. It would appear that there a number of genes that are almost absent at 100:0 but strongly expressed at 95:5.

BTW, this data has a clear batch effect (A vs B) that would have to be accounted for in the design matrix, if this was a real analysis.