Question: limma-voom when testing differential expression of a subset of genes.
2
4.3 years ago by
United Kingdom
ankur.chakravarthy.1040 wrote:

Dear list users,

I am trying to test if a given, predefined set of genes is differentially expressed between two groups in an RNA-seq experiment (count data, using voom) . I am therefore interested in only evaluating the hypothesis of differential expression within my set, so as to limit the burden of multiple-testing, for instance.

In light of this, which of the following is likely to be the most appropriate approach.

1) reduce the counts matrix to my gene set first, estimate mean-variance weights using voom, go on to use lmFit, eBayes and topTable to identify DEGs

or

2) model the mean-variance relationship across the whole matrix of counts with voom, then subset the voom object to only include my genes of interest, and then use lmFit, eBayes and topTable.

or

3) Carry out mean-variance modelling and shrinkage on the whole dataset, then subset the topTable to my gene set and re-estimate false discovery rates from the raw P values, since the only hypotheses I intended to test were those related to my subset of genes.

In terms of numbers, I am looking at a subset of 153 genes out of a total of 20,000.

Best wishes,

Ankur Chakravarthy

modified 4.3 years ago by Gordon Smyth38k • written 4.3 years ago by ankur.chakravarthy.1040
Answer: limma-voom when testing differential expression of a subset of genes.
3
4.3 years ago by
United States
James W. MacDonald51k wrote:

You should use #3, as both voom() and eBayes() are intended to use all available data to make their estimates. And do note that MArrayLM objects can be subsetted as if they were data.frames or matrices. So if you do

v <- voom(dglst.object)
fit <- lmFit(v, design)
fit2 <- contrasts.fit(fit, contrast) ## if necessary
fit2 <- eBayes(fit2)
fit2 <- fit2[<indicator variable to subset>,]
topTable(fit2, 1)

then the BH adjustment for the genes in your topTable will be based on just the subset of genes you care about.

Answer: limma-voom when testing differential expression of a subset of genes.
3
4.3 years ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth38k wrote:

Just to add to Jim's answer, you should filter out unexpressed or very low expressed genes before running voom. So #3 is correct provided that the "whole dataset" is interpreted to mean all expressed genes.