Is it valid to use only genes of interest before eBayes step in limma?
minabashir • 0

Hi all,

I used Agilent microarrays to study gene expression, but I'm only interested in non-coding genes. When is the best time to get rid of all the coding genes? Currently, I do this after fitting the linear model, but before eBayes:

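library(limma)

# Fit the linear model and the contrasts on all probes
fit <- lmFit(E, design = design)
contr <- makeContrasts(A - B, C - B, A - C, levels = design)
fit <- contrasts.fit(fit, contrasts = contr)

# Keep only the non-coding probes, then run eBayes on that subset
nc <- fit[grepl("antisense|non-coding|pseudogene|vault|non-protein", fit$genes$Description), ]
nc.eb <- eBayes(nc)
write.fit(nc.eb, adjust = "BH", file = "")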

Is this valid? If not, how would you proceed?

Thank you for your help,


Aaron Lun ★ 28k

The idea of the EB step is to share information across genes to estimate the variance. Even if you aren't (biologically) interested in protein-coding genes, they still provide some useful (statistical) information required for variance estimation. In contrast, if you only have 5 non-coding genes in your nc object, there's not a lot of information to share. Using all genes improves the reliability of the shrinkage statistics (i.e., the estimate of the prior variance and degrees of freedom) and of the downstream DE analysis.
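To get a feel for how much the number of genes matters, here is a rough simulation sketch using limma's squeezeVar(), which is the variance-shrinkage routine that eBayes relies on. All of the numbers (5000 probes, 6 residual degrees of freedom, the "true" prior values) are made up for illustration and are not taken from your data.

```r
library(limma)

set.seed(42)
n.genes  <- 5000   # hypothetical number of probes on the array
df.resid <- 6      # hypothetical residual degrees of freedom per gene
d0 <- 4            # assumed true prior degrees of freedom
s0 <- 0.3          # assumed true prior variance

# Draw true gene-wise variances from a scaled inverse chi-square prior,
# then draw the observed sample variances around them.
sigma2 <- s0 * d0 / rchisq(n.genes, df = d0)
s2     <- sigma2 * rchisq(n.genes, df = df.resid) / df.resid

sq.all <- squeezeVar(s2, df = df.resid)        # prior estimated from all genes
sq.few <- squeezeVar(s2[1:5], df = df.resid)   # prior estimated from 5 genes only

# Compare the estimated prior variance and prior df with the assumed truth
c(all = sq.all$var.prior, few = sq.few$var.prior, truth = s0)
c(all = sq.all$df.prior,  few = sq.few$df.prior,  truth = d0)
```

If you rerun this with different seeds, the estimates from the 5-gene subset should jump around far more than those from the full set, which is the practical reason for keeping all genes in the EB step.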

In summary, I would only filter out uninteresting genes after eBayes. Then you can have your cake (reliable variance estimates) and eat it too (fewer tests during multiplicity correction). Note that you should have already filtered out low-abundance genes, as these won't provide much information for variance estimation anyway.
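A minimal sketch of that order of operations is below. It runs on simulated data, so the expression matrix, the group layout and the Description annotation are all hypothetical stand-ins for your own objects; only the sequence of limma calls is the point.

```r
library(limma)

set.seed(1)
n.genes <- 1000
groups <- factor(rep(c("A", "B", "C"), each = 3))
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

# Simulated log-expression values (hypothetical, for illustration only)
E <- matrix(rnorm(n.genes * length(groups)), nrow = n.genes)
rownames(E) <- paste0("probe", seq_len(n.genes))

# Hypothetical annotation: every tenth probe is labelled as non-coding
genes <- data.frame(
  Description = ifelse(seq_len(n.genes) %% 10 == 0, "non-coding RNA", "protein coding"),
  stringsAsFactors = FALSE
)
eset <- new("EList", list(E = E, genes = genes))

fit <- lmFit(eset, design = design)
contr <- makeContrasts(A - B, C - B, A - C, levels = design)
fit <- contrasts.fit(fit, contrasts = contr)

# Run eBayes on ALL genes so the prior is estimated from the full set
fit.eb <- eBayes(fit)

# Only now subset to the genes of interest
keep <- grepl("antisense|non-coding|pseudogene|vault|non-protein", fit.eb$genes$Description)
nc.eb <- fit.eb[keep, ]

# The BH adjustment here is applied over the retained genes only
tab <- topTable(nc.eb, coef = 1, adjust.method = "BH", number = 5)
tab
```

Subsetting the fit after eBayes keeps the moderated t-statistics and p-values computed with the full-data prior, while topTable only corrects for the tests that remain in nc.eb.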

