Is it valid to use only genes of interest before eBayes step in limma?
minabashir • 0
@minabashir-11009

Hi all,

I used Agilent microarrays to study gene expression, but I'm only interested in non-coding genes. When is the best time to remove the coding genes? Currently, I do this after fitting the linear model but before eBayes:


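# Fit the linear model on all probes and set up the pairwise contrasts
fit <- lmFit(E, design = design)
contr <- makeContrasts(A-B, C-B, A-C, levels = design)
fit <- contrasts.fit(fit, contrasts = contr)

# Keep only the non-coding probes, then run eBayes on that subset
nc <- fit[grepl("antisense|non-coding|pseudogene|vault|non-protein", fit$genes$Description), ]
nc.eb <- eBayes(nc)

# Write the results with BH-adjusted p-values
write.fit(nc.eb, adjust = "BH", file = "fit.nc.txt")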

Is this valid? If not, how would you proceed?

Thank you for your help,

Mina

limma genefilter • 1.2k views
Aaron Lun ★ 28k
@alun

The idea of the EB step is to share information across genes to estimate the variance. Even if you aren't (biologically) interested in protein-coding genes, they still provide some useful (statistical) information required for variance estimation. In contrast, if you only have 5 non-coding genes in your nc object, there's not a lot of information to share. Using all genes improves the reliability of the shrinkage statistics (i.e., the estimate of the prior variance and degrees of freedom) and of the downstream DE analysis.

In summary, I would only filter out uninteresting genes after eBayes. Then you can have your cake (reliable variance estimates) and eat it too (fewer tests during multiplicity correction). Note that you should have already filtered out low-abundance genes, as these won't provide much information for variance estimation anyway.
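As a rough sketch of that ordering, something like the following should work. The simulated expression matrix, the group layout, and the made-up Description column are assumptions for illustration only; in the real analysis, E, design and fit$genes would come from the Agilent data. The only change from the code in the question is that eBayes runs on the full fit and the subsetting happens afterwards.

```r
library(limma)

set.seed(1)

# Simulated stand-in for the real data (assumption): 1000 probes,
# three groups (A, B, C) with three arrays each.
n.genes <- 1000
group <- factor(rep(c("A", "B", "C"), each = 3))
E <- matrix(rnorm(n.genes * 9), nrow = n.genes,
            dimnames = list(paste0("probe", seq_len(n.genes)), NULL))

# Hypothetical annotation: the first 50 probes are labelled non-coding.
genes <- data.frame(
  Description = rep(c("non-coding RNA", "protein coding"),
                    c(50, n.genes - 50)),
  stringsAsFactors = FALSE
)

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

fit <- lmFit(E, design = design)
fit$genes <- genes
contr <- makeContrasts(A - B, C - B, A - C, levels = design)
fit <- contrasts.fit(fit, contrasts = contr)

# Run eBayes on ALL probes, so the prior variance and prior degrees of
# freedom are estimated from the full set of genes.
eb <- eBayes(fit)

# Only now subset to the probes of interest.
keep <- grepl("antisense|non-coding|pseudogene|vault|non-protein",
              eb$genes$Description)
nc.eb <- eb[keep, ]

# The BH adjustment is computed over the retained probes only.
top.nc <- topTable(nc.eb, coef = 1, number = Inf, adjust.method = "BH")
```

Calling write.fit(nc.eb, adjust = "BH", file = "fit.nc.txt") on the subsetted object, as in the question, would likewise adjust over the non-coding probes only.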
