Finding DEGs in a dataset with a large sample size
Last seen 7.6 years ago


I want to know the best way to find differentially expressed genes (DEGs) in a microarray dataset with 200 samples. I used limma for this, but the Bioconductor book mentions that when sample sizes are moderate or large, say ten or more in each group, there is generally no advantage to using the Bayesian approach.

So what is the best approach, tool, or package for my work?

limma • 831 views
Aaron Lun ★ 28k
Last seen 1 hour ago
The city by the bay

The real question is how many residual degrees of freedom you have in your model, rather than the number of samples. The residual d.f. determines how much information is available to estimate the variance - the more d.f., the more reliably you can estimate the variance of each gene. If you have lots of residual d.f., using empirical Bayes methods to share information between genes will not provide much benefit, because you can already estimate the variance fairly well using only the information for each gene by itself.
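
To make the information sharing concrete, here is a minimal sketch of the kind of variance shrinkage limma's empirical Bayes step performs: the moderated variance is an average of a prior variance and the gene's own variance, weighted by their respective degrees of freedom. The function name and the prior values (`prior_var`, `prior_df`) are made up for illustration; in a real analysis limma estimates the prior from all genes.

```python
def moderated_variance(gene_var, resid_df, prior_var, prior_df):
    """Average the prior variance and the gene-wise variance,
    weighting each by its degrees of freedom."""
    return (prior_df * prior_var + resid_df * gene_var) / (prior_df + resid_df)

# Hypothetical numbers: a gene whose sample variance is 4x the prior variance.
prior_var, prior_df = 0.05, 4.0
gene_var = 0.20

few_df = moderated_variance(gene_var, 2, prior_var, prior_df)
many_df = moderated_variance(gene_var, 198, prior_var, prior_df)

print(f"2 residual d.f.:   moderated variance = {few_df:.4f}")
print(f"198 residual d.f.: moderated variance = {many_df:.4f}")
```

With few residual d.f. the estimate is pulled strongly toward the prior; with many residual d.f. it stays close to the gene's own variance, which is why the benefit of shrinkage fades.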

In your case, 200 samples is quite large, but if your model has 198 parameters, then you only have 2 residual d.f., in which case you would benefit from EB shrinkage. If your model has only 2 parameters, then you'd have 198 residual d.f., and the benefit of shrinkage would be lessened. However, it doesn't hurt to do EB shrinkage - there's just less benefit from doing so - so I would just continue using limma.
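
The arithmetic above can be sketched as follows. This assumes a full-rank design matrix, so each model parameter costs one degree of freedom, and it uses a hypothetical prior d.f. of 4 purely to show how the relative weight of the prior changes; neither helper is part of limma.

```python
def residual_df(n_samples, n_params):
    """Residual d.f. for a linear model, assuming a full-rank design."""
    if n_params > n_samples:
        raise ValueError("more parameters than samples")
    return n_samples - n_params

def prior_weight(resid_df, prior_df):
    """Fraction of the moderated variance contributed by the prior."""
    return prior_df / (prior_df + resid_df)

PRIOR_DF = 4.0  # hypothetical value, for illustration only

for n_params in (198, 2):
    d = residual_df(200, n_params)
    w = prior_weight(d, PRIOR_DF)
    print(f"{n_params} parameters: {d} residual d.f., prior weight = {w:.3f}")
```

Under these assumptions the prior dominates when only 2 residual d.f. remain and contributes only a small fraction when 198 remain, so shrinkage changes little in the second case but does no harm.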

