Question

Sample power selection for limma analysis in R

svlachavas ▴ 840

@svlachavas-7225

Germany/Heidelberg/German Cancer Resear…

Hi to all,

I have preprocessed a large data set of 34 CEL files and I want to compare some important factors taken from the phenoData object. One of my major concerns about using limma with more than one variable is that one specific variable, which I want to test for differential expression between metastatic and non-metastatic cancer samples, is badly unbalanced: there are only two metastatic cancer samples against 15 non-metastatic ones. In other words, I suspect the analysis may be inappropriate because of a possible lack of statistical power, given the much smaller number of samples in one condition. Also, this is a multifactor experiment with two variables (including the one I mentioned, the metastatic factor). Any suggestions or directions?

limma linear model • 1.8k views

updated 11.1 years ago by Gordon Smyth 53k • written 11.1 years ago by svlachavas ▴ 840

score 2 · Answer 1 · 2015-01-19

Gordon Smyth 53k

@gordon-smyth

WEHI, Melbourne, Australia

limma is perfectly capable of correctly taking into account that you have fewer metastatic cancer samples than non-metastatic. So there is no statistical problem.

Of course, it would always be nice to have more samples, but you don't.
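As a rough illustration of the point above, here is a minimal sketch of a 2 versus 15 comparison in limma. It uses simulated data, and the object names (`status`, `expr`, `design`) are assumptions for illustration only, not taken from the poster's data set. The imbalance shows up in the unscaled standard error of the group effect rather than causing any failure.

```r
# Minimal sketch on simulated data; all object names here are hypothetical.
library(limma)

set.seed(1)
n_genes <- 1000

# 2 metastatic and 15 non-metastatic samples, as described in the question
status <- factor(c(rep("met", 2), rep("nonmet", 15)),
                 levels = c("nonmet", "met"))

# Simulated log2 expression matrix: genes in rows, samples in columns
expr <- matrix(rnorm(n_genes * length(status)), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Design matrix: intercept (non-metastatic mean) plus the metastatic effect
design <- model.matrix(~ status)

fit <- lmFit(expr, design)
fit <- eBayes(fit)

# Residual degrees of freedom: 17 samples minus 2 coefficients
fit$df.residual[1]

# Unscaled standard error of the group effect, sqrt(1/2 + 1/15);
# this is where the small metastatic group is accounted for
fit$stdev.unscaled[1, "statusmet"]

# Top-ranked genes for metastatic vs non-metastatic
topTable(fit, coef = "statusmet", number = 5)
```

With real data, `expr` would be replaced by the normalized ExpressionSet or expression matrix, and `status` by the corresponding phenoData column.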

11.1 years ago by Gordon Smyth 53k

Yes, from the literature I have read that limma is more appropriate than the classic t-test when there are generally few replicates. On the other hand, I have also noticed in some papers that, in multifactorial designs, having only a few replicates in one condition makes limma inappropriate, due to the larger number of linear dependencies relative to the number of residuals in linear modeling. Should I then use some method or package (such as factDesign) to detect possible outliers?

11.1 years ago by svlachavas ▴ 840

I am not aware of any truth to your claims. limma fits factorial models in accordance with best practice, which coincides in mathematical terms with what factDesign or the lm() function in base R (the stats package) do. limma is specifically designed for experiments with multiple factors, and the benefits of the limma approach generally become greater rather than less in complex models.
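The equivalence with lm() can be checked directly. The following is a small sketch on simulated data, assuming a hypothetical second factor called `batch` alongside the metastatic status. It fits an additive two-factor model with lmFit() and compares the coefficients and residual standard deviation for one gene with those from lm().

```r
# Minimal sketch on simulated data; `status`, `batch` and `expr` are hypothetical.
library(limma)

set.seed(2)

# Unbalanced metastatic factor plus a second, hypothetical two-level factor
status <- factor(c(rep("met", 2), rep("nonmet", 15)),
                 levels = c("nonmet", "met"))
batch <- factor(rep(c("A", "B"), length.out = 17))

# Simulated log2 expression matrix: 50 genes by 17 samples
expr <- matrix(rnorm(50 * 17), nrow = 50)

# Additive two-factor design
design <- model.matrix(~ status + batch)
fit <- lmFit(expr, design)

# Ordinary least squares for the first gene using lm()
ols <- lm(expr[1, ] ~ status + batch)

# Coefficients and residual standard deviation should agree up to
# numerical tolerance
all.equal(unname(fit$coefficients[1, ]), unname(coef(ols)))
all.equal(unname(fit$sigma[1]), summary(ols)$sigma)
```

The gene-wise fits are the same least-squares fits; what limma adds afterwards, via eBayes(), is the moderation of the gene-wise variances across genes.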

Both here and in your original question you are making claims that the limma package has inappropriate statistical practices. There is no truth to your accusations as far as I know, and your claims seem to relate to quite simple matters that linear modeling programs such as limma have no trouble with. If you wish to ask a question about a claim made in a particular paper that you have read, then please cite that paper and be specific about the claim and your question and what you are trying to achieve. Otherwise, making vague claims is rather unhelpful.

11.1 years ago by Gordon Smyth 53k